Mantis
Official code for the paper "Mantis: Multi-Image Instruction Tuning" [TMLR 2024]
Hi, I want to ask: does Mantis use an image separator between the images sent to the LLM? From what I can tell, LLaVA doesn't have one, and the data used in Mantis doesn't provide a...
Hi, thank you for your work on this library. I'd like to know if there's any planned support for Idefics3. This model seems to be better than Idefics2 for visual...
I am trying to add Mantis to the supported model list in vLLM or SGLang.
Hello! I'm a big fan of your Mantis paper, I really like it! (and thanks for this repo!) I have a simple question, to clarify the reproducibility. In the README...
I am running into the below issue when I train VideoScore: Training model... Parameter Offload: Total persistent parameters: 706800 in 348 params 0%| | 0/576 [00:00
fix tokenizer in processor
Hi, nice project, thanks for sharing it! I have been trying to run the classifier fine-tuning code, but I keep getting this error: .... [rank2]: Original Traceback (most recent call...
Hello, thank you so much for your work! I am trying to fine-tune the Mantis model for multi-image question answering. For the time being I just want to check if...
Great work! It's a very impressive and capable multimodal model. I was looking through the model files and noticed an implementation for qwen2_vl_vae. However, I couldn't find any corresponding experimental...