Kingking comments

Results: 7 comments of Kingking

Some questions about subtitles

Did the linear layer and the LLM participate in training together in all three stages?

Some questions about subtitles

Are subtitles also involved in training in the third stage?

Looking forward to integrating more MLLMs, such as instructblip and minigpt4-v2

> For LLaVA-1.6, it uses both base features (336x336 resolution) and higher resolution features. To perform inference similar to 1.5, you only need to use the base features to avoid...

Do you plan to adapt it to Qwen2-7B?

> > _No description provided._
> >
> > Maybe you can contribute to this part. All you need to do is add a llava_qwen2.py, the corresponding conv_mode, and a preprocess_qwen2 function...

Qwen2 version

The template used by the instruct model should be this one, right?

Qwen2 version

Why do you recommend using the base model directly? Wouldn't the instruct model perform better?

A few questions about the paper

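https://github.com/slei109/PATNet/issues/17#issue-1703104975 Hello, sorry to bother you. Did you resolve the DeepGlobe processing problem described here? I ran into the same issue: using the processing file provided by the authors, I end up with 9175 images rather than the 5666 reported in the paper.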