QidongHuang

8 comments by QidongHuang

Paper name/title: OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Paper link: https://arxiv.org/abs/2311.17911
Code link: https://github.com/shikiw/OPERA

Paper name/title: Diversity-Aware Meta Visual Prompting
Paper link: https://arxiv.org/abs/2303.08138
Code link: https://github.com/shikiw/DAM-VP
Thanks a lot!

Thanks for your appreciation, and sorry for the misunderstanding! These two arguments do not refer to the special tokens themselves. Actually, they denote the location indexes of...

Hi! Thanks a lot for your interest in our work! DoLa does not require writing any extra code, and it applies equally to the base LLM part of an MLLM. You can refer to DoLa's official repository https://github.com/voidism/DoLa and simply pass arguments such as `--early-exit-layers` when calling the generate function. Note that you should use the transformers fork provided in their repository, or a sufficiently recent version of transformers (the DoLa authors have already merged their method into recent transformers releases).
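As a rough sketch, an invocation following the DoLa repository's pattern might look like the command below. Only the `--early-exit-layers` flag comes from the comment above; the script name and model identifier are placeholders, so please consult the official DoLa README for the exact entry points and arguments.

```shell
# Hypothetical command following DoLa's repo conventions; the script name
# and model below are placeholders, not the repo's guaranteed interface.
python run_eval.py \
    --model-name huggyllama/llama-7b \
    --early-exit-layers 16,18,20,22,24,26,28,30,32
```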

Hi, thanks for your recognition of our work!
1. image_start and image_end are not the positions of special tokens. In the example you gave, image_start and image_end refer to the positions of the first and the last image tokens, respectively. See https://github.com/shikiw/OPERA/issues/2.
2. Could you point me to where the error occurs? Thanks!

Hi,
1. response_start is the position of the first token of the model's response.
2. Please check whether max_length is set in the arguments passed to generate.
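To make the index conventions concrete, here is a minimal sketch of how the three positions relate to a tokenized prompt laid out as [prefix tokens][image tokens][instruction tokens], with the response starting right after the prompt. The token counts and variable names below are illustrative assumptions, not OPERA's internal values.

```python
# Stand-ins for a tokenized multimodal prompt; the numbers are made up.
prompt_ids = list(range(40))   # 40 prompt tokens in total
num_prefix = 5                 # tokens before the image tokens
num_image_tokens = 32          # e.g. visual tokens produced by the encoder

# image_start / image_end are positions of the first and last image tokens.
image_start = num_prefix
image_end = num_prefix + num_image_tokens - 1

# response_start is the position of the first generated (response) token,
# i.e. immediately after the prompt.
response_start = len(prompt_ids)

key_position = {
    "image_start": image_start,
    "image_end": image_end,
    "response_start": response_start,
}
print(key_position)
```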

Hi, I have added steps to the README on how to use OPERA with other versions of transformers [here](https://github.com/shikiw/OPERA/tree/main?tab=readme-ov-file#note-to-implement-opera-on-other-version-of-transformers-you-can-follow-the-steps-as-the-follows). Please take a look.

Thanks for your patience, everyone. I will provide a version adapted to newer transformers releases after Qwen3-VL is open-sourced.