VLMEvalKit
Why resize images in MMMU for DeepSeek-VL2?
I noticed that when evaluating DeepSeek-VL2 on MMMU, the code resizes the first image, but I don't understand why. Could you please explain the reason?
(code from `generate_inner` in VLMEvalKit/vlmeval/vlm/deepseek_vl2.py)
The corresponding pull request was created by a member of the official DeepSeek-VL2 team, so we may need to ask the PR author, @gnobitab, for help.
Thank you very much.