
✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

10 Video-MME issues

How can randomness be mitigated during the testing of Video-MME? Are there any specific hyperparameter settings for generating responses?
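One common way to reduce run-to-run variance is greedy decoding. A minimal sketch with Hugging Face `transformers` (the checkpoint name and prompt below are placeholders, not the settings the authors used):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute the MLLM actually being evaluated.
model_name = "your-mllm-checkpoint"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

prompt = "Q: ...\nOptions: ...\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding: with do_sample=False the temperature/top_p settings are
# ignored, so the output is deterministic for a fixed model, prompt, and frame set.
output_ids = model.generate(**inputs, do_sample=False, max_new_tokens=16)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```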

May I ask what question template Video-MME used during testing? For example, is it something like "Q: {question}\nOptions:\nA: ...\nB: ...\nC: ...\nD: ...\nAnswer:"?
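For reference, a small helper that formats a multiple-choice item in the style quoted above. This only illustrates the layout the question describes; the official evaluation prompt may add extra instructions:

```python
def build_prompt(question: str, options: dict[str, str]) -> str:
    """Format a Video-MME-style multiple-choice prompt.

    `options` maps option letters ("A".."D") to option text.
    The layout mirrors the template quoted in the issue above.
    """
    option_lines = "\n".join(f"{letter}: {text}" for letter, text in options.items())
    return f"Q: {question}\nOptions:\n{option_lines}\nAnswer:"

print(build_prompt(
    "How many people appear in the clip?",
    {"A": "One", "B": "Two", "C": "Three", "D": "Four"},
))
```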

Can I get the metadata to check which data example is under which question sub_type (e.g., object counting)?
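If the annotations are available as a Hugging Face dataset, grouping by the question-type field is straightforward. A sketch assuming the dataset id `lmms-lab/Video-MME`, a `test` split, and a `task_type` column; all three names are assumptions here and should be checked against the actual release:

```python
from collections import Counter
from datasets import load_dataset

# Dataset id, split, and column name are assumptions; verify against the release.
ds = load_dataset("lmms-lab/Video-MME", split="test")

# Count how many questions fall under each sub-type (e.g., object counting).
counts = Counter(example["task_type"] for example in ds)
for task_type, n in counts.most_common():
    print(f"{task_type}: {n}")
```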

In 119-1, options A and B are the same, both being 'The shortest man in the world.', but the answer is B.

Hi, will you open-source or share the raw evaluations for proprietary models? Best, Orr

Hello, I see in the paper that default MLLM configs were largely used, but frame counts were increased where applicable. Certain models such as LongVA appear to support video contexts...

https://huggingface.co/allenai/Molmo-7B-D-0924

Can you share the code for drawing the radar chart in your paper?
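The paper's plotting code isn't in this thread, but a basic radar chart can be reproduced with matplotlib's polar axes. A minimal sketch with made-up category names and scores, just to illustrate the layout:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical categories and scores for illustration only.
categories = ["Counting", "OCR", "Action Recognition", "Temporal Reasoning", "Spatial Perception"]
scores = [62.0, 71.5, 68.3, 55.2, 64.8]

# Close the polygon by repeating the first point.
angles = np.linspace(0, 2 * np.pi, len(categories), endpoint=False).tolist()
angles += angles[:1]
values = scores + scores[:1]

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
ax.plot(angles, values, linewidth=2)
ax.fill(angles, values, alpha=0.25)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(categories)
ax.set_ylim(0, 100)
plt.show()
```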

You mentioned that GPT-5 and Gemini 2.5 used Video-MME in their release reports, showing SOTA performance. Can you add them to the leaderboard, please?

Dear Authors, thank you for your excellent work! I want to submit our best result to the Video-MME leaderboard and sent an email to [email protected] four days ago but have gotten no...