InternVideo Question regarding InternVideo2+VideoChat2 results on MVBench

Hello authors,

Table 16 in the InternVideo2 paper reports a score of 60.9 on MVBench, with VideoChat2 + InternVideo2s3-1B + Mistral-7B.

However, the highest score on the MVBench leaderboard is 60.4 (VideoChat2_mistral) as of today.

I am just curious:

1. Why has the 60.9 result not been reported to the MVBench leaderboard yet?
2. Are these two results referring to the same combination of models? If not, what's the difference?

Thanks!

Jun 06 '24 05:06 jpan72

For question 2, is it because of the following?

- UMT-L + VideoChat2 + Mistral-7B = 60.4 --> Result reported as MVBench top-1
- InternVideo2 + VideoChat2 + Mistral-7B = 60.9 --> Result reported in the InternVideo2 paper

Another question:

3. Could you please share the model weights of InternVideo2 used during VideoChat2 training? Something like this table https://github.com/OpenGVLab/Ask-Anything/blob/main/video_chat2/README.md#model but for the InternVideo2 version instead of the UMT version.

Appreciate it!

Jun 06 '24 06:06 jpan72

Have you got the model weights of InternVideo2 during VideoChat2 training?

Aug 15 '24 06:08 bbkaeul

Hi, the weights of InternVideo2 stage3 can be found here and here

Aug 15 '24 08:08 yinanhe