Question regarding InternVideo2+VideoChat2 results on MVBench

Open jpan72 opened this issue 1 year ago • 3 comments

Hello authors,

Table 16 in the InternVideo2 paper reports a score of 60.9 on MVBench, with VideoChat2 + InternVideo2s3-1B + Mistral-7B.

However, the highest score on the MVBench leaderboard is 60.4 (VideoChat2_mistral) as of today.

I am just curious:

  1. Why is the 60.9 result not reported to the MVBench leaderboard yet?
  2. Are these two results referring to the same combination of models? If not, what's the difference?

Thanks!

jpan72 avatar Jun 06 '24 05:06 jpan72

For question 2, is it because of the following?

UMT-L + VideoChat2 + Mistral-7B = 60.4 --> result reported as MVBench top-1

InternVideo2 + VideoChat2 + Mistral-7B = 60.9 --> result reported in the InternVideo2 paper

Another question:

  3. Could you please share the model weights of InternVideo2 from VideoChat2 training? Something like the table at https://github.com/OpenGVLab/Ask-Anything/blob/main/video_chat2/README.md#model, but for the InternVideo2 version instead of the UMT version.

Appreciate it!

jpan72 avatar Jun 06 '24 06:06 jpan72

Have you got the model weights of InternVideo2 during VideoChat2 training?

bbkaeul avatar Aug 15 '24 06:08 bbkaeul

Hi, the weights of InternVideo2 stage3 can be found here and here

yinanhe avatar Aug 15 '24 08:08 yinanhe