Jingjing Pan
Hello authors, What validation dataset did you use during the training epochs of stages 1, 2, and 3, respectively? I believe validation accuracy is important for monitoring model convergence...
Hello authors, Table 16 in the InternVideo2 paper reports a score of 60.9 on MVBench with VideoChat2 + InternVideo2s3-1B + Mistral-7B. However, the highest score on the MVBench leaderboard is 60.4...
Hello, As a reference, how long did it take you to train Stages 1 and 2, respectively (with the training data you described in the paper and set in...
Hello, Thank you for the great work! For stage 4 (instruction tuning with HD data), the current code seems to resize/crop images to 224x224: https://github.com/OpenGVLab/Ask-Anything/blob/main/video_chat2/scripts/videochat_mistral/config_7b_hd_stage4.py#L21 https://github.com/OpenGVLab/Ask-Anything/blob/main/video_chat2/dataset/__init__.py#L73 which means it's actually...