VideoChat2-text baseline
Hello,
Thank you for putting out this amazing set of models, datasets, and evals! Would it be possible to release the code and details for the VideoChat2-text baseline from your paper? I am studying some properties of video-understanding benchmarks, and this baseline may be important for my analysis!
Best, Benno
Good question! In my experiments, I simply fed a zero image for VideoChat2-text. For example, change this code in demo.ipynb:
# img_list.append(image_emb)                 # original image
img_list.append(torch.zeros_like(image_emb)) # zero image
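For anyone else trying to reproduce this, here is a minimal sketch of the idea, assuming `image_emb` is the video embedding produced by the visual encoder in demo.ipynb; the helper function and its name are illustrative, not part of the repo:

```python
import torch

def build_img_list(image_emb: torch.Tensor, blank_video: bool = False) -> list:
    """Return the visual-embedding list that is fed to the LLM.

    With blank_video=True, the visual tokens are replaced by zeros of the
    same shape, so any answer must come from the LLM's text-only capacity
    (the VideoChat2-text baseline); otherwise the real embedding is used.
    """
    if blank_video:
        return [torch.zeros_like(image_emb)]  # blank "video": all-zero tokens
    return [image_emb]                        # normal VideoChat2 input
```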
Thank you! I will try that out
@Andy1621 In the paper you say: "“VideoChat2text” denotes the model receiving blank videos and excludes LoRA tuning, relying solely on the LLM’s capacity for responses". Does that mean you ran the Stage 2 model and not Stage 3? Do you have any more details on whether the rest of the setup, such as the prompting, was the same?
I ran both the Stage 2 and Stage 3 models in our pipeline with zeroed-out video input, but unfortunately the results look quite different from the paper. We were able to reproduce the normal (non-blank) results to within a 1-2% margin, however.
We would be very grateful if you could share any more details!
Hi! Actually, we ran the Stage 3 model, which was trained without LoRA, for a fair comparison.
Interesting, thank you! Do you still have the weights somewhere for this?
Please try this model without LoRA. I just found it among the previous model weights and haven't tested it~
Hi, we will close this issue.
Feel free to contact us if you have other questions.