VideoChat2-text baseline

Open BennoKrojer opened this issue 1 year ago • 7 comments

Hello,

Thank you for putting out this amazing set of models, datasets and evals! Is it possible to release the code and details for the VideoChat2-text baseline from your paper? I am studying some properties of video understanding and benchmarks and this baseline might be important!

Best, Benno

BennoKrojer avatar Jul 19 '24 15:07 BennoKrojer

Good question! In my experiments, I simply fed a zero image to the model for VideoChat2-text. For example, in demo.ipynb, change the first line below to the second:

img_list.append(image_emb) # original image
img_list.append(torch.zeros_like(image_emb)) # zero image
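
A minimal sketch of that swap as a standalone helper, in case it is useful for reproducing the baseline. The function name `build_img_list`, the `text_only` flag, and the tensor shape are assumptions made for illustration; none of them come from demo.ipynb. Only `torch.zeros_like` is taken from the snippet above.

```python
import torch


def build_img_list(image_emb: torch.Tensor, text_only: bool = False) -> list:
    """Return the list of visual embeddings handed to the chat model.

    Hypothetical helper: with text_only=True the real embedding is replaced
    by zeros of the same shape, dtype and device, so the LLM receives no
    visual information.
    """
    if text_only:
        return [torch.zeros_like(image_emb)]  # zero image
    return [image_emb]  # original image


# Stand-in for the embedding computed in the notebook; the shape is made up.
image_emb = torch.randn(1, 96, 4096)
img_list = build_img_list(image_emb, text_only=True)
```

Because `torch.zeros_like` preserves shape, dtype and device, the rest of the pipeline should not need any changes; whether the prompt and other settings match the paper's setup is not confirmed here.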

Andy1621 avatar Jul 20 '24 01:07 Andy1621

Thank you! I will try that out

BennoKrojer avatar Jul 22 '24 19:07 BennoKrojer

@Andy1621 In the paper you say: "'VideoChat2text' denotes the model receiving blank videos and excludes LoRA tuning, relying solely on the LLM's capacity for responses". Does that mean you ran the Stage 2 model and not Stage 3? Do you have any more details on whether the rest, such as the prompting, was the same?

BennoKrojer avatar Jul 25 '24 14:07 BennoKrojer

I ran both the Stage 2 and Stage 3 models in our pipeline with zeroed-out video input, but unfortunately the results look quite different from those in the paper. We were, however, able to reproduce the normal (with-video) results to within a 1-2% margin.

We would be very grateful if you can share any more details!

BennoKrojer avatar Jul 25 '24 18:07 BennoKrojer

> @Andy1621 In the paper you say: "'VideoChat2text' denotes the model receiving blank videos and excludes LoRA tuning, relying solely on the LLM's capacity for responses". Does that mean you ran the Stage 2 model and not Stage 3? Do you have any more details on whether the rest, such as the prompting, was the same?

Hi! Actually, we ran the Stage 3 model, trained without LoRA for a fair comparison.

Andy1621 avatar Jul 29 '24 01:07 Andy1621

Interesting, thank you! Do you still have the weights somewhere for this?

BennoKrojer avatar Jul 29 '24 15:07 BennoKrojer

Please try this model without LoRA. I just found it among our earlier model weights and haven't tested it.

Andy1621 avatar Jul 30 '24 07:07 Andy1621

Hi, we will close this issue.

Feel free to contact us if you have other questions.

yinanhe avatar Oct 11 '24 07:10 yinanhe