InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

170 InternVideo issues, sorted by recently updated

Thank you so much for the awesome repo. Can you please share the 6k action words? It would be useful to perform zero shot classification of videos into those 6K...

Hello InternVideo team, You guys have done a great job with this project! In your paper, you use the Stage 2 model for the task of temporal grounding on QVHighlight...

Problem: using the demo to test action classification on the Kinetics-700 validation set gives very poor results. Experiment: 1. Pretrained model: https://huggingface.co/OpenGVLab/InternVideo2-Stage2_1B-224p-f4/tree/main 2. Text candidates: the class names of the K700 dataset...
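
For context, zero-shot action classification with a video-text model usually reduces to ranking prompt-wrapped class names against the video embedding. A minimal sketch, assuming embeddings have already been produced by the Stage 2 demo (the tensor shapes and helper names below are illustrative assumptions, not the repo's actual interface):

```python
# Minimal sketch: zero-shot classification by cosine similarity between a
# video embedding and K700 class-name embeddings. All names are illustrative;
# the demo's real encoding functions may differ.
import torch
import torch.nn.functional as F

def zero_shot_topk(video_emb: torch.Tensor,   # (D,) from the video encoder
                   class_embs: torch.Tensor,  # (C, D) from the text encoder
                   class_names: list[str],
                   k: int = 5,
                   temperature: float = 0.01):
    v = F.normalize(video_emb, dim=-1)
    t = F.normalize(class_embs, dim=-1)
    probs = (t @ v / temperature).softmax(dim=-1)   # (C,) class probabilities
    top = probs.topk(k)
    return [(class_names[i], p.item()) for p, i in zip(top.values, top.indices)]
```

Low accuracy with bare class names is often improved by wrapping each label in a prompt template such as "a video of a person {label}", though whether that explains the poor result reported here is only a guess.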

I ran this script, and timm.models.create_model raised RuntimeError: Unknown model (internvideo2_1B_patch14_224). How can I resolve this?
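
For reference, timm.create_model can only resolve architectures that have been registered with timm's @register_model decorator, so the module defining internvideo2_1B_patch14_224 has to be imported before the call. A minimal sketch, assuming the model is defined in an importable module of the repo (the module path below is an assumption):

```python
# Sketch: importing the defining module registers the architecture with
# timm's model registry; only then can create_model resolve the name.
import timm

# Assumed module path; replace with wherever the repo defines the model.
from models import internvideo2_stage2  # noqa: F401  (import triggers registration)

model = timm.create_model("internvideo2_1B_patch14_224", pretrained=False)
print(type(model))
```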

Thanks for the paper and for open-sourcing the code base. I would like to know how evaluation is performed on the MSR-VTT dataset for zero-shot text-to-video...
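
For orientation, zero-shot text-to-video retrieval on MSR-VTT is typically scored by ranking every video for each caption and reporting Recall@K and median rank. A minimal, library-agnostic sketch over a precomputed similarity matrix (it assumes query i's ground-truth video sits at column i, as in the usual 1k test-split setup; this is not taken from the repo's evaluation code):

```python
# Sketch: Recall@K and median rank from a text-by-video similarity matrix.
import numpy as np

def retrieval_metrics(sim: np.ndarray) -> dict:
    """sim[i, j] = similarity of text query i to video j; GT video of query i is j = i."""
    order = np.argsort(-sim, axis=1)  # best-matching video first, per query
    ranks = np.array([np.where(order[i] == i)[0][0] for i in range(sim.shape[0])])
    return {
        "R@1":  round(float((ranks < 1).mean() * 100), 1),
        "R@5":  round(float((ranks < 5).mean() * 100), 1),
        "R@10": round(float((ranks < 10).mean() * 100), 1),
        "MedR": float(np.median(ranks) + 1),
    }
```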

It seems scripts/pretraining/1B_pt.sh needs 1.1M.tsv, whose format should be: # line format: source, path, total_time, start_time, end_time, target. But [UniFormerV2] provides # line format: path, id, so where...
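
As a rough illustration of the gap between the two annotation layouts, a conversion step might look like the sketch below. The meaning of the six columns (and what belongs in source and target) is an assumption rather than documented behavior, and total_time would need to come from probing each video:

```python
# Sketch: turn UniFormerV2-style "path<TAB>id" lines into the six-column
# "source, path, total_time, start_time, end_time, target" layout that
# 1B_pt.sh appears to expect. Column semantics are assumed, not confirmed.
import csv

def convert_tsv(src_tsv: str, dst_tsv: str, source_name: str = "webvid") -> None:
    with open(src_tsv, newline="") as fin, open(dst_tsv, "w", newline="") as fout:
        reader = csv.reader(fin, delimiter="\t")
        writer = csv.writer(fout, delimiter="\t")
        for path, vid_id in reader:
            total_time = 0.0                  # placeholder: probe real duration, e.g. via ffprobe
            start_time, end_time = 0.0, total_time
            target = vid_id                   # placeholder: likely a caption, not an id
            writer.writerow([source_name, path, total_time, start_time, end_time, target])
```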

Dear Authors, how can I use the InternVideo2 model for Video Question Answering or Summarization tasks given a video? Please provide a demo script, if any, for testing on new...

I request the authors to release fine-tuning code for the InternVideo2 model with multimodality: [multimodal finetuning](https://github.com/OpenGVLab/InternVideo/tree/main/InternVideo2/multi_modality#finetuning)

Thanks for the great work! Is it possible to adapt the model for video prediction? And if so, what decoder model should I use? Thanks for any suggestions!

Hi, the link to the InternVideo2 checkpoint returns a 404.