InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Issues

Results 170 InternVideo issues


Demo for evaluation and usage of the model.

5 comments

Hi, when can we expect the demo for model usage to be released? Best regards, Archana

Model used for feature extraction in Temporal Action Localization

Hello! I have been able to successfully use the features posted in [Temporal Action Localization](https://github.com/OpenGVLab/InternVideo/tree/main/Downstream/Temporal-Action-Localization) to reproduce great results on their respective datasets. Are you planning to release the weights...

christian-matroid

Evaluation on AVA-Kinetics

1 comment

Hi, I was wondering if the code for evaluating on AVA-Kinetics is available somewhere in the current codebase. In the file 'https://github.com/OpenGVLab/InternVideo/blob/main/Downstream/Spatial-Temporal-Action-Localization/run_class_finetuning.py' L239-242, it seems that only evaluation on AVA...

The issue of Temporal-Action-Localization

11 comments

Hello, thanks for your good work! I encountered some problems when reproducing the performance of the Temporal-Action-Localization task: Thumos14: 69.11 average mAP (lower than 71.58). The input_dim of feature is not...

extract frames of FineAction

May I ask which code you use to extract frames of FineAction? I use https://github.com/open-mmlab/mmaction2/blob/main/tools/data/build_rawframes.py to extract frames, but some videos cannot be extracted.

When would you release the trained checkpoint of action recognition?

Hope for the release of fine-tuned checkpoint of AVA

Thanks for the great work! The model zoo has already released many pretrained model weights and task ones, but there is still a lack of checkpoint for the AVA dataset....

Environment setup to use InternVideo

Hello, I'm trying to use your project for a project I'm working on, but I'm having trouble setting up the environment with conda. If you have a YAML file that I...

Will you release the audio modality pretrained model of InternVideo2?

test_num_segment, test_num_crop

ssv2 full_tuning: changing --test_num_segment 2 --test_num_crop 3 to --test_num_segment 4 --test_num_crop 3 raises an error at InternVideo/InternVideo2/single_modality/models/internvideo2.py, line 527, in forward: x = x + pos_embed RuntimeError: The size of...



Stars: 1.3k

Forks: 85
