Results 16 comments of Shuming Liu

Only a few hyperparameters are changed: the MLP ratio in the adapter is changed from 1/4 to 1/8, and the learning rate is searched between 5e-5 and 1e-4....

@Caspeerrr Thanks for your suggestion! Integrating TriDetPlus into OpenTAD seems straightforward. However, TriDetPlus only released the VideoMAEv2 feature, not the DINO2 feature. This is the only reason we haven't integrated...

Hi @akshitac8, the side-tuning model is released [here](https://github.com/sming256/OpenTAD/blob/main/opentad/models/backbones/vit_ladder.py), and we provide a training example [here](https://github.com/sming256/OpenTAD/blob/main/configs/adatad/thumos/e2e_thumos_videomaev2_g_768x2_224_side_2e-4.py). When implementing side-tuning with the latest OpenTAD, we find a performance...

Hi @Harry-KIT, thanks for your question. OpenTAD is designed for the temporal action detection task. I think AVA2 and UCF24 are datasets for the spatio-temporal action detection task, which requires additional spatial...

Thanks for your interest! We will release the code and pretrained checkpoints for the side-tuning adapter next month.

The side-tuning model is released [here](https://github.com/sming256/OpenTAD/blob/main/opentad/models/backbones/vit_ladder.py). We also provide a training example [here](https://github.com/sming256/OpenTAD/blob/main/configs/adatad/thumos/e2e_thumos_videomaev2_g_768x2_224_side_2e-4.py).

Thanks for your interest in this codebase. If you want to try out the methods on your own dataset, you basically need to extract features, train a model, and tune...

Hi @tongda, please check your ground truth JSON file. I saw `2024-07-15 09:32:57 Train INFO: Number of ground truth instances: 0` in the log, indicating that no ground truth actions were loaded.
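A quick way to check the annotation file before training is to count the instances yourself. The sketch below is only an illustration: it assumes an ActivityNet-style layout (a top-level `database` dict whose entries carry a `subset` name and an `annotations` list), and the helper names `load_annotation` and `count_ground_truth_instances` are hypothetical, not part of OpenTAD. Adjust the keys if your file is organized differently.

```python
import json


def load_annotation(path):
    """Load an annotation JSON file from disk."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def count_ground_truth_instances(annotation, subset=None):
    """Count annotated segments, optionally restricted to one subset.

    Assumes (hypothetically) an ActivityNet-style structure:
    {"database": {video_id: {"subset": ..., "annotations": [...]}}}
    """
    total = 0
    for info in annotation.get("database", {}).values():
        if subset is not None and info.get("subset") != subset:
            continue
        total += len(info.get("annotations", []))
    return total


# Tiny in-memory example instead of a real file.
example = {
    "database": {
        "video_a": {
            "subset": "training",
            "annotations": [
                {"segment": [1.0, 4.5], "label": "jump"},
                {"segment": [6.0, 9.0], "label": "run"},
            ],
        },
        "video_b": {"subset": "validation", "annotations": []},
    }
}

train_count = count_ground_truth_instances(example, subset="training")
val_count = count_ground_truth_instances(example, subset="validation")
print(train_count, val_count)
```

If the count for the subset you train on is 0, likely causes are a mismatched subset name or an empty `annotations` list, either of which would be consistent with the log line above.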

Typically, you can optimize the following 4 hyper-parameters for better performance in end-to-end training.

- the number of feature pyramid levels.
- the weight of regression loss.
- the number...

Perfect! Your understanding is completely correct. Since the above are hyper-parameters, we need to search over them to find the optimal setting for a new dataset.
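One simple way to run such a search is an exhaustive grid. The following is a minimal sketch, not OpenTAD's tooling: the parameter names, the candidate values, and the `dummy_evaluate` function are all placeholders chosen for illustration, and only the two hyper-parameters that are fully named in this thread are included. In practice `evaluate` would train a model with the given setting and return a validation metric such as mAP.

```python
from itertools import product

# Hypothetical search space; names and values are placeholders, not recommendations.
search_space = {
    "num_pyramid_levels": [4, 5, 6],
    "regression_loss_weight": [0.5, 1.0, 2.0],
}


def iter_settings(space):
    """Yield every combination of values in the search space as a dict."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def grid_search(space, evaluate):
    """Return the setting with the highest score and that score."""
    best_setting, best_score = None, float("-inf")
    for setting in iter_settings(space):
        score = evaluate(setting)
        if score > best_score:
            best_setting, best_score = setting, score
    return best_setting, best_score


def dummy_evaluate(setting):
    """Stand-in for 'train, then measure validation performance'; not a real result."""
    return -abs(setting["num_pyramid_levels"] - 5) - abs(
        setting["regression_loss_weight"] - 1.0
    )


best_setting, best_score = grid_search(search_space, dummy_evaluate)
print(best_setting, best_score)
```

A full grid grows multiplicatively with each added hyper-parameter, so with expensive end-to-end training it may be more practical to tune one parameter at a time or to sample settings at random.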