Roadmap and Feedback

Open sming256 opened this issue 1 year ago • 17 comments

We keep this issue open to collect feature requests and feedback from users, and thus keep improving this codebase.

If you don't find the features you need in the roadmap, please leave a message here.

Thank you!

sming256 avatar Mar 28 '24 12:03 sming256

cool

rixejzvdl649 avatar Apr 01 '24 06:04 rixejzvdl649

Hi @sming256, thank you for sharing the toolkit, it's really amazing. I wanted to ask: the AdaTAD paper also has a parallel-adapter variant, AdaTAD' (75.4 mAP in the paper), but when implementing it with the shared codebase I only reach around 73.4 mAP. Could you please share whether any parameters differ between AdaTAD and AdaTAD'?

akshitac8 avatar May 03 '24 18:05 akshitac8

Only a few hyperparameters are changed: the MLP ratio in the adapter is changed from 1/4 to 1/8, and the learning rate is searched between 5e-5 and 1e-4. I will update the parallel backbone and checkpoint later.
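To make the two differences above concrete, here is a minimal sketch of a config override. The key names (`mlp_ratio`, `lr_search_range`) are illustrative assumptions, not OpenTAD's actual config schema; only the values (1/8 bottleneck ratio, learning rate searched in [5e-5, 1e-4]) come from the comment above.

```python
# Hypothetical override capturing the AdaTAD' differences described above.
# Field names are assumptions for illustration, not OpenTAD's real config keys.
adatad_parallel_overrides = {
    "adapter": {
        # MLP bottleneck ratio reduced from 1/4 to 1/8 for the parallel adapter
        "mlp_ratio": 1 / 8,
    },
    "optimizer": {
        # Learning rate is searched within this range rather than fixed
        "lr_search_range": (5e-5, 1e-4),
    },
}

print(adatad_parallel_overrides["adapter"]["mlp_ratio"])  # 0.125
```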

sming256 avatar May 03 '24 19:05 sming256

That would be really helpful @sming256 for reproducing the results 😃.

akshitac8 avatar May 05 '24 01:05 akshitac8

Hi @sming256 wanted to check if you could please upload the parallel backbone code as well that would be great.

akshitac8 avatar May 06 '24 17:05 akshitac8

Are you planning to also release tridetplus in this toolkit? (https://github.com/dingfengshi/tridetplus) thanks!

Caspeerrr avatar May 23 '24 09:05 Caspeerrr

@Caspeerrr Thanks for your suggestion! Integrating TriDetPlus into OpenTAD seems straightforward. However, TriDetPlus only released the VideoMAEv2 features, not the DINOv2 features. This is the only reason we haven't integrated it yet.

sming256 avatar May 27 '24 18:05 sming256

Hi @akshitac8, the side-tuning model is released here, and we provide a training example here. When implementing side-tuning with the latest OpenTAD, we found a performance drop of around 1% on THUMOS. With our released checkpoint, we achieve 74.65% mAP using VideoMAEv2-g.

sming256 avatar Jun 06 '24 08:06 sming256

Hi @sming256, thank you for the excellent work. I have a question about open-source datasets not yet included in this repo.

You mentioned the following: "Support multiple TAD datasets. We support 9 TAD datasets, including ActivityNet-1.3, THUMOS-14, HACS, Ego4D-MQ, EPIC-Kitchens-100, FineAction, Multi-THUMOS, Charades, and EPIC-Sounds Detection datasets."

What about other open-source datasets (e.g., AVA2, UCF24)?

aidevveloper avatar Aug 19 '24 05:08 aidevveloper

Hi @Harry-KIT, thanks for your question. OpenTAD is designed for the temporal action detection task. I think AVA2 and UCF24 are datasets for the spatio-temporal action detection task, which additionally requires spatial bounding boxes.
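The distinction above can be illustrated with two toy annotation records. All field names, file names, and values below are hypothetical; they only show that TAD labels are purely temporal (start/end times), while spatio-temporal detection also needs per-frame boxes.

```python
# Hypothetical TAD annotation: actions are localized in time only.
tad_annotation = {
    "video": "video_001.mp4",
    "segments": [
        {"start": 12.5, "end": 18.0, "label": "long_jump"},
    ],
}

# Hypothetical spatio-temporal annotation (in the spirit of AVA/UCF24):
# each action also carries per-frame bounding boxes [x1, y1, x2, y2].
stad_annotation = {
    "video": "video_001.mp4",
    "tubes": [
        {
            "label": "long_jump",
            "boxes": {
                30: [0.10, 0.20, 0.55, 0.90],
                31: [0.12, 0.21, 0.57, 0.91],
            },
        }
    ],
}
```

This is why a TAD toolkit cannot consume AVA2/UCF24 directly: its data pipeline and heads have no notion of spatial boxes.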

sming256 avatar Aug 19 '24 06:08 sming256

I see. Thank you very much

aidevveloper avatar Aug 19 '24 09:08 aidevveloper