FAVDBench
[CVPR 2023] Official implementation of the paper: Fine-grained Audible Video Description
Thanks for the great work. How can I extract only the audio-relevant captions? The paper mentions that the last 1-2 lines are audio-relevant, but I could not find any markers to extract...
Hi, which GPUs were used to train the model?
Can you provide a demo script to run on videos?
Can you share the trained model weights or provide non-distributed training settings (single machine, single GPU)? How long does one training run take for you? Thank you very...
Hi, thanks for your wonderful work on this paper, you guys did a good job! Can you share the pretrained weights? Thanks in advance for your help!