UMT
UMT is a unified and flexible framework that can handle different input modality combinations and output video moment retrieval and/or highlight detection results.
When I run the code, a checkpoint is saved every epoch, so a single run takes up a lot of disk space. How should I configure it so that the parameters...
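Checkpoint frequency is normally set in the training config rather than in code. A minimal sketch of what such a setting could look like, assuming an mmcv-style checkpoint hook; the key names (`checkpoint_config`, `interval`, `max_keep_ckpts`) follow the mmcv convention and may differ in this repo, so verify them against the config file you are actually training with:

```python
# Hypothetical mm-style config fragment (assumed convention, not confirmed
# for this repo): save a checkpoint every 50 epochs instead of every epoch,
# and keep only the most recent checkpoint on disk.
checkpoint_config = dict(interval=50, max_keep_ckpts=1)
```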
Hello, can this method retrieve a video in real time? The paper says that "On YouTube Highlights and TVSum, we obtain clip-level visual features using an I3D [4] pre-trained...
Hi, can you please provide a short example of how to use one of the models from the model zoo, where I just pass in a path to a video and...
Hello, 1. What are "inv" and "poly" in the automatic learning-rate adjustment policy? 2. Do ReduceLROnPlateau and LambdaLR exist?
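For context on the question above: "inv" and "poly" are classic learning-rate decay policies that originated in Caffe and appear in many mm-style training frameworks, while `ReduceLROnPlateau` and `LambdaLR` are PyTorch schedulers whose availability depends on the framework. A minimal sketch of the two decay formulas, assuming iteration-based scheduling (the default hyperparameter values below are illustrative, not taken from this repo):

```python
def poly_lr(base_lr, it, max_iter, power=0.9):
    # "poly": polynomial decay from base_lr down to 0 at max_iter
    return base_lr * (1 - it / max_iter) ** power

def inv_lr(base_lr, it, gamma=1e-4, power=0.75):
    # "inv": inverse decay; shrinks as iterations grow but never reaches 0
    return base_lr * (1 + gamma * it) ** (-power)

print(poly_lr(0.01, 0, 1000))     # 0.01 at the first iteration
print(poly_lr(0.01, 1000, 1000))  # 0.0 at the last iteration
```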
I have my own dataset, and I want to process it to make it suitable for your model. Is there any reference code to convert my .mp4 files and vector...
Hello, regarding the preprocessing done for audio feature extraction: 1. Can you provide the code that calls the get_features function provided in issue #22? 2. Do you call this method for...
Hi. Question about audio feature extraction. I have read issue #22. I am wondering: 1. Why is `sr` always 32000? What does it mean? 2. If I want to extract...
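On the `sr` part of the question above: in audio libraries such as librosa, `sr` is the target sampling rate, so `sr=32000` means the waveform is resampled to 32,000 samples per second before feature extraction (32 kHz matches the rate commonly used by PANN-style audio models, though whether that is the reason here is an assumption). A minimal sketch of the sample-count arithmetic, using naive linear-interpolation resampling in place of a proper resampler:

```python
import numpy as np

def resample(wav, orig_sr, target_sr=32000):
    # Naive linear-interpolation resampling. librosa.load(path, sr=32000)
    # resamples more carefully, but the output length is computed the same
    # way: duration (in seconds) times the target sampling rate.
    n_out = int(round(len(wav) * target_sr / orig_sr))
    t_out = np.linspace(0, len(wav) - 1, n_out)
    return np.interp(t_out, np.arange(len(wav)), wav)

# One second of a 440 Hz tone at 44.1 kHz becomes 32000 samples at 32 kHz.
wav = np.sin(2 * np.pi * 440 * np.arange(44100) / 44100)
resampled = resample(wav, 44100)
print(len(resampled))  # 32000
```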
I have a question regarding the seed generation: did all categories use the same random seed for generating the best mAP?
Hi, thanks for your work! I have two questions: 1. I tried to input a video longer than 150 s and set the query feature to zeros([77, 512]), then I found that the...
Hi, thanks for your work! I used the following command to test the qvhighlights dataset: `python tools/launch.py configs/qvhighlights/umt_base_200e_qvhighlights.py --checkpoint checkpoints/umt_base_200e_qvhighlights-9a13c673.pth --eval` And got the following results: ``` Evaluation results on...