DiCoSA

[IJCAI 2023] Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment

7 DiCoSA issues

Hi, congrats on your amazing work! Could you please upload the MSVD checkpoint and the steps for inference?

Thank you for sharing such great work! You concatenated the latent factors of the text and video subspaces to calculate similarity through an MLP, which means that during the testing phase,...

I'm getting strange results when running the code on an RTX 3090 GPU. I first used the code in CLIP4Clip to compress the videos to 3 fps: https://github.com/ArrowLuo/CLIP4Clip/blob/master/preprocess/compress_video.py and...

Hi, I am facing an issue when trying to train on the MSVD dataset. I got the errors shown in the message below. Command: torchrun main_retrieval.py --do_train 1 --workers 8 --n_display...

Hello, thank you for the nice work. I have a question about the representation projection. In your paper, the text and video representations are independently projected into K components with...

Hello, I found that you used QB-Norm post-processing in the inference stage, but there is no mention of QB-Norm in the paper. Can you show the results without QB-Norm? Thank you for...

I'm unable to reproduce the scores reported in the paper. Below are my MSRVTT training/testing results. Could you please advise? ![Image](https://github.com/user-attachments/assets/ba042e03-53dc-4d26-a8ad-47fca45a0c77) ![Image](https://github.com/user-attachments/assets/6177ab58-fc49-43ca-8499-4d64aa0694ea) My settings are as follows: ``` CUDA_VISIBLE_DEVICES=0,1 \...