sweet132 comments

Results: 8 comments of sweet132

Large performance drop if trained with fp32.

meanP: [meanP.txt](https://github.com/ArrowLuo/CLIP4Clip/files/11705891/meanP.txt), seqTransf: [Transf.txt](https://github.com/ArrowLuo/CLIP4Clip/files/11705892/Transf.txt). Sorry to bother you. I ran the code directly, but the loss is NaN because of some faulty videos (the solution is to set the video to...

Results on MSRVTT and MSVD

> @STARK1234, when I reproduced the meanP results on MSRVTT training-9k, I found that the R@1 is the same as the result in the paper, while the R@5 and R@10 is...
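For readers comparing R@1, R@5, and R@10 numbers, here is a minimal sketch of how Recall@K is commonly computed for text-to-video retrieval. It assumes a square similarity matrix in which the correct video for query `i` sits at index `i`; the function name `recall_at_k` and the toy matrix are hypothetical and are not taken from the CLIP4Clip code.

```python
from typing import List


def recall_at_k(sim: List[List[float]], k: int) -> float:
    """Return Recall@K in percent.

    Assumption: sim[i][j] is the similarity between text query i and
    video j, and the ground-truth video for query i is video i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Rank of the correct video = number of videos scored strictly higher.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return 100.0 * hits / len(sim)


# Toy 3x3 example (made-up scores, for illustration only).
sim = [
    [0.9, 0.1, 0.3],  # correct video ranked first
    [0.8, 0.2, 0.5],  # correct video ranked third
    [0.1, 0.2, 0.7],  # correct video ranked first
]

for k in (1, 2, 3):
    print(f"R@{k}: {recall_at_k(sim, k):.1f}")
```

Because a hit at rank 1 also counts toward every larger K, R@5 and R@10 can never be lower than R@1 on the same similarity matrix.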

Question about implementation details.

I just downloaded the code and data. It looks like 8 GPUs with 256 batch_size is essential for reproducing the project. @shams2023

Question about implementation details.

The code is based on CLIP4Clip. The torch version is 1.11.0 and the CUDA version is 11.6. @Tiiivoo

Question about implementation details.

The modeling section is in modeling.py, where you can find what you need. @shams2023

Question about implementation details.

If you have 8 GPUs with batch_size=256, the memory usage per GPU will be around 20GB. You can use this setting as a reference. I am not sure why it takes up so much...
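As a rough illustration of the setting above, the sketch below works out the per-GPU batch size. It assumes that `batch_size=256` is the global batch size and that it is split evenly across the GPUs; the helper `per_gpu_batch_size` is hypothetical and not part of the repository.

```python
def per_gpu_batch_size(global_batch_size: int, num_gpus: int) -> int:
    """Split a global batch evenly across GPUs (assumed even data-parallel split)."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if global_batch_size % num_gpus != 0:
        raise ValueError("global batch size must be divisible by num_gpus")
    return global_batch_size // num_gpus


# Setting mentioned in the comment: batch_size=256 on 8 GPUs.
print(per_gpu_batch_size(256, 8))  # 32 samples per GPU
```

Under that assumption, running on fewer GPUs with the same global batch size puts more samples on each GPU and would be expected to raise per-GPU memory use accordingly.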

Question about implementation details.

Thank you for your reply. Although I achieved results similar to the paper on MSRVTT, I got poor results on MSVD (46.1), where I trained directly on the raw data, while...

Question about implementation details.

Hello, I suggest you refer to the paper. The titles are generated by a model (GPT-2 or CLIP). @Tiiivoo