Rui Yan
@hdmjdp How did you solve this issue?
Thanks for your reply. Yes, but some of the details are different. For example, each video in MSRVTT-retrieval has more than one caption. Do you use the first one or...
Thanks. But I cannot reproduce the zero-shot retrieval results on MSRVTT. I use the following settings: 1. JSfusion (9000 for train and 1000 for test) 2. For test, we...
@tsujuifu Hi, I have checked again. I used the specific caption indices for JSfusion provided by "jsfusion_val_caption_idx.pkl". So I would like to know how you select the captions during testing on MSRVTT.
Thanks for your kind help. I have reproduced the MSRVTT-retrieval results (R@1: 33.7) with the finetuned ckpt you provided, but I still cannot get promising results in the zero-shot setting.
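For anyone comparing numbers, here is a minimal sketch of how R@K is commonly computed for text-to-video retrieval. It is not taken from the repository: the function name `recall_at_k` and the toy similarity matrix are my own assumptions, and it assumes one caption per video with the correct video for query i sitting at index i.

```python
def recall_at_k(sim, k):
    """Recall@K in percent.

    sim[i][j] is the similarity between text query i and video j.
    Assumption: the correct video for query i is video i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Rank of the true match = number of videos scored strictly higher.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return 100.0 * hits / len(sim)


# Toy 3x3 similarity matrix (made-up values, for illustration only).
sim = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.5, 0.6, 0.7],
]
print(recall_at_k(sim, 1))
print(recall_at_k(sim, 2))
```

Which caption is picked per video changes the rows of this matrix, so the caption selection has to match before R@1 values are comparable.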
@joaanna Thanks for your help.
@xxxzhi I obtained the reported performance by setting num_frames=8, batch_size=72, and coord_feature_dim=512.
@xxxzhi Make sure that FPS is 12 and the boxes correspond to the frames.
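A quick way to check the second point is to compare the extracted frame names against the frames that have box annotations. This is only a sketch under assumed data layouts: `check_box_frame_alignment`, the list of frame file names, and the dict keyed by frame name are hypothetical and not the repository's actual format.

```python
def check_box_frame_alignment(frame_names, boxes_by_frame):
    """Return (frames without boxes, boxes without frames).

    frame_names: frame file names extracted from one video (hypothetical layout).
    boxes_by_frame: dict mapping a frame name to its list of boxes (hypothetical layout).
    """
    frames = set(frame_names)
    annotated = set(boxes_by_frame)
    missing_boxes = sorted(frames - annotated)
    missing_frames = sorted(annotated - frames)
    return missing_boxes, missing_frames


# Made-up example: frame 0002 has no boxes, and the boxes for 0004 have no frame.
frames = ["0001.jpg", "0002.jpg", "0003.jpg"]
boxes = {
    "0001.jpg": [[10, 20, 50, 80]],
    "0003.jpg": [[12, 22, 52, 82]],
    "0004.jpg": [[14, 24, 54, 84]],
}
print(check_box_frame_alignment(frames, boxes))
```

If either list is non-empty for many videos, the frames were probably extracted at a different FPS than the one the boxes were annotated at.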
> @ruiyan1995 have you tried the globel_i3d and gotten the expected result? I tested the VideoModelGlobalCoordLatent without overriding the default train function, and it works. I guess your problem is...