qq283215389
OK, thanks a lot. What about the other VSE model (VSEAttModel) and the "pair loss", whose results aren't shown in your CVPR 2018 paper "Discriminability objective for training descriptive captions"?
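For context, here is a minimal sketch of the kind of max-margin ranking loss over matched image-caption pairs that "pair loss" typically denotes in VSE-style models; the function name, margin value, and exact formulation are illustrative assumptions, not a claim about the repo's implementation:

```python
import torch

def pair_ranking_loss(img_emb, txt_emb, margin=0.2):
    """Illustrative max-margin ranking loss over image-caption pairs.

    img_emb, txt_emb: (batch, dim) L2-normalized embeddings, where row i
    of each tensor corresponds to a matching image-caption pair.
    """
    scores = img_emb @ txt_emb.t()            # (batch, batch) cosine similarities
    diag = scores.diag().view(-1, 1)          # similarity of the true pairs
    # Hinge: mismatched pairs should score at least `margin` below the true pair.
    cost_txt = (margin + scores - diag).clamp(min=0)      # image -> wrong captions
    cost_img = (margin + scores - diag.t()).clamp(min=0)  # caption -> wrong images
    mask = torch.eye(scores.size(0), dtype=torch.bool)
    cost_txt = cost_txt.masked_fill(mask, 0)  # zero out the true pairs
    cost_img = cost_img.masked_fill(mask, 0)
    return cost_txt.sum() + cost_img.sum()
```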
Thanks! If the retrieval model performs better (e.g., the one from the paper "Stacked Cross Attention for Image-Text Matching"), can we get a better result for the captioning model?
Hello, luo. This is my result from pre-training the retrieval model after running "run_fc_con.sh"; there is still a gap from the retrieval results reported in your paper. Result: Average...
I might have found the problem: I used a size of 7x7 for the COCO fc features. Did you use 14x14 for the COCO fc features?
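For reference, a minimal sketch of how a 14x14 spatial feature map is typically extracted, assuming a ResNet-101 backbone and a 448x448 input (448/32 = 14); this is illustrative, not a claim about the repo's exact prepro script:

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Illustrative feature extraction; the ResNet-101 backbone is an assumption.
resnet = models.resnet101(pretrained=True).eval()
# Keep everything up to (but excluding) the final avgpool and fc layers.
backbone = torch.nn.Sequential(*list(resnet.children())[:-2])

with torch.no_grad():
    img = torch.randn(1, 3, 448, 448)            # a 448x448 input yields a 14x14 map
    feat = backbone(img)                         # (1, 2048, 14, 14)
    att_feat = F.adaptive_avg_pool2d(feat, 14)   # force 14x14 regardless of input size
    fc_feat = feat.mean(dim=(2, 3))              # (1, 2048) global average pooled feature
```

A 224x224 input through the same backbone gives only a 7x7 map, which would explain the discrepancy if the images were resized differently.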
I found that other papers use Karpathy's split for COCO, while your paper uses rama's split. Are the test sets the same? If not, how can you compare your results with the results reported in the self-critical paper?
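One way to check whether the two splits share the same test images is to compare their image ids directly. A minimal sketch, assuming Karpathy's `dataset_coco.json` format and a hypothetical `rama_split.json` file with the same structure:

```python
import json

def test_ids(path):
    """Collect the COCO image ids marked as 'test' in a Karpathy-style split file."""
    with open(path) as f:
        data = json.load(f)
    return {img['cocoid'] for img in data['images'] if img['split'] == 'test'}

karpathy = test_ids('dataset_coco.json')   # Karpathy's split file
rama = test_ids('rama_split.json')         # hypothetical filename for rama's split
print(len(karpathy & rama), 'test images in common out of',
      len(karpathy), 'and', len(rama))
```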