Jinpeng Wang
Hi Zysty: Thanks for your question, and sorry for the late reply. 1. Yes, a grid may have multiple object tags. Such cases account for only a small part of the data. In those cases, one object is selected at random.
Hi Jin: Thanks for your question. It's hard to tell what the problem is because I did not encounter it with the 4M images (one image may have multiple captions for...
Hi NQX1248: Thanks for your good question. 1. The preparation of the pretraining corpus follows OSCAR (https://github.com/microsoft/Oscar/blob/master/VinVL_DOWNLOAD.md). All settings are kept consistent. It's a common practice. 2. Yes, I have the same observation!...
I have not explored PTP on video-text tasks. It should work. Previously I saved object features and tags together in NumPy files, which took 10T of space. Since I...
1. Thanks a lot, nqx. Yes, I forgot to upload coco_zero_shot.txt. You are correct, zero-shot means testing directly without tuning. You can see that the log is for fine-tuning rather than...
Cool, I will upload the log you provided.