Gitsamshi

Results 13 comments of Gitsamshi

You can either use a single GPU, or use drop-last in the dataloader when using multiple GPUs. > Hi @lizhaohuigit1, did you solve this issue? I had the same error.
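To make the drop-last suggestion concrete, here is a minimal pure-Python sketch of what the option does. The helper `make_batches` is hypothetical and only mimics the batching behaviour. Assuming the dataloader in question is PyTorch's `torch.utils.data.DataLoader`, the corresponding setting is its `drop_last=True` argument.

```python
def make_batches(items, batch_size, drop_last=False):
    """Split items into consecutive batches (hypothetical helper)."""
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    # With drop_last, discard a trailing batch that is smaller than batch_size.
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


samples = list(range(10))
print([len(b) for b in make_batches(samples, 4)])                  # [4, 4, 2]
print([len(b) for b in make_batches(samples, 4, drop_last=True)])  # [4, 4]
```

With several GPUs, a short final batch may be split unevenly across devices, which is one plausible cause of this kind of error. Dropping it avoids that at the cost of skipping a few samples per epoch.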

Same here, I got a result of 77.33.

Thanks for asking. Given an image with n regions, there are n*(n-1) region pairs. There may be k pairs related to predicate r, which is extracted from the caption. The k makes...
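As a quick check of the n*(n-1) count, here is a small sketch that enumerates the pairs. It assumes pairs are ordered (subject, object) pairs of distinct regions, which is what the count implies. The function name `region_pairs` is made up for illustration.

```python
from itertools import permutations


def region_pairs(n):
    """All ordered (subject, object) pairs of distinct region indices."""
    return list(permutations(range(n), 2))


pairs = region_pairs(4)
print(len(pairs))  # 12, i.e. 4 * 3
```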

Hi there, please refer to scripts/prepro_predicates.py for getting positive bags. It should answer your second question. For the first question, let's first make clear what "pair" and "bag" mean....

Yeah, there is a matching process in the code.

Exactly, remember to use "data/coco_class_names.txt" to map object labels and caption words.

Given a predicate, the negative bag is the complement of the positive bag over all pairs.
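A minimal sketch of that complement, assuming pairs are ordered pairs of distinct region indices. The function name and the example positive bag are hypothetical and not taken from the repository.

```python
from itertools import permutations


def negative_bag(n_regions, positive_bag):
    """Pairs of an image that are not in the predicate's positive bag."""
    all_pairs = set(permutations(range(n_regions), 2))
    return all_pairs - set(positive_bag)


positive = {(0, 1), (2, 0)}  # made-up positive pairs for one predicate
negative = negative_bag(3, positive)
print(sorted(negative))  # [(0, 2), (1, 0), (1, 2), (2, 1)]
```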

Exactly. You can use the Karpathy split for train/dev/test. As we didn't have predicate annotations for each pair, I just used predicate recall over the whole image...

Yes, I mean cut the Karpathy train split into train/dev/test, and use the caption predicates as reference.
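One way such a cut could look is sketched below. The 10% dev and test fractions, the seed, and the function name are assumptions for illustration; the comment does not specify the proportions that were used.

```python
import random


def split_ids(image_ids, dev_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle image ids reproducibly and cut them into train/dev/test."""
    ids = sorted(image_ids)
    random.Random(seed).shuffle(ids)
    n_dev = int(len(ids) * dev_frac)
    n_test = int(len(ids) * test_frac)
    dev = ids[:n_dev]
    test = ids[n_dev:n_dev + n_test]
    train = ids[n_dev + n_test:]
    return train, dev, test


# Stand-in ids; in practice these would be the ids of the Karpathy train split.
train, dev, test = split_ids(range(100))
print(len(train), len(dev), len(test))  # 80 10 10
```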

I would suggest adding 10 different top-layer classifiers, one for each predicate, and sharing the other parameters across all predicates. That makes one model.
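The structure suggested here can be sketched in plain Python as a shared layer followed by one classifier head per predicate. The layer sizes, the random initialisation, and all names are hypothetical. In a deep-learning framework this would typically be a shared backbone plus a list of linear heads.

```python
import random

NUM_PREDICATES = 10          # number of predicates, as in the comment above
IN_DIM, HIDDEN_DIM = 8, 4    # hypothetical feature sizes

rng = random.Random(0)


def rand_matrix(rows, cols):
    """Random weight matrix stored as a list of rows."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


# Shared parameters: one hidden layer used by every predicate.
shared_w = rand_matrix(HIDDEN_DIM, IN_DIM)
# Predicate-specific parameters: one top-layer weight vector per predicate.
heads = rand_matrix(NUM_PREDICATES, HIDDEN_DIM)


def forward(x):
    """Return one score per predicate for a single pair feature vector x."""
    # Shared features are computed once (ReLU of a linear map).
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in shared_w]
    # Every head scores the same shared features.
    return [sum(w * h for w, h in zip(head, hidden)) for head in heads]


scores = forward([1.0] * IN_DIM)
print(len(scores))  # 10
```

Because only the heads differ per predicate, the model stays a single network whose shared layer is trained on the data of all predicates.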