Gitsamshi comments

Results: 13 comments by Gitsamshi

RuntimeError: Input tensor at index 3 has invalid shape [14, 14], but expected [14, 17]

You can either use a single GPU, or set drop-last in the dataloader when using multiple GPUs.

> Hi @lizhaohuigit1, did you solve this issue? I had the same error.
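To make the drop-last suggestion concrete, here is a minimal sketch of what the option changes. It assumes the trigger is the final, incomplete batch of an epoch, which the comment implies but does not state. The helper `batch_sizes` is hypothetical and only mimics the batching arithmetic; the numbers are illustrative.

```python
def batch_sizes(num_samples, batch_size, drop_last):
    """Sizes of the batches produced in one epoch (illustrative helper)."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    # Without drop_last, the leftover samples form a smaller final batch.
    if remainder and not drop_last:
        sizes.append(remainder)
    return sizes


print(batch_sizes(100, 16, drop_last=False))  # [16, 16, 16, 16, 16, 16, 4]
print(batch_sizes(100, 16, drop_last=True))   # [16, 16, 16, 16, 16, 16]

# In PyTorch the corresponding switch is the DataLoader's drop_last argument:
# loader = torch.utils.data.DataLoader(dataset, batch_size=16, drop_last=True)
```

With `drop_last=True` every batch has the same size, at the cost of skipping a few samples per epoch.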

Positive Bag Negative Bag

Thanks for asking. Given an image with n regions, there are n*(n-1) region pairs. There may be k pairs related to predicate r, which is extracted from the caption. The k makes...
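As a quick check of the n*(n-1) count, the sketch below enumerates ordered pairs of distinct regions. Treating a pair as an ordered (subject, object) tuple of region indices is an assumption drawn from the count itself; `region_pairs` is a hypothetical helper, not a function from the repository.

```python
from itertools import permutations


def region_pairs(num_regions):
    """All ordered (subject, object) pairs of distinct region indices."""
    return list(permutations(range(num_regions), 2))


pairs = region_pairs(4)
print(len(pairs))  # 12, i.e. 4 * 3
print(pairs[:3])   # [(0, 1), (0, 2), (0, 3)]
```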

Hi there. Please refer to scripts/prepro_predicates.py for how to get the positive bags. That should answer your second question. For the first question, let's first clarify what "pair" and "bag" mean...

Positive Bag Negative Bag

Yeah, there is a matching process in the code.

Positive Bag Negative Bag

Exactly. Remember to use "data/coco_class_names.txt" to map between object labels and caption words.

Positive Bag Negative Bag

Given a predicate, the negative bag is the complement of the positive bag within the set of all pairs.
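The complement relation can be sketched in a few lines. This assumes pairs are ordered tuples of region indices and that the positive pairs for a predicate are already known. The helper name `split_bags` and the example pairs are hypothetical.

```python
from itertools import permutations


def split_bags(num_regions, positive_pairs):
    """Return (positive_bag, negative_bag) for one predicate in one image."""
    all_pairs = set(permutations(range(num_regions), 2))
    positive_bag = set(positive_pairs) & all_pairs
    # The negative bag is every remaining pair in the image.
    negative_bag = all_pairs - positive_bag
    return positive_bag, negative_bag


pos, neg = split_bags(3, [(0, 1), (2, 0)])
print(sorted(pos))  # [(0, 1), (2, 0)]
print(sorted(neg))  # [(0, 2), (1, 0), (1, 2), (2, 1)]
```

The two bags together always cover all n*(n-1) pairs and never overlap.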

Positive Bag Negative Bag

Exactly. You can use the same train/dev/test split as the Karpathy split. As we didn't have predicate annotations for each pair, I just used predicate recall over the whole image...

Positive Bag Negative Bag

Yes, I mean cut the Karpathy train split into train/dev/test, and use the caption predicates as reference.

Positive Bag Negative Bag

I would suggest adding 10 different top layer classifiers for each predicate and sharing other params between all predicates. That makes one model.
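A minimal, framework-free sketch of that layout is below: one shared layer plus ten per-predicate top-layer classifiers. Only the count of ten heads and the idea of sharing the rest come from the comment. The feature sizes, the single ReLU layer, and the random weights are illustrative assumptions, and a real model would use a deep-learning framework.

```python
import random

NUM_PREDICATES = 10  # one top-layer classifier per predicate (from the comment)
FEATURE_DIM = 8      # hypothetical size of a region-pair feature
HIDDEN_DIM = 4       # hypothetical size of the shared representation

rng = random.Random(0)


def random_matrix(rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


# Parameters shared by all predicates.
shared_weights = random_matrix(HIDDEN_DIM, FEATURE_DIM)
# One weight vector per predicate: the separate top-layer classifiers.
head_weights = random_matrix(NUM_PREDICATES, HIDDEN_DIM)


def shared_features(pair_feature):
    """Shared layer: ReLU(W x), computed once per region pair."""
    return [max(0.0, sum(w * x for w, x in zip(row, pair_feature)))
            for row in shared_weights]


def predicate_scores(pair_feature):
    """Score one region pair under every predicate-specific head."""
    hidden = shared_features(pair_feature)
    return [sum(w * h for w, h in zip(head, hidden)) for head in head_weights]


scores = predicate_scores([1.0] * FEATURE_DIM)
print(len(scores))  # 10, one score per predicate
```

Because the shared layer is computed once and reused by every head, all ten predicates live in a single model.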