Flickr30k Image and Text Retrieval - Query regarding training

Open gchhablani opened this issue 3 years ago • 2 comments

In this line, the answer is initialized to zeros and never changed. I don't understand how this works for both positive and negative examples.

Can someone please clarify how to use the output logit to perform a pseudo-classification task, i.e. deciding whether an image-text pair matches or not, with the Flickr30k checkpoint?

gchhablani avatar May 16 '22 06:05 gchhablani

@gchhablani Hi, bro, I'm also confused about this. Have you figured out why?

mactavish91 avatar Oct 23 '22 12:10 mactavish91

@gchhablani Hi, bro, I'm also confused about this. Have you figured out why?

DataminingdidiYR avatar Jan 07 '23 09:01 DataminingdidiYR
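
One possible reading of the zero-initialized answer, offered as a sketch rather than a confirmed explanation: if the retrieval scores for each image are arranged as one row with the positive caption in column 0 and the sampled negatives after it, then a target of all zeros is a class index, not a "no match" label. Cross-entropy against index 0 pushes the positive score above the negatives in the same row, so both kinds of example contribute to the loss. The snippet below illustrates this with plain Python; the score values and the helper names `cross_entropy` and `best_match` are hypothetical, and the positive-first layout is an assumption about the training code.

```python
import math


def cross_entropy(scores, target):
    """Softmax cross-entropy for one row of scores and an integer class index."""
    m = max(scores)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[target]


def best_match(scores):
    """Index of the highest-scoring candidate in a row."""
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical matching scores for one image:
# column 0 is the positive caption, the rest are sampled negatives (assumed layout).
good_row = [4.0, 0.5, -1.0]  # positive caption scored highest
bad_row = [0.5, 4.0, -1.0]   # a negative caption scored highest

target = 0  # the "answer": index of the positive caption, hence all zeros

loss_good = cross_entropy(good_row, target)
loss_bad = cross_entropy(bad_row, target)

print(loss_good, loss_bad)
print(best_match(good_row), best_match(bad_row))
```

Under this reading, the single logit is a ranking score: at inference time you would score every candidate caption (or image) and pick the highest. A hard match / no-match decision would need a threshold chosen on validation data, since a ranking loss of this kind does not by itself calibrate the score around zero.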