am3
A question about the model
Hi Xing, this is interesting work, but I want to ask: does the text information only influence the construction of the prototypes? If so, consider visually similar query samples, e.g. a komondor and a mop. Their feature embeddings, as extracted by the CNN, must be very similar. In that case, although the prototypes of komondor and mop become dissimilar once the text information is taken into account, the query embeddings for these two classes are still very similar to each other. Hence the final predictions for the two query samples will also be very similar, and the classes will still be confused.
I don't know whether the text information can also affect what the CNN learns.
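
To make the komondor/mop concern concrete, here is a minimal toy sketch in plain Python. It assumes each prototype is a convex combination of a visual prototype and a text embedding, and that queries are classified by a softmax over negative squared distances to the prototypes. That is only my reading of the method, not code from the repository. The function names, the 2-D vectors, and the mixing weight `lam` are all hypothetical values chosen for illustration.

```python
import math

def mix_prototype(visual_proto, text_embedding, lam):
    """Convex combination of a visual prototype and a text embedding (assumed form)."""
    return [lam * v + (1.0 - lam) * t for v, t in zip(visual_proto, text_embedding)]

def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(query, prototypes):
    """Softmax over negative squared distances from the query to each prototype."""
    logits = [-squared_distance(query, p) for p in prototypes]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy values: the visual prototypes are close together,
# while the text embeddings separate the two classes.
classes = ("komondor", "mop")
visual_protos = {"komondor": [1.0, 0.0], "mop": [0.9, 0.1]}
text_embeds = {"komondor": [0.0, 1.0], "mop": [0.0, -1.0]}
lam = 0.5  # hypothetical mixing weight

prototypes = [mix_prototype(visual_protos[c], text_embeds[c], lam) for c in classes]

# Query embeddings come from the CNN only, so text never enters them here.
query_komondor = [1.0, 0.02]
query_mop = [0.98, 0.05]

probs_komondor = predict(query_komondor, prototypes)
probs_mop = predict(query_mop, prototypes)

print("prototype separation:", squared_distance(prototypes[0], prototypes[1]))
print("komondor query:", probs_komondor)
print("mop query:", probs_mop)
```

In this toy setup the two mixed prototypes end up far apart, yet the two queries get almost the same class probabilities, and the mop query is assigned to komondor. This only restates the question under the stated assumptions; it does not show what happens when the CNN is trained end to end against the mixed prototypes.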