CLIP How can I get accuracy metrics when training?

Hello! Thank you for sharing this awesome model!

I am trying to fine-tune the CLIP model on my own dataset. Many people report accuracy when they evaluate their models, but the only metric I can get is the loss. Is there any reference code I can use to compute accuracy?

Thanks!

Feb 20 '22 21:02 dongyun-kim-arch

"Accuracy" during training probably means the proportion of training examples in a batch for which the model correctly predicted the contrastive label, e.g.:

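import torch
from torch.nn.functional import cross_entropy

# Matching image-text pairs lie on the diagonal, so the target for row i is i.
contrastive_label = torch.arange(batch_size, device=image_logits.device)

image_loss = cross_entropy(image_logits, contrastive_label)
text_loss = cross_entropy(text_logits, contrastive_label)

# Fraction of rows whose highest logit falls on the matching pair.
image_acc = (image_logits.argmax(dim=-1) == contrastive_label).float().mean()
text_acc = (text_logits.argmax(dim=-1) == contrastive_label).float().mean()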
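
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It assumes the logits are built from L2-normalized image and text features, with row i of each tensor belonging to the same pair. The function name contrastive_batch_metrics and the fixed logit_scale value are hypothetical and only for illustration; they are not part of the CLIP codebase.

```python
import torch
import torch.nn.functional as F


def contrastive_batch_metrics(image_features, text_features, logit_scale=100.0):
    """Return (loss, image_acc, text_acc) for one batch of paired features.

    Assumes row i of image_features matches row i of text_features.
    logit_scale is a hypothetical fixed temperature used for illustration.
    """
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)

    # Row i of image_logits scores image i against every caption in the batch.
    image_logits = logit_scale * image_features @ text_features.t()
    text_logits = image_logits.t()

    labels = torch.arange(image_logits.shape[0], device=image_logits.device)
    loss = (F.cross_entropy(image_logits, labels)
            + F.cross_entropy(text_logits, labels)) / 2

    # Mean over the batch turns the per-example hits into a proportion.
    image_acc = (image_logits.argmax(dim=-1) == labels).float().mean().item()
    text_acc = (text_logits.argmax(dim=-1) == labels).float().mean().item()
    return loss, image_acc, text_acc


# Perfectly aligned toy features: every image matches its own caption.
features = torch.eye(4)
loss, image_acc, text_acc = contrastive_batch_metrics(features, features)
print(image_acc, text_acc)
```

Note that this in-batch accuracy depends on the batch size (chance level is 1 / batch_size), so values are only comparable between runs that use the same batch size.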

Apr 11 '22 01:04 jongwook

@jongwook Hello! When fine-tuning the model on my own dataset of image-caption pairs, should I use 'image_acc = (image_logits.argmax(dim=-1) == contrastive_label).float()' directly?

Apr 18 '22 13:04 newbietuan