How to apply MiniGPT-v2 to ITM (image-text matching) tasks

Open Qia98 opened this issue 2 years ago • 0 comments

I would like to use MiniGPT-v2 for image classification. The idea is to compute the ITM (image-text matching) score between the image and each category text. Could you please guide me on how to obtain an ITM model for MiniGPT-v2, similar to BLIP and BLIP2?

BLIP-itm: https://github.com/salesforce/LAVIS/blob/main/examples/blip_image_text_matching.ipynb
BLIP2-itm: https://github.com/salesforce/LAVIS/blob/main/examples/blip2_image_text_matching.ipynb

These two models directly compute the ITM (image-text matching) score between an image and a category text, which makes them suitable for image classification. They remove the LLM (large language model) module. How can MiniGPT-v2 achieve a similar implementation?
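
For reference, here is a minimal sketch of the classification step I have in mind. It is not MiniGPT-v2 or LAVIS code: `itm_score` is a hypothetical callable standing in for whatever model returns a raw matching score for an (image, text) pair, and the prompt template `"a photo of a {}"` is only an assumed default. The sketch scores one caption per category, applies a softmax over the candidates, and returns the best label.

```python
import math
from typing import Callable, List, Sequence, Tuple


def classify_by_itm(
    image: object,
    labels: Sequence[str],
    itm_score: Callable[[object, str], float],  # hypothetical scorer, not a real API
    template: str = "a photo of a {}",          # assumed prompt template
) -> Tuple[str, List[float]]:
    """Return the best-matching label and a probability per label."""
    if not labels:
        raise ValueError("labels must not be empty")

    # One raw matching score per candidate caption.
    logits = [itm_score(image, template.format(label)) for label in labels]

    # Numerically stable softmax over the candidate captions.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # The predicted class is the caption with the highest probability.
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs
```

With BLIP or BLIP2, `itm_score` would wrap the ITM head from the notebooks linked above; what I am missing is an equivalent scorer for MiniGPT-v2.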

Qia98 • Feb 01 '24 09:02