Comparing image-text pairs

Open havardox opened this issue 2 years ago • 2 comments

I'm exploring CLIP for similar-product retrieval, using both a product's description and its image as the query. As I understand it, CLIP excels at image-to-text and text-to-image retrieval, but I'm curious whether it can handle a combined text-and-image input. Is this possible with CLIP, and does anyone have examples?

Nov 13 '23 02:11 havardox

Google image search supports searching by image and text together, so it can do this.

Nov 14 '23 02:11 Suasy

Please check this: https://github.com/openai/CLIP/issues/115.

May 05 '24 06:05 shyammarjit
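
For anyone landing here: one simple approach to try (not something confirmed in this thread) is late fusion. Since CLIP's `model.encode_image` and `model.encode_text` map into the same embedding space, you can L2-normalize both vectors, take a weighted average, re-normalize, and then rank catalog items by cosine similarity. Below is a minimal sketch of just the fusion and ranking step. The names `fuse`, `rank`, and `alpha` are hypothetical, and the 3-dimensional vectors are toy stand-ins for real CLIP embeddings.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def fuse(image_emb, text_emb, alpha=0.5):
    """Weighted average of unit-length image and text embeddings.

    alpha is a hypothetical mixing weight: 1.0 uses only the image,
    0.0 uses only the text. The result is re-normalized to unit length.
    """
    if len(image_emb) != len(text_emb):
        raise ValueError("embeddings must have the same dimension")
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    mixed = [alpha * i + (1.0 - alpha) * t for i, t in zip(img, txt)]
    return l2_normalize(mixed)


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


def rank(query, catalog):
    """Return (name, score) pairs sorted from most to least similar."""
    scores = [(name, cosine(query, emb)) for name, emb in catalog.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for CLIP embeddings of two catalog products.
catalog = {
    "red sneaker": fuse([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]),
    "blue kettle": fuse([0.0, 1.0, 0.0], [0.0, 1.0, 0.0]),
}

# Query product: image and text embeddings fused the same way.
query = fuse([1.0, 0.1, 0.0], [0.9, 0.0, 0.1])
ranked = rank(query, catalog)
print(ranked[0][0])
```

Whether an equal weighting works well for product data is an open question. `alpha` would need tuning on your own catalog, and concatenating the two embeddings instead of averaging them is another option worth comparing.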