CLIP
Comparing image-text pairs
I'm exploring CLIP for similar-product retrieval, using a product's description and image together as the query. As I understand it, CLIP excels at image-to-text and text-to-image retrieval, but I'm not sure whether it can handle a combined text-and-image input. Is this possible with CLIP, and does anyone have examples?
"Google image search by image & text" can do this.
Please check this: https://github.com/openai/CLIP/issues/115.
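Since the question asks for an example, here is a minimal sketch of one common heuristic for comparing image-text pairs: embed the image and the description separately, L2-normalize each, take a weighted average, and compare the fused vectors by cosine similarity. This is an assumption on my part, not something confirmed in this thread or by the linked issue. The helper names (`fuse`, `rank`), the `image_weight` parameter, and the three-dimensional toy vectors are all hypothetical; in practice the vectors would come from the image and text encoders (in the openai/CLIP package, `model.encode_image` and `model.encode_text`).

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def fuse(image_emb, text_emb, image_weight=0.5):
    """Weighted average of the normalized image and text embeddings.

    image_weight is a hypothetical tuning knob: 1.0 uses only the image,
    0.0 uses only the text.
    """
    if len(image_emb) != len(text_emb):
        raise ValueError("embeddings must have the same dimension")
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    mixed = [image_weight * i + (1.0 - image_weight) * t for i, t in zip(img, txt)]
    return l2_normalize(mixed)


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


def rank(query, catalog):
    """Return (name, score) pairs sorted from most to least similar."""
    scores = [(name, cosine(query, emb)) for name, emb in catalog.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for real CLIP embeddings (assumption: real ones would come
# from the image and text encoders and have several hundred dimensions).
query = fuse([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
catalog = {
    "same_image_same_text": fuse([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),
    "other_image_same_text": fuse([0.0, 0.0, 1.0], [0.0, 1.0, 0.0]),
    "other_image_other_text": fuse([0.0, 0.0, 1.0], [0.0, 0.0, 1.0]),
}
ranking = rank(query, catalog)
```

With these toy vectors the product that matches on both image and text ranks first, the one that matches only on text ranks second, and the one that matches on neither ranks last. Whether a simple average works well on real product data is something you would have to evaluate; concatenating the two embeddings or tuning `image_weight` are alternatives worth trying.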