[Feature Request]: Image to Description Model

Open snapo opened this issue 3 years ago • 3 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Contact Details

Through Github

What should this feature add?

Is there a feature planned to automatically create descriptions of an image (the reverse of what users currently do with text-to-image)? Sometimes I have an original image and would like to get the same style as the image I already have, along with a description of the image to see what the AI thinks. Would something like this be possible? I am not talking about image-to-image style transfer; I would really be interested in an image-to-text description that is as accurate as possible.

Alternatives

No response

Additional Content

No response

snapo avatar Dec 20 '22 09:12 snapo

CLIP Interrogator
https://colab.research.google.com/github/pharmapsychotic/clip-interrogator/blob/main/clip_interrogator.ipynb

BLIP
https://huggingface.co/spaces/Salesforce/BLIP

n00mkrad avatar Dec 20 '22 17:12 n00mkrad

@n00mkrad thank you very, very much. I didn't know that such a model already exists. BLIP "image captioning" is exactly what I was looking for :-) I just noticed the descriptions are extremely short; I was expecting a description of 5-10 sentences.

For example, for a photo of a person: "A photo of a person who is around 62 years old; the person's build is slim. The hair color is dark brown and the hair reaches down to the shoulders. The person wears red jeans with a black belt and a white shirt with pockets. The background shows he is walking on a stone-paved path near a river where the water is blue and there is a lot of vegetation like trees and bushes, probably somewhere in Japan. The sun is setting and he seems to be freezing."

snapo avatar Dec 21 '22 04:12 snapo

Shouldn't this be integrated as a feature in InvokeAI?

ParisNeo avatar Feb 15 '23 10:02 ParisNeo

There has been no activity in this issue for 14 days. If this issue is still being experienced, please reply with an updated confirmation that the issue is still being experienced with the latest release.

github-actions[bot] avatar Mar 13 '23 06:03 github-actions[bot]

This would still be really nice to have; the other web UI already has this feature as an extension. This should absolutely be a feature.

Randomblock1 avatar Apr 15 '23 02:04 Randomblock1

Adding CLIP Interrogator or BLIP as a tab or feature implemented directly in the web interface would be an excellent addition.

RadioactiveDeveloper avatar May 03 '23 19:05 RadioactiveDeveloper