Same vector embedding output for different text inputs

Open raise-hanct opened this issue 2 years ago • 3 comments

I tested imagebind hug with song lyrics (about 30s each) and found that some different lyrics produce the same embedding. For example, the two inputs below got the same embedding:

Input 1: "Yo, I'll tell you what I want, what I really, really want So tell me what you want, what you really, really want I'll tell you what I want, what I really, really want So tell me what you want, what you really, really want I wanna, (ha) I wanna, (ha) I wanna, (ha) I wanna, (ha) I wanna really, really, really wanna zigazig ah If you want my future, forget my past If you wanna get with me, better make it fast Now don't go wasting my precious time Get your act together we could be just fine"

Input 2: "Just like fire, burning up the way If I can light the world up for just one day Watch this madness, colorful charade No one can be just like me any way Just like magic, I'll be flying free I'ma disappear when they come for me I kick that ceiling, what you gonna say? No one can be just like me any way Just like fire, uh"

Output embedding: [[-0.5404723 1.5690608 2.6174846 ... 2.7306266 0.41771093 0.2987784 ]]

Is this a bug in the model? Or is it because my input sentences are too long?

raise-hanct avatar Sep 05 '23 04:09 raise-hanct

Same issue

tringo-fika avatar Sep 05 '23 04:09 tringo-fika

Can you please share your code?

vzapylikhin avatar Sep 23 '23 14:09 vzapylikhin

I also notice this happening for longer sequences: #82. Seems like truncation is not handled properly in the code.

bakachan19 avatar Dec 10 '23 20:12 bakachan19
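
To make the truncation point above concrete, here is a minimal, self-contained sketch of how cutting a token sequence to a fixed window can drop the end-of-text marker, and one way to avoid that. Everything in it is hypothetical: the whitespace tokenizer, the special-token ids, and the window size of 8 are made up for illustration and are not ImageBind's API. Nothing in this thread confirms that this is the actual cause of the identical embeddings.

```python
# Toy illustration only: a whitespace "tokenizer" with made-up special-token ids.
# None of these names or values come from ImageBind.
SOT_ID = 1          # hypothetical start-of-text id
EOT_ID = 2          # hypothetical end-of-text id
PAD_ID = 0          # hypothetical padding id
CONTEXT_LENGTH = 8  # deliberately tiny window so the effect is easy to see


def toy_encode(text, vocab):
    """Map each whitespace-separated word to an integer id (ids start at 3)."""
    return [vocab.setdefault(word, len(vocab) + 3) for word in text.lower().split()]


def naive_truncate(ids, context_length=CONTEXT_LENGTH):
    """Wrap with SOT/EOT, then cut: EOT is lost when the input is too long."""
    tokens = [SOT_ID] + ids + [EOT_ID]
    tokens = tokens[:context_length]
    return tokens + [PAD_ID] * (context_length - len(tokens))


def safe_truncate(ids, context_length=CONTEXT_LENGTH):
    """Cut the content first so that SOT and EOT always fit in the window."""
    tokens = [SOT_ID] + ids[: context_length - 2] + [EOT_ID]
    return tokens + [PAD_ID] * (context_length - len(tokens))


def was_truncated(ids, context_length=CONTEXT_LENGTH):
    """True if the content does not fit alongside SOT and EOT."""
    return len(ids) > context_length - 2


vocab = {}
long_ids = toy_encode("just like fire burning up the way if i can light the world", vocab)
short_ids = toy_encode("just like fire", vocab)

naive_long = naive_truncate(long_ids)   # window is full, EOT has been cut off
safe_long = safe_truncate(long_ids)     # window is full, EOT is the last token
safe_short = safe_truncate(short_ids)   # fits, padded after EOT

print("truncated?", was_truncated(long_ids), was_truncated(short_ids))
print("naive:", naive_long)
print("safe: ", safe_long)
```

A check like `was_truncated` can be run on each input before embedding it, so that over-long lyrics are flagged (or split into shorter chunks) instead of being cut silently.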