ImageBind
Image to audio
Which decoder would work best for going from an image to an audio embedding, and then from that embedding to the actual sound?
Meta's AudioCraft is a reasonable starting point. You will need to train a mapping between the two embedding spaces, since an ImageBind embedding can't be fed to it directly.
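To make "training to convert between the two embedding spaces" concrete, here is a rough sketch of the simplest version: fitting a linear map from one space to the other on paired examples. Everything in it is an assumption for illustration. The dimensions are toy-sized, the paired embeddings are synthetic, and `fit_linear_map` / `apply_map` are hypothetical helpers, not part of ImageBind or AudioCraft. In practice you would collect real (ImageBind embedding, target conditioning embedding) pairs and probably use a small nonlinear network instead.

```python
import random

def apply_map(W, x):
    """Project one source embedding x into the target space with matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def fit_linear_map(src, dst, lr=0.05, epochs=500):
    """Fit W (d_out x d_in) by full-batch gradient descent on mean squared error."""
    d_in, d_out, n = len(src[0]), len(dst[0]), len(src)
    W = [[0.0] * d_in for _ in range(d_out)]
    for _ in range(epochs):
        grad = [[0.0] * d_in for _ in range(d_out)]
        for x, y in zip(src, dst):
            pred = apply_map(W, x)
            for i in range(d_out):
                err = pred[i] - y[i]
                for j in range(d_in):
                    grad[i][j] += 2.0 * err * x[j] / n
        for i in range(d_out):
            for j in range(d_in):
                W[i][j] -= lr * grad[i][j]
    return W

# Synthetic stand-ins for paired embeddings (assumption: toy sizes 4 -> 3).
rng = random.Random(0)
D_IN, D_OUT, N = 4, 3, 50
W_true = [[rng.gauss(0, 1) for _ in range(D_IN)] for _ in range(D_OUT)]
src = [[rng.gauss(0, 1) for _ in range(D_IN)] for _ in range(N)]  # "image-side" embeddings
dst = [apply_map(W_true, x) for x in src]                         # "audio-side" targets

W = fit_linear_map(src, dst)

# Largest absolute difference between the learned and the true matrix entries.
max_err = max(abs(W[i][j] - W_true[i][j])
              for i in range(D_OUT) for j in range(D_IN))
print("max abs error in recovered map:", max_err)
```

The learned `W` would then sit between the image encoder and whatever generator you pick: embed the image, apply the map, and pass the result in as conditioning. Whether a given AudioCraft model can accept such a vector without further fine-tuning is something you would have to check.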