Environment encoder V in CoDi
Hello, thanks for sharing this work!
I am trying to figure out one detail in CoDi. Is the environment encoder V from the paper implemented as clap_encode_audio, like this?
From what I understand, the environment encoder V is part of the overall encoder for each modality. For the audio modality, both the modality encoder and the environment encoder (or a later projection into the same space) are already included in the clap_encode_audio function. The text and image encoders follow the same pattern: first the encoding, then a projection layer.
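To illustrate the "encode, then project" pattern, here is a minimal toy sketch in plain Python. It is not the CoDi code: `ToyAudioEncoder`, its methods, the dimensions, the random weights, and the final L2 normalization are all hypothetical and only show how one encode call can wrap both a backbone and a projection into a shared space.

```python
import math
import random


class ToyAudioEncoder:
    """Hypothetical stand-in: a backbone followed by a projection layer."""

    def __init__(self, in_dim, hidden_dim, shared_dim, seed=0):
        rng = random.Random(seed)
        # Random weights stand in for a pretrained backbone and projection.
        self.w_backbone = [[rng.gauss(0, 1) for _ in range(in_dim)]
                           for _ in range(hidden_dim)]
        self.w_proj = [[rng.gauss(0, 1) for _ in range(hidden_dim)]
                       for _ in range(shared_dim)]

    @staticmethod
    def _matvec(weights, vector):
        # Plain matrix-vector product, one output value per weight row.
        return [sum(w * v for w, v in zip(row, vector)) for row in weights]

    def backbone(self, features):
        # Step 1: modality-specific encoding.
        return [math.tanh(v) for v in self._matvec(self.w_backbone, features)]

    def project(self, hidden):
        # Step 2: projection into the shared space.
        # The L2 normalization is an assumption made for this sketch.
        z = self._matvec(self.w_proj, hidden)
        norm = math.sqrt(sum(v * v for v in z)) or 1.0
        return [v / norm for v in z]

    def encode(self, features):
        # Callers see one function, but both steps run inside it.
        return self.project(self.backbone(features))


encoder = ToyAudioEncoder(in_dim=8, hidden_dim=16, shared_dim=4)
embedding = encoder.encode([0.1 * i for i in range(8)])
print(len(embedding))
```

In this sketch `encode` plays the role that clap_encode_audio plays in the discussion above: the caller never sees the projection as a separate step.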