Wassim (Wes) Bouaziz
Thanks for the quick update ;) I'll have a look at it and let you know if there's an issue ;)
@lucidrains thanks for your response! I have completed the first bullet point, but I'd be more than glad to see your implementation of this!
I completed (I think) both points, you can see it [here](https://github.com/wesbz/SoundStream)
Hi, thanks for notifying me! I'll try to use your version, since it seems very well coded and it would avoid needless code duplication in my implementation of SoundStream :joy: I'll...
Also! Regarding your implementation of factorized codes and $l_2$-normalized codes, I don't think they're used in SoundStream, so I won't have the occasion to test them :sweat_smile:
Oh and by the way, something still missing (but I do not need it so I didn't mention it in my first comment) from the SoundStream article is the bitrate...
Yes, I was also wondering about the first use of EMA. I don't know what is commonly done when working with these statistics. Regarding the threshold, what do you mean...
In SoundStream, the encoder is built in such a way that a 24 kHz waveform is transformed into a 75 Hz embedding. So you only need a batch of 16×1 second audio samples...
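As a rough sanity check of those numbers, here is a minimal sketch of the arithmetic. It assumes the encoder strides are (2, 4, 5, 8), which is my reading of the paper and not something taken from this thread, and it ignores any padding effects at the clip boundaries. The names `ENCODER_STRIDES` and `frames_per_clip` are made up for illustration.

```python
import math

SAMPLE_RATE_HZ = 24_000
# Assumption: per-block strides of the encoder; their product is the hop size.
ENCODER_STRIDES = (2, 4, 5, 8)


def frames_per_clip(num_samples, strides=ENCODER_STRIDES):
    """Number of embedding frames for a clip, ignoring padding at the edges."""
    hop = math.prod(strides)  # total downsampling factor
    return num_samples // hop


hop = math.prod(ENCODER_STRIDES)    # samples per embedding frame
frame_rate_hz = SAMPLE_RATE_HZ / hop  # embedding frames per second

batch_size = 16
clip_seconds = 1
# Total number of embedding vectors the quantizer sees in one batch.
vectors_per_batch = batch_size * frames_per_clip(SAMPLE_RATE_HZ * clip_seconds)

print(hop, frame_rate_hz, vectors_per_batch)
```

Under these assumptions, one batch of 16 one-second clips yields 16 × 75 embedding vectors.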
How is this going? :)
:+1: I encounter the same issue!