Provide speaker embeddings to produce mel

Open savinay opened this issue 4 years ago • 4 comments

Can we provide our own speaker embeddings to produce mel spectrograms with Flowtron, rather than using the speaker embeddings generated by Flowtron? If so, how should we normalize those embeddings?

Mar 15 '21 21:03 savinay

Yes. Re-train the pre-trained Flowtron LibriTTS2K model using your own speaker embeddings, preferably at least on LibriTTS2K.

Mar 16 '21 23:03 rafaelvalle

Thanks for your reply! What settings do I need in order to provide my own speaker embeddings? I have trained another model that generates speaker embeddings for the LibriTTS dataset, and I would like to use those embeddings to train the Flowtron model.

Mar 17 '21 21:03 savinay

There's no pre-written code for loading speaker embeddings externally. You'll need to change flowtron.py.

Mar 17 '21 22:03 rafaelvalle
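
For readers wondering what such a change to flowtron.py might look like, here is a minimal sketch, not the maintainers' method. It assumes the model looks up speaker vectors through a `torch.nn.Embedding` table indexed by speaker id, and that the external vectors have the same dimension the model config expects. The helper `build_speaker_table`, the variable names, and the sizes (4 speakers, 128 dimensions) are hypothetical and chosen only for illustration; the random tensor stands in for vectors from a separate speaker encoder.

```python
import torch


def build_speaker_table(external_embeddings, freeze=True):
    """Wrap precomputed speaker vectors in an nn.Embedding lookup table.

    external_embeddings: tensor of shape (n_speakers, n_speaker_dim),
    where row i holds the vector for speaker id i.
    freeze=True keeps the vectors fixed during training.
    """
    if external_embeddings.dim() != 2:
        raise ValueError("expected a (n_speakers, n_speaker_dim) tensor")
    return torch.nn.Embedding.from_pretrained(
        external_embeddings.float(), freeze=freeze
    )


# Hypothetical stand-in for embeddings produced by an external model.
torch.manual_seed(0)
n_speakers, n_speaker_dim = 4, 128  # assumed sizes, not taken from the thread
external = torch.randn(n_speakers, n_speaker_dim)

speaker_table = build_speaker_table(external)

# Looking up speaker ids returns the externally supplied vectors unchanged.
speaker_ids = torch.tensor([2, 0])
speaker_vecs = speaker_table(speaker_ids)  # shape: (2, 128)
```

Under these assumptions, the resulting module could replace the model's own speaker embedding table before re-training. Whether to freeze the vectors or let them be fine-tuned is a design choice the thread does not address.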

Thank you very much! That helps.

Mar 17 '21 23:03 savinay