TransferTTS
Speech synthesis results
Hello @hcy71o,
I liked your work on TransferTTS and SC VITS. I have trained a model for 350,000 steps using only the LibriTTS train-clean-100 dataset, but when I synthesize speech using a random audio file, the output is not clear.
So, my questions are:
- How many steps did you train your model for?
- How long (in duration) should the audio files be when passing them to inference.py?
- Should the reference audio come from a speaker in the training data, or can it be from an unseen speaker?
- Do you have a demo page where we can compare audio generated by TransferTTS with audio generated by VITS?
Thanks