vits
Training an 8 kHz model like LJSpeech
I have a custom English dataset at 22050 Hz, and the model I trained on it works very well.
But when I downsampled the original files from 22050 Hz to 8000 Hz and trained the model, the generated TTS output is very wrong (the speech is unintelligible). Can anyone tell me what the issue could be?
P.S. I updated the sampling_rate from 22050 to 8000. Do I need to update any other settings?
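For reference, this is roughly what the audio section of my config looks like after the change (a minimal sketch assuming Coqui TTS's VitsConfig / VitsAudioConfig; exact field names and defaults may differ between library versions). So far only sample_rate has been touched, everything else is still at the 22050 Hz LJSpeech defaults, and I am unsure whether the STFT settings or the decoder upsample rates (whose product has to match hop_length) also need to be scaled down for 8 kHz:

```python
# Sketch of my current audio config for the 8 kHz run.
# (Assumes Coqui TTS's VitsConfig / VitsAudioConfig; field names and
#  default values are illustrative and may differ between versions.)
from TTS.tts.configs.vits_config import VitsConfig
from TTS.tts.models.vits import VitsAudioConfig

audio_config = VitsAudioConfig(
    sample_rate=8000,  # changed from 22050 -- the only field I updated
    fft_size=1024,     # still the 22050 Hz LJSpeech default
    win_length=1024,   # still the 22050 Hz LJSpeech default
    hop_length=256,    # still the 22050 Hz LJSpeech default
    num_mels=80,
    mel_fmin=0,
    mel_fmax=None,     # None -> sample_rate / 2
)

config = VitsConfig(audio=audio_config)
```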
My goal is to synthesize the wav file directly at 8000 Hz rather than 22050 Hz, so I can skip the step of generating at 22050 Hz and then resampling to 8000 Hz.
An audio sample of the synthesized speech at 8000 Hz is here