Any guidelines for tuning noise_scale_w?

Open TinaChen95 opened this issue 2 years ago • 2 comments

I found that adjusting noise_scale_w affects the smoothness of the synthesized speech. When noise_scale_w is close to 1, the speech is slower and more intermittent. When noise_scale_w is close to 0, the speech is fast and the intonation is flat. Do you have any experience with how to adjust noise_scale_w?
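For context, here is a toy sketch of why a larger noise_scale_w might make speech both slower and less even. As far as I understand the VITS code, noise_scale_w scales the Gaussian noise fed to the stochastic duration predictor, and the predicted log-duration is then exponentiated and rounded up to whole frames. The sketch below replaces the predictor's learned flow with a simple affine map, so `mu`, `sigma`, and `sample_durations` are illustrative assumptions, not part of VITS.

```python
import math
import random


def sample_durations(noise_scale_w, n_phonemes=2000, mu=1.0, sigma=0.8, seed=0):
    """Toy stand-in for the duration predictor (assumption, not the real flow).

    log-duration = mu + sigma * noise_scale_w * eps, with eps ~ N(0, 1);
    duration in frames = ceil(exp(log-duration)).
    """
    rng = random.Random(seed)
    durations = []
    for _ in range(n_phonemes):
        log_w = mu + sigma * noise_scale_w * rng.gauss(0.0, 1.0)
        durations.append(math.ceil(math.exp(log_w)))
    return durations


def summarize(durations):
    """Return (mean, standard deviation) of the per-phoneme frame counts."""
    mean = sum(durations) / len(durations)
    var = sum((d - mean) ** 2 for d in durations) / len(durations)
    return mean, math.sqrt(var)


for w in (0.0, 0.5, 1.0):
    mean, std = summarize(sample_durations(w))
    print(f"noise_scale_w={w:.1f}  mean frames={mean:.2f}  std={std:.2f}")
```

In this toy model, noise_scale_w=0 gives every phoneme the same duration (flat, fast rhythm), while larger values raise both the spread and the average duration, because exponentiating symmetric noise skews durations upward. A real model's flow is learned, so the actual behavior may differ.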

TinaChen95 • Jun 11 '23 02:06

Use a translator to read the description of the generation parameters here: https://github.com/w4123/vits

nikich340 • Jun 12 '23 04:06

Thanks very much. Does that mean we can only try multiple values and listen to the audio to choose one? Or is there a better way to decide?

Also, I've found that some datasets work fine with noise_scale_w=1, but with other datasets the synthesized speech stutters. Why is this? Does that mean I should train longer?
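To make the trial-and-listen approach a bit more systematic, a small sweep like the one below could help. It is only a sketch: `synthesize` is a hypothetical wrapper you would write around your own model (for example around `net_g.infer(..., noise_scale_w=w)` as in the VITS inference notebook), and `fake_synthesize` exists only so the example runs without a model. It records the output length per value, which gives a rough speaking-rate number to compare alongside listening.

```python
from typing import Callable, Dict, Sequence


def sweep_noise_scale_w(
    synthesize: Callable[[str, float], Sequence[float]],
    text: str,
    values: Sequence[float] = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
    sample_rate: int = 22050,
) -> Dict[float, float]:
    """Synthesize `text` once per noise_scale_w value.

    Returns a mapping from each value to the audio length in seconds.
    `synthesize(text, w)` is a user-supplied callable (hypothetical here)
    that returns the waveform as a sequence of samples.
    """
    seconds = {}
    for w in values:
        audio = synthesize(text, w)
        seconds[w] = len(audio) / sample_rate
    return seconds


def fake_synthesize(text: str, noise_scale_w: float) -> Sequence[float]:
    """Placeholder model: output length grows with noise_scale_w (made up)."""
    n_samples = int(22050 * (1.0 + noise_scale_w))
    return [0.0] * n_samples


results = sweep_noise_scale_w(fake_synthesize, "hello world")
for w, secs in results.items():
    print(f"noise_scale_w={w:.1f}  duration={secs:.2f}s")
```

Since sampling is random, running each value with a few different seeds and the same set of test sentences would give a fairer comparison than a single clip per value.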

TinaChen95 • Jun 15 '23 09:06