vits: Questions about training on 44 kHz audio files

Hello,

I trained at 44 kHz to get higher-quality VC, because the results were good when I trained on VCTK at 22 kHz.

This time, however, TTS inference reads the text very quickly.

Can you tell me whether there are any parameters I need to adjust when training on 44 kHz audio rather than 22 kHz audio?

Jul 19 '21 02:07 rlarhk147

You probably need to set the parameter "sampling_rate" in the data section of the config file to match your audio before training.
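As a rough sketch of what that adjustment could look like: the snippet below rescales a "data" config section from one sampling rate to another. Only "sampling_rate" is named in this thread; the other keys ("filter_length", "hop_length", "win_length"), their values, and the helper name `scale_data_config` are assumptions for illustration, so check them against your actual config file. Scaling the window and hop sizes along with the rate keeps each spectrogram frame covering about the same amount of time. If the decoder's upsampling factors are tied to the hop length, they would presumably need to change as well.

```python
import json


def scale_data_config(data_cfg, new_rate):
    """Return a copy of a 'data' config section rescaled to new_rate.

    Hypothetical helper: STFT-related sizes are scaled by the same factor
    as the sampling rate so that frame durations stay roughly constant.
    """
    factor = new_rate / data_cfg["sampling_rate"]
    scaled = dict(data_cfg)  # shallow copy, the input is left untouched
    scaled["sampling_rate"] = new_rate
    # Assumed key names; skip any that are missing from the config.
    for key in ("filter_length", "hop_length", "win_length"):
        if key in scaled:
            scaled[key] = int(round(scaled[key] * factor))
    return scaled


# Assumed example values for a 22.05 kHz setup, not taken from this thread.
base = {
    "sampling_rate": 22050,
    "filter_length": 1024,
    "hop_length": 256,
    "win_length": 1024,
}
scaled = scale_data_config(base, 44100)
print(json.dumps(scaled, indent=2))
```

The audio files themselves also have to be at the configured rate, so it is worth checking them before starting a long training run.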

Anyway, the human speaking voice carries little relevant information above 8 kHz, and a 22 kHz sampling rate already captures frequencies up to about 11 kHz (the Nyquist limit), so in my humble opinion 22 kHz is sufficient.
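The arithmetic behind this argument is the Nyquist relationship: a signal sampled at a given rate can represent frequencies up to half that rate. A minimal sketch, assuming "22KHz" and "44KHz" mean the common 22050 Hz and 44100 Hz rates (the function name is made up for illustration):

```python
def nyquist_hz(sampling_rate_hz):
    """Highest frequency representable at the given sampling rate."""
    return sampling_rate_hz / 2


# Assumed rates: 22050 Hz and 44100 Hz.
for rate in (22050, 44100):
    print(f"{rate} Hz sampling -> content up to {nyquist_hz(rate):.0f} Hz")
```

Both rates cover the 8 kHz band mentioned above. The higher rate only adds content above roughly 11 kHz.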

Aug 29 '21 12:08 domcross

What batch size did you use, and how much VRAM did it take on your GPU?

Nov 23 '21 16:11 nikich340

@rlarhk147 How did your training with the 44 kHz audio files go? Did it produce good results?

Oct 05 '22 04:10 tuannvhust