FastSpeech2

My generated outputs all have a beeping sound, although the alignment is correct.

Open wolfassi123 opened this issue 3 years ago • 3 comments

I have been training on my own custom data for a while now. I used an aligner, and the alignment seems to be working fine. I added the TextGrid files to the model and trained for around 2 hours on a GPU (I have around 40 minutes of augmented data), but all of my synthesized outputs come out as beeps. Any idea how to solve this issue? Should I be using more data? More training time? Is my data bad?

wolfassi123 avatar May 10 '22 22:05 wolfassi123

@wolfassi123 Were you able to fix it? I am also getting a beeping sound when training a model on the LJSpeech dataset.

samin9796 avatar Jun 09 '22 13:06 samin9796

Were you able to fix it? I faced the same problem. @wolfassi123 @samin9796

zaynabmu avatar Jun 19 '22 14:06 zaynabmu

@zaynabmu @samin9796 Did you fix it? I face this problem when I use the frame-level features of pitch and energy. The quality of the synthesized audio (including train and val data) is good in the training phase, but the quality of the audio synthesized in the inference phase is bad.

hhm853610070 avatar Mar 12 '23 13:03 hhm853610070