Use another vocoder to produce audio from a sequence?

Open EuphoriaCelestial opened this issue 4 years ago • 0 comments

As the title says, can I use a trained FastSpeech model (phoneme level, HiFi-GAN vocoder) with another vocoder, such as NVIDIA's WaveGlow? I would like to use WaveGlow's denoiser module, because the audio my model generates with HiFi-GAN contains a slight high-frequency hiss that sounds like someone drawing a breath through their teeth. It only appears at punctuation. I trained on the same dataset I used for Tacotron 2 (with WaveGlow as the vocoder). That setup has the same problem, but there it is easily fixed with the denoiser module.

Oct 11 '21 10:10 EuphoriaCelestial