FastSpeech2
Use another vocoder to produce audio from sequence?
As the title says, can I use a trained FastSpeech model (phoneme level, HiFi-GAN vocoder) with another vocoder, such as NVIDIA's WaveGlow? I would like to use WaveGlow's denoiser module, because the audio generated by my model with HiFi-GAN contains a slight high-frequency hiss that sounds like someone drawing a breath through their teeth. It only appears at punctuation. I trained on the same dataset I used for Tacotron2 (with WaveGlow as the vocoder). That model has the same problem, but there it is easily fixed with the denoiser module.
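
For context, this kind of denoising is essentially spectral subtraction on the finished waveform, so in principle it does not depend on which vocoder produced the audio. Below is a minimal NumPy sketch of the idea, not WaveGlow's actual code: the function names, the `strength` value, and the STFT settings are all illustrative assumptions. It also assumes a "noise profile" is available, for example audio obtained by running the vocoder on a silent (all-zero) mel spectrogram. The sketch averages the noise magnitude over frames, which is a simplification.

```python
import numpy as np

def stft(x, n_fft=1024, hop=256):
    """Hann-windowed STFT; returns an array of shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    x = np.pad(x, n_fft // 2, mode="reflect")  # centre frames on the samples
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=1024, hop=256, length=None):
    """Inverse STFT by windowed overlap-add with squared-window normalisation."""
    window = np.hanning(n_fft)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    out = np.zeros(n_fft + hop * (len(frames) - 1))
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame * window
        norm[i * hop:i * hop + n_fft] += window ** 2
    out /= np.maximum(norm, 1e-8)
    return out[n_fft // 2:][:length]  # drop the padding added in stft()

def denoise(audio, noise_profile, strength=0.1, n_fft=1024, hop=256):
    """Subtract a scaled per-bin noise magnitude; keep the original phase."""
    spec = stft(audio, n_fft, hop)
    mag, phase = np.abs(spec), np.angle(spec)
    # Assumption: average noise magnitude per frequency bin is a good bias estimate.
    bias = np.abs(stft(noise_profile, n_fft, hop)).mean(axis=0)
    mag_clean = np.maximum(mag - strength * bias, 0.0)  # clamp at zero
    return istft(mag_clean * np.exp(1j * phase), n_fft, hop, length=len(audio))

# Synthetic demo: a 220 Hz tone plus broadband hiss (values are made up).
rng = np.random.default_rng(0)
sr = 22050
t = np.arange(8000) / sr
hiss = 0.02 * rng.standard_normal(8000)
noisy = 0.5 * np.sin(2 * np.pi * 220 * t) + hiss
cleaned = denoise(noisy, noise_profile=hiss, strength=1.0)
```

With real model output, `noisy` would be the HiFi-GAN waveform and `noise_profile` the vocoder's response to silence. A higher `strength` removes more hiss but can also dull consonants, so it would need tuning by ear.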