inconnu11
Hi, I synthesized converted speech from these three models, VAE, CDVAE, and CDVAE-CLS-GAN, separately. The results of the CDVAE-CLS-GAN model sound the worst. Is it supposed to be like this? Or anything...
If the lengths of the content code, rhythm code, and pitch code differ from each other, how are they aligned, given that there is no attention mechanism in the decoder?
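One common way to combine codes of different lengths without attention is to upsample each code to a shared frame rate (e.g. by nearest-neighbor repetition) and then concatenate them channel-wise. A minimal sketch of that idea, not the authors' actual implementation (shapes are hypothetical):

```python
import numpy as np

def upsample_nearest(code, target_len):
    # code: (channels, length) -> (channels, target_len) by repeating indices
    idx = np.arange(target_len) * code.shape[1] // target_len
    return code[:, idx]

def align_codes(codes, target_len):
    # stretch every code to the same length, then stack along channels
    return np.concatenate([upsample_nearest(c, target_len) for c in codes], axis=0)

content = np.random.randn(8, 32)  # hypothetical code shapes
rhythm = np.random.randn(4, 16)
pitch = np.random.randn(2, 64)
aligned = align_codes([content, rhythm, pitch], target_len=64)  # (14, 64)
```

The decoder then sees one tensor whose time axis is already consistent, so no attention-based alignment is needed.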
Hi, I observed that the range of the spectrogram saved in the npy file is -0.2 ~ 0.8. I am wondering why you normalize the spectrogram into this range. For what reason?
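Such a range usually comes from a linear min-max mapping of a clipped dB-scale spectrogram into a fixed interval. A sketch of that kind of normalization, with assumed dB bounds (not necessarily the repo's exact values):

```python
import numpy as np

def normalize_spec(spec_db, min_db=-100.0, max_db=20.0, tgt_min=-0.2, tgt_max=0.8):
    # clip the dB-scale spectrogram, then map it linearly into [tgt_min, tgt_max]
    spec_db = np.clip(spec_db, min_db, max_db)
    return (spec_db - min_db) / (max_db - min_db) * (tgt_max - tgt_min) + tgt_min
```

Under this mapping the quietest bins land at -0.2 and the loudest at 0.8, which keeps the network's input/output in a small, roughly zero-centered range.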
Hi Zhang, could you please explain how the text encoder output and the recognition encoder output are aligned? It is stated in your paper that "The recognition encoder Er is a seq2seq...
Hi, in your code, why did you not apply pre-emphasis to the wav before extracting the mel spectrogram?
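For reference, pre-emphasis is a one-tap high-pass filter applied to the raw waveform before STFT analysis. A minimal sketch (the 0.97 coefficient is the conventional default, not taken from this repo):

```python
import numpy as np

def preemphasis(wav, coef=0.97):
    # y[0] = x[0]; y[n] = x[n] - coef * x[n-1]  (boosts high frequencies)
    return np.append(wav[0], wav[1:] - coef * wav[:-1])
```

Whether to apply it is a design choice: it flattens the spectral tilt of speech, but some pipelines skip it so the vocoder does not need a matching de-emphasis step at synthesis time.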
Hi, why do you update the learning rate before optimizer.step() in [code](https://github.com/ming024/FastSpeech2/blob/master/model/optimizer.py#L22-L24)? Shouldn't we call optimizer.step() first and then scheduler.step()? Or is there some other consideration?
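In Noam-style scheduling the learning rate is a pure function of an internal step counter, so "set LR, then step the optimizer" and the reverse differ only in whether step N or step N+1's LR is used, which matters little in practice. A minimal sketch of the schedule (d_model and warmup values are assumptions, not the repo's config):

```python
def noam_lr(step, d_model=256, warmup=4000):
    # LR rises roughly linearly for `warmup` steps, then decays as step**-0.5;
    # it depends only on the step count, not on gradients, so the relative
    # order of the LR update and optimizer.step() only shifts it by one step
    return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)
```

This differs from PyTorch's built-in LR schedulers, where calling scheduler.step() before optimizer.step() skips the intended first-epoch LR and triggers a warning.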
Hi, is the pitch/energy normalized within the corpus rather than within each speaker? Would normalizing within each speaker be better?
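Per-speaker normalization would mean z-scoring each value with statistics computed only over that speaker's utterances, which removes inter-speaker differences (e.g. average F0) from the predicted contour. A minimal sketch of the idea:

```python
import numpy as np

def normalize_per_speaker(values, speakers):
    # z-score each pitch/energy value with its own speaker's mean and std
    values = np.asarray(values, dtype=float)
    speakers = np.asarray(speakers)
    out = np.empty_like(values)
    for spk in np.unique(speakers):
        mask = speakers == spk
        mu, sigma = values[mask].mean(), values[mask].std()
        out[mask] = (values[mask] - mu) / (sigma + 1e-8)
    return out
```

With corpus-level normalization, by contrast, a low-pitched speaker's frames all sit below zero, so the model must learn speaker identity and prosody jointly.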
Hi, in the content encoder, you use 1D average pooling to downsample the content representation. The content representation is downsampled by a factor of 8 compared with the...
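For concreteness, downsampling by a factor of 8 via 1D average pooling just averages every 8 consecutive frames. A minimal numpy sketch of that operation:

```python
import numpy as np

def avg_pool1d(x, factor=8):
    # x: (channels, length); truncate to a multiple of `factor`,
    # then average every `factor` consecutive frames
    c, t = x.shape
    x = x[:, : t - t % factor]
    return x.reshape(c, -1, factor).mean(axis=2)

x = np.arange(16, dtype=float).reshape(1, 16)
y = avg_pool1d(x, factor=8)  # (1, 2): each output frame averages 8 inputs
```

This keeps only slowly-varying information in the content code, which is often the point: fine-grained pitch/rhythm detail is discarded so it must be carried by the other codes.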
I am using low-rank adaptation (LoRA) to train my Keras model. The base model is frozen and only the LoRA linear layers are trained. I want to save the trainable LoRA-related...
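One way to save only the adapter weights is to iterate over the model's `trainable_weights` (the standard Keras attribute; with the base model frozen, that list contains exactly the LoRA variables) and dump them by name. A minimal sketch with hypothetical helper names:

```python
import numpy as np

def save_lora_weights(model, path):
    # keep only the trainable (LoRA) variables; the frozen base
    # model is not written out at all
    np.savez(path, **{w.name: w.numpy() for w in model.trainable_weights})

def load_lora_weights(model, path):
    # restore the LoRA variables by name into an already-built model
    data = np.load(path)
    for w in model.trainable_weights:
        w.assign(data[w.name])
```

Loading assumes the model was rebuilt with the same LoRA layer names; a mismatch raises a KeyError, which is usually preferable to silently loading the wrong tensor.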
Hi, I am adding your MDN prosody modeling code segment to my Tacotron, but I encountered several problems with it. First, the prosody loss is...
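For context, an MDN prosody loss is typically the negative log-likelihood of the target prosody value under the predicted Gaussian mixture, computed with log-sum-exp for numerical stability. A minimal numpy sketch of that loss (not the authors' exact code; assumes one-dimensional targets):

```python
import numpy as np

def mdn_nll(pi_logits, mu, log_sigma, target):
    # pi_logits, mu, log_sigma: (batch, n_components); target: (batch,)
    log_pi = pi_logits - np.log(np.exp(pi_logits).sum(-1, keepdims=True))
    log_prob = (
        -0.5 * ((target[:, None] - mu) / np.exp(log_sigma)) ** 2
        - log_sigma
        - 0.5 * np.log(2 * np.pi)
    )
    joint = log_pi + log_prob          # (batch, n_components)
    m = joint.max(-1, keepdims=True)   # log-sum-exp over components
    return -(m[:, 0] + np.log(np.exp(joint - m).sum(-1))).mean()
```

Predicting log_sigma (rather than sigma) and working entirely in log space avoids the underflow that makes naive MDN losses return NaN early in training.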