Results 7 comments of alisakgg

"upsample_rates": [2,5,4,4], "upsample_kernel_sizes": [16,15,4,4], "upsample_initial_channel": 512, "resblock_kernel_sizes": [3,7,11], "resblock_dilation_sizes": [[1,3,5], [1,3,5], [1,3,5]], "resblock_initial_channel": 256, "segment_size": 5120, "num_mels": 80, "num_freq": 512, "n_fft": 512, "hop_size": 160, "win_size": 512, "sampling_rate": 16000, i use...

![image](https://user-images.githubusercontent.com/37888350/53148691-8284c400-35e6-11e9-9d4a-da4843ba2bfa.png) My training with train1.py got stuck here. Does anyone know how to solve it?

Run the command `python train2 -ckpt which-model`.

A Mandarin multi-speaker dataset was used for pretraining; a different Chinese speaker was used for finetuning.

I noticed that only the decoder and the speaker embeddings have gradients during finetuning. Shouldn't the decoder weights have no grad except for the conditional layer norm?
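The freezing scheme described here, where everything except the speaker embedding and the conditional layer-norm parameters stays frozen during finetuning, can be sketched by filtering on parameter names. The names below are hypothetical placeholders, not this repo's actual module names; with PyTorch you would apply the same predicate over `model.named_parameters()` and call `p.requires_grad_(...)`:

```python
# Parameter-name substrings assumed (hypothetically) to mark the trainable parts:
# the speaker embedding and the conditional layer-norm layers in the decoder.
TRAINABLE_KEYS = ("speaker_emb", "cond_layer_norm")

def is_trainable(name: str) -> bool:
    """Return True if this parameter should keep gradients during finetuning."""
    return any(key in name for key in TRAINABLE_KEYS)

# Illustrative parameter names standing in for model.named_parameters().
param_names = [
    "encoder.attn.weight",
    "decoder.conv.weight",
    "decoder.cond_layer_norm.scale",
    "decoder.cond_layer_norm.bias",
    "speaker_emb.weight",
]

frozen = [n for n in param_names if not is_trainable(n)]
trainable = [n for n in param_names if is_trainable(n)]
```

In a real PyTorch model the loop would be `for name, p in model.named_parameters(): p.requires_grad_(is_trainable(name))`, so the optimizer only updates the speaker embedding and the conditional layer-norm weights.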

> Do you set num_speaker in the model config equal to the number of speakers in the Mandarin dataset in the pretrain stage?

Yes, I use the default config `"num_speaker": 955`. There are 30...

> You have to change the default config "num_speaker" to 30 (in your case) in the pretrain stage. When finetuning, just set your speaker_id = 0.

OK, I...