zhouyong64

Results 8 issues of zhouyong64

The CER calculation seems wrong. For example, with target string 'a' and prediction string 'b', I think the CER should be 100%. Instead, the current code outputs 0. func = SequenceError() local...
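For reference, here is a minimal sketch of how CER is commonly defined: the character-level edit distance divided by the target length. It is independent of `SequenceError` and does not reproduce that class's code; the function names `edit_distance` and `cer` are hypothetical and chosen only for illustration.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance between ref[:i] and the empty string
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # drop a character of ref
            insertion = curr[j - 1] + 1  # add a character of hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def cer(target: str, prediction: str) -> float:
    """Character error rate: edit distance normalized by target length."""
    if not target:
        raise ValueError("target must be non-empty")
    return edit_distance(target, prediction) / len(target)


if __name__ == "__main__":
    print(cer("a", "b"))  # one substitution over one target character
```

Under this definition, a target of 'a' and a prediction of 'b' need one substitution over a target of length one, so the CER is 1.0, i.e. 100%.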

The pretrained Mandarin acoustic model produces pretty good alignments. Which dataset was used to train it? I tried training a new acoustic model with a Mandarin dataset of...

NSVB is trained on PopBuTFy with 34 speakers. Even with the 30-hour internal singing data that the paper describes for the training of Stage1, I doubt that this...

It would be very helpful to have an API for returning output from intermediate layers, for example, the one before the final layers. This output can be used in other...

It seems the pretrained vocoder model is missing.

I trained with LibriTTS 100 and LibriTTS 360 for 1.5 days and the model still can't output audible speech. I'm not sure if it's because of the small dataset size...

With Encodec, the current code only supports 24 kHz audio. So when training CoarseTransformer and FineTransformer, does the input waveform data also need to be 24 kHz?

There is a 'train-stage' option in trainer.py. In egs/libritts, there are two training procedures with different 'train-stage' options. Which one gives better synthesis results?