Ray issues

Results: 4 issues by Ray

The difference between semantic tokens from HuBERT, wav2vec2 and Whisper

Thanks for your great work on WhisperSpeech! It is very interesting to extract semantic tokens from the Whisper encoder, and use semantic tokens to generate acoustic tokens. I am a...

Audio format in dataset files

Thanks for your great work on implementing FACodec! I found that the data file at https://github.com/Plachtaa/FAcodec/blob/master/data/val.txt has some labels, such as speaker ID and phonemes. How can I get these labels? Will these...

[Bug] loss is NaN when fine-tuning XTTS_v2!

### Describe the bug

When fine-tuning XTTS_v2 on LJSpeech following the script (recipes/ljspeech/xtts_v2/train_gpt_xtts.py), the loss is always NaN! So terrible. I tried reducing the learning rate, changing the batch...

bug

Where is the UniCATS testset-B?

I found that X-LANCE's work [1, 2] also uses UniCATS testset-B as the test set. It might be a good idea to open-source UniCATS testset-B for reproducibility. [1]...