Pasha S issues

Results: 6 issues of Pasha S

Great research! When is the code going to be available?

Training VALL-E

In the given VALL-E example, only a text prefix is given, but in the VALL-E paper we also need to pass the 3-second audio prompt as a prefix along with the text, right?...

Why is the vocab size 1088?

The model only predicts 1026 values, including the codebook entries and the special tokens bos and eos. Can someone please clarify this?

Training and Inference Code

Coming from the arXiv website. This paper is super cool imo. Would love to train this model for my use case. Are you planning to release the training and inference code?

Is romanization absolutely necessary?

Can we train a GPT model using text in the same language if we have audio transcriptions in that language?

Training Code!

Great research! I'm really interested in learning more about the training process. Do you have any plans to open-source the training code for the audio tokenizer and transformer? I'd love...