Haoran Bai
Hi creator, thanks for your amazing work implementing the DiffWave model. I have a general question about fine-tuning DiffWave. If I use only one prompt voice to fine-tune...
The required spectrogram format is [N,C,W], as in the example line "spectrogram = # get your hands on a spectrogram in [N,C,W] format". Could you please explain these three dimensions? I use the code...
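For context on the question, here is a minimal sketch of one common reading of [N,C,W]. It assumes N is the batch size, C is the number of mel channels, and W is the number of time frames. That reading is a convention, not something confirmed in this thread. The helper name `spectrogram_shape` and the defaults (hop length 256, 80 mel bands) are hypothetical and chosen only for illustration.

```python
def spectrogram_shape(num_samples, hop_length=256, n_mels=80, batch_size=1):
    """Return an (N, C, W) shape tuple for a batch of mel spectrograms.

    Hypothetical helper: the defaults are illustrative assumptions,
    not values taken from the DiffWave repository.
    """
    # W: frame count of a centered STFT, one frame per hop plus the first frame.
    num_frames = num_samples // hop_length + 1
    # N = batch size, C = mel channels, W = time frames (assumed meaning).
    return (batch_size, n_mels, num_frames)


# One second of audio at 22050 Hz, a single clip in the batch.
n, c, w = spectrogram_shape(num_samples=22050)
print(n, c, w)
```

Under these assumptions, a single recording would be passed with N = 1, and only W changes with the length of the audio.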
How do I convert a WAV voice file into an .npz prompt? I recorded my voice and want to use it to read the text.