yxlllc
RVC and So-Vits-Svc are similar end-to-end architectures. In fact, the spectrum is not explicitly generated during the conversion process, although HifiGAN is used (the input is a 192-dimensional hidden space...
I guess the reason is that the RVC pre-trained model was trained on the VCTK dataset, which has a sampling rate of 48 kHz.
We have tried it, but training BigVGAN is somewhat difficult and the improvement is not obvious.
We actually have some experimental models, but the performance improvements have not met expectations. A key point may be that the performance of the pre-trained ContentVec (hubert_base.pt) limits the final upper...
The wrong model was loaded.
Check whether the configuration file or pre-trained model you are using is correct.
Check whether the pip command is being run in the correct directory.
In my tests, MME is the most stable driver; the others behave very unpredictably, which may be a problem with the sounddevice library.
If you use a pre-trained model, a few hours of training will be enough.
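If you want to check which devices are exposed through MME before picking one, something like the following may help. It is only a rough sketch: `devices_for_hostapi` and `query_live_devices` are hypothetical helper names, not part of any project mentioned here, and it assumes the `sounddevice` package is installed on a Windows machine (MME does not exist on other platforms). The only `sounddevice` calls used are `query_devices()` and `query_hostapis()`.

```python
def devices_for_hostapi(devices, hostapis, name="MME"):
    """Return (index, device) pairs whose host API is called `name`.

    `devices` and `hostapis` are sequences of dicts shaped like the ones
    returned by sounddevice.query_devices() and sounddevice.query_hostapis():
    each device has a "hostapi" index, each host API has a "name".
    """
    wanted = {i for i, api in enumerate(hostapis) if api["name"] == name}
    return [(i, dev) for i, dev in enumerate(devices) if dev["hostapi"] in wanted]


def query_live_devices(name="MME"):
    """Hypothetical convenience wrapper: filter the devices on this machine."""
    import sounddevice as sd  # imported lazily; needs PortAudio to be available

    return devices_for_hostapi(sd.query_devices(), sd.query_hostapis(), name)
```

Calling `query_live_devices()` returns a list of `(index, device)` pairs; the index is the device ID that `sounddevice` accepts when opening a stream. An empty list would mean that no host API with that exact name is available on the system.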