Grad-SVC
Diffusion Singing Voice Conversion based on Grad-TTS from Huawei
As the title says, I would like to ask how the Speaker Encoder was trained. Is there any reference code?
Question about the electronic-sound artifact
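I would like to ask about the acoustic model that comes before the diffusion model, that is, the HuBERT-to-mel stage. When the mel spectrogram from this stage is sent directly to the vocoder, why is there an electronic-sound artifact? In principle, HuBERT already contains enough information, so why does the generated mel spectrogram still have so many parallel formants? Has the author tried replacing HuBERT with WavLM?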
Hi, can you explain why diffusion training is skipped before the configured fast_epochs? And how many epochs does diffusion training need? Thanks!
Amount of training data?
I would like to ask how much data was used to train the pretrained model.
Does SVS work with English lyrics?
Hi @MaxMax2016, thank you for this wonderful project. What is the advantage of Grad-SVC compared to So-VITS-SVC?
Thanks for noticing Better Diffusion Modeling Technology. Recently, Xue et al. proposed [Multi-GradSpeech](https://arxiv.org/abs/2308.10428), which uses a Consistent Diffusion Model as the generative network and outperforms Grad-TTS in both single- and multi-speaker scenarios,...
There are a variety of versions of Grad-SVC (V3 CFM, V3 CFM RoPE, V2 96, etc.), but which is the best version?
Thank you for the amazing open-source project. However, I am facing an issue during inference, so I have two questions. 1. Is it necessary to separate the vocals and...