Can this repo clone the original voice and generate an audio file in the speaker's voice?
I'm a little confused about what this repo can actually do. The README says it can clone a voice, but in the issues I found that it can only clone the speaker's tone color, and I don't know exactly what that means.
Yes, it can clone voices. What this repo means by tone color is that emotion and volume aren't really converted; what gets converted is the actual style of the voice. The codebase works in two stages: a TTS model generates the speech, and a voice converter then makes it sound like the speaker you want to clone. Emotion and volume are controlled by the TTS model, and you can actually swap that model out.
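To make the two stages concrete, here is a toy sketch of the idea. It is not this repo's API: `Utterance`, `base_tts`, `convert_tone_color`, and `clone_voice` are hypothetical names made up for illustration, and no audio is produced. It only shows which stage owns which property, assuming the split described above.

```python
from dataclasses import dataclass, replace


# Hypothetical stand-in for a generated audio clip (not a real repo type).
@dataclass(frozen=True)
class Utterance:
    text: str
    emotion: str      # set by the TTS stage
    volume: float     # set by the TTS stage
    tone_color: str   # the only thing the converter stage changes


def base_tts(text, emotion="neutral", volume=1.0):
    """Stage 1 (hypothetical): the TTS model speaks in its own base voice."""
    return Utterance(text=text, emotion=emotion, volume=volume,
                     tone_color="base_speaker")


def convert_tone_color(utterance, target_speaker):
    """Stage 2 (hypothetical): swap the tone color, leave everything else."""
    return replace(utterance, tone_color=target_speaker)


def clone_voice(text, target_speaker, tts=base_tts, **style):
    """Run both stages; `tts` can be swapped for any other TTS function."""
    return convert_tone_color(tts(text, **style), target_speaker)


result = clone_voice("Hello there", "my_speaker", emotion="cheerful", volume=0.8)
print(result)
```

Because `tts` is just a parameter here, replacing the TTS stage changes emotion and volume handling without touching the conversion step, which is the swap mentioned above.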
Thanks for your reply, @johnwick123f.
Could you please tell me the name of the model that clones the voice? The ToneColorConvertor model generates audio in a completely different voice.