FireRedTTS
About Word Error
Thank you for the paper and code.
Voice cloning is excellent; however, for text-to-speech I'm getting word errors. Here is an example:
Transcript: "This comprehensive set of behaviors allows for a human-like speech synthesis system, capable of mimicking the complex vocal patterns observed in natural human conversations."
Language: en
Output: audio output
Any ideas on the cause, or plans to fix this?
At first I burst out laughing because the output said 'comprehensive set of pigs' ><. But yeah, I've run into the same problem with many sentences. Maybe they'll fix it in the future, but for now I have no idea ¯\_(ツ)_/¯