Kokoro-TTS should be default speech model

Open Omni-NexusAI opened this issue 7 months ago • 6 comments

The current TTS implementation is not very good and sounds very robotic. Kokoro TTS offers multiple voice options and, at around the same base model size, delivers much higher speech quality and speed than the other models.

Omni-NexusAI avatar Jun 11 '25 16:06 Omni-NexusAI

Kokoro has already been tested and is now on the roadmap.

frdel avatar Jun 11 '25 16:06 frdel

That's good, will keep a lookout for future updates.

Omni-NexusAI avatar Jun 11 '25 17:06 Omni-NexusAI

Chatterbox > Kokoro :)

pmb2 avatar Jun 15 '25 15:06 pmb2

> Chatterbox > Kokoro :)

Better in quality, yes, but not in speed, and even the quality gap is small. For its model size, Kokoro is lightning fast, almost instantaneous. Its output will be ready by the time the LLM even responds, and agents need that kind of speed. The quality is good enough given the speed Kokoro delivers.

Omni-NexusAI avatar Jun 15 '25 19:06 Omni-NexusAI

I would opt for open, compatible URL endpoints for TTS and STT, so people have multiple options for what they want to run.
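
A rough sketch of the idea, assuming TTS and STT are each resolved from a user-editable setting rather than hard-wired to one model. The `SpeechEndpoint` class, the `SETTINGS` dictionary, and the hosts and routes below are all hypothetical and only illustrate the shape of such a config; none of them come from Agent Zero.

```python
from dataclasses import dataclass
from urllib.parse import urljoin


@dataclass(frozen=True)
class SpeechEndpoint:
    """User-configurable location of a speech service (hypothetical settings shape)."""
    base_url: str  # server the user runs or points at
    path: str      # route on that server, chosen by the user

    def url(self) -> str:
        # Normalise slashes so base and path join predictably either way.
        return urljoin(self.base_url.rstrip("/") + "/", self.path.lstrip("/"))


# Illustrative values only; these hosts and routes are assumptions.
SETTINGS = {
    "tts": SpeechEndpoint("http://localhost:8880", "/v1/audio/speech"),
    "stt": SpeechEndpoint("http://localhost:9000/", "v1/audio/transcriptions"),
}


def endpoint_for(kind: str) -> str:
    """Return the configured URL for 'tts' or 'stt', or raise if none is set."""
    try:
        return SETTINGS[kind].url()
    except KeyError:
        raise ValueError(f"no endpoint configured for {kind!r}") from None
```

With a layout like this, swapping Kokoro for Chatterbox (or anything else served over HTTP) would be a settings change rather than a code change.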

netixc avatar Jun 16 '25 10:06 netixc

I have the Kokoro integration finished -- just needs a bit of testing.

TerminallyLazy avatar Jun 29 '25 21:06 TerminallyLazy

I've been messing around with A0 for the past couple of days with the new Kokoro integration. It's good, but a couple of things are missing. There doesn't seem to be a way to select among the different voices Kokoro offers. The model has 30+ pre-trained voices across different languages and accents, and you should be able to choose which one you want rather than being limited to the default. Another great addition would be the ability to assign different voices to different agents, to give each one more of a personality. This could be added as a quality-of-life improvement in the next update.
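
A minimal sketch of the per-agent voice idea, assuming a simple name-to-voice mapping with a fallback to the default. The agent names, the `AGENT_VOICES` table, the `voice_for` helper, and the voice IDs are illustrative placeholders, not the integration's actual settings or confirmed Kokoro voice names.

```python
from typing import Optional, Set

# Assumed default voice ID; substitute whatever the integration ships with.
DEFAULT_VOICE = "af_heart"

# Hypothetical per-agent overrides, keyed by agent name.
AGENT_VOICES = {
    "researcher": "bm_george",
    "coder": "am_adam",
}


def voice_for(agent_name: Optional[str], available: Set[str]) -> str:
    """Pick the voice for an agent, falling back to the default.

    `available` is the set of voice IDs the installed model actually
    provides, so a stale or mistyped override cannot select a missing voice.
    """
    requested = AGENT_VOICES.get(agent_name or "", DEFAULT_VOICE)
    return requested if requested in available else DEFAULT_VOICE
```

For example, `voice_for("researcher", {"af_heart", "bm_george"})` picks the override, while an agent with no entry, or one whose voice is not installed, gets the default.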

Omni-NexusAI avatar Jul 24 '25 13:07 Omni-NexusAI