agents Changing tts voice during the conversation

Hello, I want to change the TTS voice during the conversation (with a function call), but the VoiceAssistant cannot update the TTS voice while it is running. Does anyone have an idea how to achieve that?

Sep 28 '24 18:09 olivvein

I think the starting point would be to close the AgentOutput on the VoiceAssistant and then create a new one as a result of the function call.

Oct 01 '24 09:10 jezell

How do I close the agent output? I tried closing the assistant so I could create a new one, but I cannot close the old one.

Oct 01 '24 13:10 olivvein

There is an AgentOutput class that gets instantiated by the agent and manages the playback. I believe you'd have to fork the agent, but if you close the AgentOutput and create a new one, it looks like you can probably change the voice. At least, that would likely be the best place to start from what I can see.

Oct 02 '24 15:10 jezell

I did something very similar with a modified TTS class. I ask the LLM to include emotions and speed changes during its response generation and then update the tts using the before_tts_cb callback.

For example:

if isinstance(text, str):
    text, new_opts = parse_options(text)
    if new_opts:
        agent._tts.update_opts(new_opts)
    return text

For streaming output, I have to accumulate the text chunks until an end marker is present:

async for chunk in text:
    buffer += chunk
    if buffer.count("</options>") == 1 and not parsed:
        buffer, new_opts = parse_options(buffer, pronunciation)
        if new_opts:
            agent._tts.update_opts(new_opts)
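To make the buffering idea above concrete, here is a minimal self-contained sketch. It is not the code from my project: the `<options>{...}</options>` JSON format, this `parse_options` implementation, and the `strip_options` helper are all assumptions made up for illustration. It holds text back until the options block is complete, hands the parsed options to a callback, and passes the remaining chunks through unchanged.

```python
import asyncio
import json
import re
from typing import AsyncIterable, AsyncIterator, Callable, Optional, Tuple

# Assumed tag format: the LLM is prompted to begin its reply with
# <options>{"speed": "fast"}</options> followed by the text to speak.
TAG = "<options>"
OPTIONS_RE = re.compile(r"<options>(.*?)</options>", re.DOTALL)


def parse_options(text: str) -> Tuple[str, Optional[dict]]:
    """Hypothetical parser: strip the first options block, return (text, options)."""
    match = OPTIONS_RE.search(text)
    if match is None:
        return text, None
    try:
        opts = json.loads(match.group(1))
    except json.JSONDecodeError:
        opts = None  # malformed block: drop it and keep the current settings
    clean = text[: match.start()] + text[match.end():]
    return clean.lstrip(), opts


async def strip_options(
    chunks: AsyncIterable[str],
    on_options: Callable[[dict], None],
) -> AsyncIterator[str]:
    """Buffer chunks until the options block is complete, then stream the rest."""
    buffer = ""
    parsed = False
    async for chunk in chunks:
        if parsed:
            yield chunk
            continue
        buffer += chunk
        if "</options>" in buffer:
            buffer, opts = parse_options(buffer)
            if opts:
                on_options(opts)
        elif TAG.startswith(buffer.lstrip()[: len(TAG)]):
            continue  # may still be the start of an options block: keep waiting
        # Either the block was consumed or the reply has no block at all.
        parsed = True
        if buffer:
            yield buffer
        buffer = ""
    if buffer:
        yield buffer  # stream ended inside an unterminated block
```

In a before_tts_cb callback, `on_options` would be whatever applies the settings to the TTS (in my case `agent._tts.update_opts`), and the generator returned by `strip_options(text, ...)` would be handed back in place of the original stream.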

Oct 11 '24 15:10 ChenghaoMou

We are going to make updates cleaner in the framework. Here's how I'm doing it for the voice assistant we built with Cartesia: https://gist.github.com/davidzhao/5738f0e2d434dea6e5224262ee5c3cfa

Oct 11 '24 16:10 davidzhao

Hello, do you know how to save the speech and TTS results to files while chatting with the agent?

Oct 15 '24 08:10 BaiMoHan

Added update_options to most TTS in this PR
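For the original question (switching the voice from a function call), the tool body could look roughly like the sketch below. Treat it as an illustration under assumptions: `FakeTTS`, `VOICES`, and `change_voice` are hypothetical names, and the `voice` keyword is assumed; the parameters that `update_options` actually accepts differ between TTS plugins, so check the one you use.

```python
from dataclasses import dataclass
from typing import Protocol


class SupportsUpdateOptions(Protocol):
    """Anything exposing update_options; the `voice` keyword is an assumption."""

    def update_options(self, *, voice: str) -> None: ...


@dataclass
class FakeTTS:
    """Stand-in for a real TTS plugin instance, only here so the sketch runs."""

    voice: str = "default"

    def update_options(self, *, voice: str) -> None:
        self.voice = voice


# Hypothetical mapping from names the LLM may use to provider voice IDs.
VOICES = {"narrator": "voice-id-1", "assistant": "voice-id-2"}


def change_voice(tts: SupportsUpdateOptions, name: str) -> str:
    """Body of a function-call tool: switch the voice and report back to the LLM."""
    voice_id = VOICES.get(name.lower())
    if voice_id is None:
        return f"Unknown voice '{name}'. Available: {', '.join(sorted(VOICES))}"
    tts.update_options(voice=voice_id)
    return f"Voice changed to {name}"
```

Returning a short status string lets the LLM confirm the change to the user or recover when it asks for a voice that does not exist.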

Oct 15 '24 22:10 theomonnom