tts,stt: add FallbackAdapter

Open nbsp opened this issue 1 year ago • 4 comments

usage:

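from livekit.agents import tts
from livekit.plugins import cartesia, openai

# providers are tried in the order they are listed
text_to_speech = tts.FallbackAdapter(
    openai.TTS(),
    cartesia.TTS(),
    # any others that you may want...
    cooldown=30,  # seconds
)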

notes:

  • [TTS] requires the sample rate and channel count to be identical across all TTS providers; the adapter won't initialize otherwise
  • only sets capabilities.streaming to true if every provider supports streaming
  • ~~doubles as a timeout adapter, if you only set one provider~~ (update: we've decided providers should handle their own timeouts)
  • if cooldown > 0, a failed provider is reactivated after cooldown seconds
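
the notes above can be sketched roughly as follows. this is a minimal, self-contained illustration and not the code in this PR: `Provider`, `FallbackSketch`, `failed_at` and the injectable `clock` are hypothetical names, and treating `cooldown <= 0` as "never reactivate" is an assumption read off the last note.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Provider:
    # hypothetical stand-in for a TTS provider; not a livekit class
    name: str
    sample_rate: int
    num_channels: int
    streaming: bool
    synthesize: Callable[[str], bytes]
    failed_at: Optional[float] = None  # clock reading at the last failure


class FallbackSketch:
    def __init__(self, *providers: Provider, cooldown: float = 30,
                 clock: Callable[[], float] = time.monotonic):
        if not providers:
            raise ValueError("at least one provider is required")
        # refuse to initialize when audio formats differ between providers
        formats = {(p.sample_rate, p.num_channels) for p in providers}
        if len(formats) != 1:
            raise ValueError("providers must share sample rate and channel count")
        self.providers = list(providers)
        self.cooldown = cooldown
        self.clock = clock
        # streaming is only advertised when every provider supports it
        self.streaming = all(p.streaming for p in providers)

    def _available(self, p: Provider) -> bool:
        if p.failed_at is None:
            return True
        # assumption: with cooldown <= 0 a failed provider stays disabled
        return self.cooldown > 0 and self.clock() - p.failed_at >= self.cooldown

    def synthesize(self, text: str) -> bytes:
        # try providers in order, skipping those still cooling down
        for p in self.providers:
            if not self._available(p):
                continue
            try:
                audio = p.synthesize(text)
            except Exception:
                p.failed_at = self.clock()  # start this provider's cooldown
                continue
            p.failed_at = None
            return audio
        raise RuntimeError("all providers failed or are cooling down")
```

passing a fake `clock` makes the cooldown behaviour easy to exercise in tests without sleeping.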

issues:

  • [x] ~~doesn't work with openai for some reason? it publishes data fine but then after it's done RTC complains about mismatched audio sources~~
  • [x] TTS streaming throws an error after it's done (see note in source)
  • [x] i haven't actually checked that it switches when a provider fails, yet. dunno how to make that happen
    • [ ] switching tested and working between transcriptions; still haven't tested what happens mid-transcription
  • [x] ~~timeouts are currently not preset to anything, need to figure out what a good threshold is~~ set to 5s/120s to match our OpenAI timeout
  • [x] STT FallbackAdapter untested with plain recognize, will test later, but streaming works great
  • [x] need to remember to write tests when it's ready

feedback, nits, etc appreciated

nbsp avatar Aug 07 '24 03:08 nbsp

🦋 Changeset detected

Latest commit: 09638611580220ff57a756de1e5a98c389ee7854

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name              Type
livekit-agents    Minor

changeset-bot[bot] avatar Aug 07 '24 03:08 changeset-bot[bot]

this is almost ready to review, with one bug left to squash: with unlucky enough timing, a new sentence can be pushed to the stream just as the timeout hits 0, and the stream is then treated as timed out even though it is still active. need to figure out a way to refresh the timer
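
one possible way to refresh the timer, sketched under assumptions: keep an absolute deadline and push it forward on every bit of stream activity, instead of counting a fixed timeout down from the start. `IdleDeadline` and its methods are hypothetical names, not part of this PR or of livekit.

```python
import time
from typing import Callable


class IdleDeadline:
    """Tracks an idle timeout that is pushed forward on every activity."""

    def __init__(self, timeout: float,
                 clock: Callable[[], float] = time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.deadline = clock() + timeout

    def refresh(self) -> None:
        # call whenever a sentence is pushed or audio is received
        self.deadline = self.clock() + self.timeout

    def expired(self) -> bool:
        # true only if nothing has refreshed the deadline for `timeout` seconds
        return self.clock() >= self.deadline
```

the caller would check `expired()` before declaring a timeout, so a sentence pushed just before the old deadline keeps the stream alive.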

nbsp avatar Aug 12 '24 22:08 nbsp

the FallbackAdapter implementation should now be complete. it can be merged once streams handle their own timeouts, which i'll do in a separate PR

nbsp avatar Aug 16 '24 17:08 nbsp

Awesome. Cartesia went down during a user test yesterday, so I'm excited for this!

bradyneal avatar Oct 24 '24 13:10 bradyneal

Any update on this?

bradyneal avatar Nov 13 '24 21:11 bradyneal

we're now working on this over here: https://github.com/livekit/agents/pull/1074

nbsp avatar Nov 13 '24 21:11 nbsp