examples: added trigger-phrase agent example
⚠️ No Changeset found
Latest commit: e09049f7a9a38f9284bf5c0d050d89752f5b337d
Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.
- it's a bit slow. worth looking into
I think it is mainly due to the 0.5 sec timeout set for the VAD, and maybe partly due to the computation that needs to happen on every END_OF_SPEECH event. I am not sure of the best way to address either of them, though. Since the primary goal of this example is to show users a way to use transcribed words to trigger the LLM, I didn't go down the path of minimizing latency the way VoiceAssistant does.
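For readers skimming the thread, the core idea (using transcribed words to decide when to call the LLM) can be sketched roughly as below. This is a minimal, library-independent illustration and not the code in this PR: the `TRIGGER_PHRASE` value and the `extract_prompt` helper are hypothetical names, and the transcript is assumed to arrive as a plain string from the STT.

```python
import re
from typing import Optional

# Hypothetical trigger phrase; the phrase used by the actual example may differ.
TRIGGER_PHRASE = "hey assistant"


def extract_prompt(transcript: str, trigger: str = TRIGGER_PHRASE) -> Optional[str]:
    """Return the words spoken after the trigger phrase, or None if it is absent."""
    # Lowercase and replace punctuation (except apostrophes) with spaces so that
    # "Hey, Assistant!" still matches "hey assistant".
    words = re.sub(r"[^\w\s']", " ", transcript.lower()).split()
    trigger_words = trigger.lower().split()
    n = len(trigger_words)

    # Slide a window over the transcript looking for the trigger words in order.
    for i in range(len(words) - n + 1):
        if words[i:i + n] == trigger_words:
            return " ".join(words[i + n:])
    return None
```

A caller would forward the returned text to the LLM only when the result is not `None`. An empty string means the trigger phrase was heard with nothing after it, which a caller might treat as "wait for the next transcript".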
- semantically this should probably be inside the `voice_assistant` examples directory
Even though this is technically a voice assistant, we are not using the VoiceAssistant class, so I feel it would be confusing and counterintuitive to users if we placed it in that directory. That is why I went with a standalone example directory. What do you think?
I think it is mainly due to the 0.5 sec timeout set for the VAD, and maybe partly due to the computation that needs to happen on every END_OF_SPEECH event.
in my testing i encountered closer to three or sometimes four seconds of silence before the response started playing. this doesn't need to be fully optimized as an example, but at this point it is hurting the effectiveness of the demo.
re: directory, disregard; did not notice this doesn't actually use VoicePipelineAgent.
- STT transcription is now added ✅
- VAD is removed, both due to issues with adding StreamAdapter to Deepgram and also hopefully to reduce latency
- first_participant constraint removed
@s-hamdananwar this is how I was able to manage "multiple" participants in a single raise hand queue, check out the PR and let me know if this can help resolve the issue of still only listening to the first participant that joins the room.
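As a rough illustration of the general idea only (this is not the code from the PR mentioned above), a single raise-hand queue shared by several participants could look like the sketch below. The `SpeakerQueue` class and its method names are hypothetical, and participants are assumed to be identified by a unique string identity.

```python
from collections import deque
from typing import Deque, Optional


class SpeakerQueue:
    """FIFO queue of participant identities that have raised their hand."""

    def __init__(self) -> None:
        self._queue: Deque[str] = deque()

    def raise_hand(self, identity: str) -> None:
        # Ignore repeated requests so a participant holds only one slot.
        if identity not in self._queue:
            self._queue.append(identity)

    def lower_hand(self, identity: str) -> None:
        # Remove the participant if queued, e.g. when they leave the room.
        try:
            self._queue.remove(identity)
        except ValueError:
            pass

    def current(self) -> Optional[str]:
        # The participant the agent should currently listen to, if any.
        return self._queue[0] if self._queue else None

    def advance(self) -> Optional[str]:
        # Finish the current speaker's turn and return the next one, if any.
        if self._queue:
            self._queue.popleft()
        return self.current()
```

With something like this, the agent would handle transcripts from whichever identity `current()` returns, rather than from the first participant that joined the room.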