agents min_endpointing_delay parameter doesn't work as expected

Issue Summary

When initializing the VoiceAssistant with min_endpointing_delay=5, the assistant does not wait 5 seconds before sending the transcription to the LLM. This results in multiple requests being sent when the user makes brief pauses during speech, causing slower inference and incomplete responses.

According to the documentation, this parameter is described as:
“Delay to wait before considering the user finished speaking.”

Expected Behavior

The assistant should wait for the duration specified by min_endpointing_delay before sending the transcription to the LLM, assuming the user has finished speaking.
The intended flow should be:

1. User speaks
2. min_endpointing_delay seconds of silence
3. User is considered to have stopped speaking
4. Transcription is sent to the LLM
5. The LLM response is synthesized and returned via TTS
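
To make the expected flow concrete, here is a minimal, self-contained sketch of the endpointing rule I have in mind. It is not livekit-agents code: `split_turns` and its arguments are hypothetical names used only for illustration, and speech is modeled as a list of (start, end) timestamps in seconds. A turn is committed (i.e. would be sent to the LLM) only once the silence after it reaches min_endpointing_delay.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) of detected speech, in seconds


def split_turns(
    segments: List[Segment], min_endpointing_delay: float
) -> List[Tuple[float, List[Segment]]]:
    """Group speech segments into turns (hypothetical helper, not a library API).

    Segments must be sorted and non-overlapping. A pause shorter than
    min_endpointing_delay keeps the turn open. Each result is
    (commit_time, segments_in_turn), where commit_time is the end of the
    last segment plus the delay.
    """
    turns: List[Tuple[float, List[Segment]]] = []
    current: List[Segment] = []
    for start, end in segments:
        if current and start - current[-1][1] >= min_endpointing_delay:
            # Silence was long enough: the previous turn was committed
            # min_endpointing_delay seconds after its last segment ended.
            turns.append((current[-1][1] + min_endpointing_delay, current))
            current = []
        current.append((start, end))
    if current:
        # The last turn is committed after the final stretch of silence.
        turns.append((current[-1][1] + min_endpointing_delay, current))
    return turns


if __name__ == "__main__":
    speech = [(0.0, 1.0), (1.5, 3.0), (9.0, 10.0)]
    for delay in (0.25, 5.0):
        print(delay, len(split_turns(speech, delay)))
```

With a short delay, every brief pause closes a turn and triggers a separate LLM request, which matches what I am seeing. With a 5 second delay, the 0.5 second pause would stay inside a single turn, which is what I expected.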

If this is not the intended behavior of the min_endpointing_delay parameter, the description should be updated for clarity. Additionally, is there a way to implement this behavior if it isn't currently supported?

Current Behavior

The assistant sends the transcription to the LLM immediately upon the user's first silence, without waiting for the duration specified by min_endpointing_delay.

Steps to Reproduce

1. Initialize the VoiceAssistant with min_endpointing_delay set to a large value (e.g., 10 seconds).
2. Start the assistant.
3. Speak to the assistant.
4. Observe from the logs that the LLM receives the input immediately, without waiting for the specified delay.
5. [Optional] Use the before_llm_cb parameter to log information before the input is sent to the LLM. You will see that this callback is triggered immediately, without waiting the full 10 seconds.

Environment

Python 3.10.14
Packages:
- livekit==0.17.0
- livekit-agents==0.9.1
- livekit-api==0.7.0
- livekit-plugins-azure==0.3.2
- livekit-plugins-deepgram==0.6.7
- livekit-plugins-nltk==0.7.1
- livekit-plugins-openai==0.8.5
- livekit-plugins-silero==0.6.4
- livekit-protocol==0.6.0

Sep 28 '24 10:09 samirsalman

+1. We are hitting this problem too: the agent starts speaking before the human has finished talking (they are thinking, and therefore speaking slowly). It is very annoying, because the agent interrupts and starts speaking during pauses in the middle of an incomplete sentence.

Oct 01 '24 09:10 hari01584

+1

Oct 01 '24 12:10 SimoneFaricelli

@hari01584 have you tried increasing min_endpointing_delay?

Oct 02 '24 05:10 davidzhao

TypeError: VoiceAssistant.__init__() got an unexpected keyword argument 'min_endpointing_delay'

Oct 07 '24 08:10 Aniket-think41

@davidzhao up

Oct 09 '24 13:10 samirsalman

Hey, this was a behavior of preemptive_synthesis, and it is now disabled by default in livekit-agents==0.10.1. You shouldn't see this behavior when preemptive_synthesis=False.

Oct 10 '24 00:10 theomonnom