agents icon indicating copy to clipboard operation
agents copied to clipboard

Sometimes VAD Freezes/Unable to detect speech end

Open hari01584 opened this issue 1 year ago • 16 comments

We are encountering a very weird bug where sometimes in conversation, VAD/Agent cannot detect end of speech.. the usual flow is that user starts speaking and we can get live transcripts for it but when they stop there is no OpenAI call or subsequent agent response (even after waiting for long time), the solution here is to add on something more (user speaks one or two more words and then again stop), then it detects the whole speech segment and now agent replies back.

Very peculiar but annoying bug, Please fix

hari01584 avatar Oct 04 '24 12:10 hari01584

We're experiencing this too.

martin-purplefish avatar Oct 04 '24 17:10 martin-purplefish

it's strange that I still get the user finished speaking events, but the message never comes.

martin-purplefish avatar Oct 04 '24 20:10 martin-purplefish

Indeed, my team is creating automated interviewing solutions and we are rigorously using live kit, there is few bugs popping up but overall this is amazing

hari01584 avatar Oct 04 '24 20:10 hari01584

Hey, I just took a look at the VAD implementation and I'm quite sure it isn't coming from it.

So I think something froze inside the VoicePipelineAgent?

Can you share the debug logs of your agent when this is happening? It would be useful to track where the pipeline hangs.

theomonnom avatar Oct 04 '24 20:10 theomonnom

@theomonnom I'll get some debug logs, but here are some info level ones:

{"message": "Agent state thinking -> speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:32.386400+00:00"}
{"message": "Agent stopped speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:38.647632+00:00"}
{"message": "logging agent message to backend", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:38.648797+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:38.977859+00:00"}
{"message": "participant: rtc.LocalParticipant(sid=PA_QzsLJZ9CtTo3, identity=agent-AJ_6Z4b4mTfh6eR, name=)", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:38.977885+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:38.977901+00:00"}
{"message": "Agent state speaking -> listening", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:38.977914+00:00"}
{"message": "User started speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:39.767192+00:00"}
{"message": "User stopped speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:40.309368+00:00"}
{"message": "Silence timeout", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:46.082361+00:00"}
{"message": "Agent started speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:53.329993+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'speaking'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:53.351021+00:00"}
{"message": "participant: rtc.LocalParticipant(sid=PA_QzsLJZ9CtTo3, identity=agent-AJ_6Z4b4mTfh6eR, name=)", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 23642, "job_id": "AJ_6Z4b4mTfh6eR", "timestamp": "2024-10-04T21:10:53.351101+00:00"}

The user started speaking/agent started speaking are the event handlers (assistant.on('user_started_speaking') etc)

The silence timeout is a 5 second timeout I have that gets called after 5 seconds of silence. After the agent's silence timeout (note it says user stopped speaking), the agent still didn't know about the user's message.

In the agent playground, the message still appears with an ellipses and when I keep talking it completes it above the agent's response (so I see something like:

Me: Hello...
Agent: Sorry are you there?

(from my timeout logic)

Then I keep talking:

Me: Hello. Yes I'm here.
Agent: Sorry are you there?

Then finally the agent gets the full response:

Agent: Sorry are you there?
Me: Hello. Yes I'm here.

martin-purplefish avatar Oct 04 '24 21:10 martin-purplefish

@theomonnom here's another one:

{"message": "synthesizing agent reply", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_transcript": "school graduate.", "speech_id": "312d2586ff54", "elapsed": 2.702, "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:22.675932+00:00"}
{"message": "first LLM token", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "bec352999b1a", "elapsed": 0.443, "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:23.077241+00:00"}
{"message": "first LLM token", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "312d2586ff54", "elapsed": 0.498, "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:23.174693+00:00"}
{"message": "first TTS frame", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "bec352999b1a", "elapsed": 1.097, "streamed": true, "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.176101+00:00"}
{"message": "started playing the first frame", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "bec352999b1a", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.176384+00:00"}
{"message": "Agent started speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.176455+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'speaking'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.196528+00:00"}
{"message": "participant: rtc.LocalParticipant(sid=PA_pHNdhKh2rCu9, identity=agent-AJ_iP27DxV2BajX, name=)", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.196603+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'speaking'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.196625+00:00"}
{"message": "Agent state thinking -> speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.196641+00:00"}
{"message": "first TTS frame", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "312d2586ff54", "elapsed": 1.11, "streamed": true, "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:24.284736+00:00"}
{"message": "Agent stopped speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:28.817733+00:00"}
{"message": "playout finished", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "bec352999b1a", "interrupted": false, "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:28.820274+00:00"}
{"message": "committed agent speech", "level": "DEBUG", "name": "livekit.agents.pipeline", "agent_transcript": " <agent transcirpt>", "interrupted": false, "speech_id": "bec352999b1a", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:29.056301+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:29.056529+00:00"}
{"message": "participant: rtc.LocalParticipant(sid=PA_pHNdhKh2rCu9, identity=agent-AJ_iP27DxV2BajX, name=)", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:29.056566+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:29.056587+00:00"}
{"message": "Agent state speaking -> listening", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:29.056601+00:00"}
{"message": "User started speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:30.661539+00:00"}
{"message": "User stopped speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:31.241238+00:00"}
{"message": "validated agent reply", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "312d2586ff54", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:32.787292+00:00"}
{"message": "committed user transcript", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_transcript": "<commited transcript>", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:32.787785+00:00"}
{"message": "started playing the first frame", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "312d2586ff54", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:32.998715+00:00"}
{"message": "Agent started speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:32.998815+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'speaking'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:33.020321+00:00"}
{"message": "participant: rtc.LocalParticipant(sid=PA_pHNdhKh2rCu9, identity=agent-AJ_iP27DxV2BajX, name=)", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:33.020402+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'speaking'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:33.020434+00:00"}
{"message": "Agent state listening -> speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:33.020458+00:00"}
{"message": "Agent stopped speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:37.759998+00:00"}
{"message": "playout finished", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "312d2586ff54", "interrupted": false, "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:37.762560+00:00"}
{"message": "committed agent speech", "level": "DEBUG", "name": "livekit.agents.pipeline", "agent_transcript": "<committed transcript>", "interrupted": false, "speech_id": "312d2586ff54", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:37.940394+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:37.940600+00:00"}
{"message": "participant: rtc.LocalParticipant(sid=PA_pHNdhKh2rCu9, identity=agent-AJ_iP27DxV2BajX, name=)", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:37.940630+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:37.940649+00:00"}
{"message": "Agent state speaking -> listening", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:37.940663+00:00"}
{"message": "User started speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:39.211851+00:00"}
{"message": "User stopped speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:39.813066+00:00"}
{"message": "validated agent reply", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "456a8f63bf42", "pid": 24696, "job_id": "AJ_iP27DxV2BajX", "timestamp": "2024-10-04T21:41:41.086804+00:00"}

martin-purplefish avatar Oct 04 '24 21:10 martin-purplefish

https://github.com/livekit/agents/blob/ab0f446cb0eebd770f820dade1e168cad1503fdd/livekit-agents/livekit/agents/pipeline/pipeline_agent.py#L911

Could it be here? I feel like that 1.0 should be delay (just poking around)

martin-purplefish avatar Oct 04 '24 22:10 martin-purplefish

I added a bunch of logging - it seems to be getting stuck in _commit_user_question_if_needed. I think because it commits before this check?

{"message": "Committing user transcript", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_transcript": "I got a car.", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:19.326403+00:00"}
{"message": "Agent state listening -> speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:19.534668+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:20.014278+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:20.014572+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:20.515413+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:20.515586+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:21.016198+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:21.016333+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:21.517238+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:21.517434+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:22.017888+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:22.018884+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:22.519314+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:22.519444+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:23.020008+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:23.020445+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:23.520987+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:23.521194+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:24.021886+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:24.022096+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:24.522955+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:24.523102+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:25.024159+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:25.024238+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:25.525787+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:25.525994+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.026618+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.026950+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.527442+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.527633+00:00"}
{"message": "Agent stopped speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.675035+00:00"}
{"message": "playout finished", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "6b4f4f352660", "interrupted": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.675941+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.676034+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.676068+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.676086+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "I got a car.", "interrupted": false, "user_committed": true, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.676145+00:00"}
{"message": "https://app.purplefish.com:443 \"POST /api/engagements/cm1vaq4c5005qta4wtb4gnssw/messages HTTP/11\" 201 None", "level": "DEBUG", "name": "urllib3.connectionpool", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.856574+00:00"}
{"message": "committed agent speech", "level": "DEBUG", "name": "livekit.agents.pipeline", "agent_transcript": "<agent speech>", "interrupted": false, "speech_id": "6b4f4f352660", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.856988+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.857175+00:00"}
{"message": "participant: rtc.LocalParticipant(sid=PA_mSQQdiSUFyHr, identity=agent-AJ_Wk6PhiMXZFnv, name=)", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.857210+00:00"}
{"message": "Participant attributes: {'lk.agent.state': 'listening'}", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.857237+00:00"}
{"message": "Agent state speaking -> listening", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:26.857250+00:00"}
{"message": "Silence timeout", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:32.208893+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:32.210598+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:32.210675+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:32.711407+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:32.711620+00:00"}
{"message": "User started speaking", "level": "INFO", "name": "livekit.agents.pipeline", "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:32.987479+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:33.212177+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:33.212252+00:00"}
{"message": "playout finished", "level": "DEBUG", "name": "livekit.agents.pipeline", "speech_id": "045a914759a2", "interrupted": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:33.401319+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:33.401407+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:33.401432+00:00"}
{"message": "Checking if user question needs to be committed", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:33.401446+00:00"}
{"message": "User question not committed: conditions not met", "level": "DEBUG", "name": "livekit.agents.pipeline", "user_question": "", "interrupted": false, "user_committed": false, "pid": 25845, "job_id": "AJ_Wk6PhiMXZFnv", "timestamp": "2024-10-04T22:32:33.401457+00:00"}

martin-purplefish avatar Oct 04 '24 22:10 martin-purplefish

mmh interesting, tho the playout should have been started before the user question is committed. So you should have at least heard an answer

theomonnom avatar Oct 04 '24 22:10 theomonnom

I think it has something to do with interruption handling. It seems more likely if I start talking right as the agent is about to finish.

martin-purplefish avatar Oct 04 '24 23:10 martin-purplefish

It seems to be when there are two user messages being committed - the first one gets committed, but the second gets stuck. Check out this (they're reversed so read from the bottom upl

So the "Yeah." which I said first gets committed, but the "Sure that works." hangs until I reply again. And then it gets appended to the outstanding one. So I think it's... only one message gets committed at a time?

Could it be something like having multiple outstanding human transcripts to submit, and maybe the agent starts talking?

{
  "message": "User question not committed: conditions not met",
  "level": "DEBUG",
  "name": "livekit.agents.pipeline",
  "user_question": "Yeah.",
  "interrupted": false,
  "user_committed": true,
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:06.116402+00:00"
}
{
  "message": "User question committed successfully",
  "level": "DEBUG",
  "name": "livekit.agents.pipeline",
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:05.114075+00:00"
}
{
  "message": "Committing user transcript",
  "level": "DEBUG",
  "name": "livekit.agents.pipeline",
  "user_transcript": "Yeah.",
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:04.950239+00:00"
}
{
  "message": "Checking if user question needs to be committed",
  "level": "DEBUG",
  "name": "livekit.agents.pipeline",
  "user_question": "Yeah.",
  "interrupted": false,
  "user_committed": false,
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:04.950128+00:00"
}
{
  "message": "validated agent reply",
  "level": "DEBUG",
  "name": "livekit.agents.pipeline",
  "speech_id": "6279ab903224",
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:04.949863+00:00"
}
{
  "message": "Received end of speech, validating reply",
  "level": "DEBUG",
  "name": "livekit.agents.pipeline",
  "transcript": "Sure that works.",
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:04.573735+00:00"
}
{
  "message": "User stopped speaking",
  "level": "INFO",
  "name": "livekit.agents.pipeline",
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:04.573610+00:00"
}
{
  "message": "Not validating reply: human is still speaking",
  "level": "DEBUG",
  "name": "livekit.agents.pipeline",
  "transcript": "Sure that works.",
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:04.511658+00:00"
}
{
  "message": "User started speaking",
  "level": "INFO",
  "name": "livekit.agents.pipeline",
  "pid": 344,
  "job_id": "AJ_BC6jbvzYeVL7",
  "timestamp": "2024-10-05T01:22:02.492610+00:00"
}

martin-purplefish avatar Oct 05 '24 01:10 martin-purplefish

Another hypothesis:

  1. The AI synthesis kicks off but latency is high so it takes a while to get the first token.
  2. The user talks and finishes before the agent speaks.
  3. The agent speaks (in response to the previous message, having not seen what the user said). The user's message in 2 is "stuck open".
  4. The user repeats themselves, and that causes their outstanding message to get committed.

martin-purplefish avatar Oct 05 '24 02:10 martin-purplefish

@theomonnom As per my debugging, this is difficult to reproduce but a common pattern/issue I think is due to STT (I am using deepgram)

I think it happens due to un-certainty of text to be transcribed, I was able to somewhat consistently reproduce this by saying a confusing word then making loud noises immediately, then my existing sentences is shown as " ...", it keeps being there until I say something else

probably STT being uncertain what next to come? One pattern also.. You can say exact different sentence and your existing word/sentence (before being stuck), can be replaced/changed with better now-spoken version

hari01584 avatar Oct 05 '24 04:10 hari01584

@theomonnom what seems to happen is that _play_speech is waiting with an empty transcript, while there's an actual transcript behind it. so until the user says something that causes the empty transcript to get committed, it's waiting in the _speech_q

martin-purplefish avatar Oct 05 '24 10:10 martin-purplefish

This is weird, it basically means that Deepgram is never sending final transcripts. Only interim transcripts

theomonnom avatar Oct 07 '24 01:10 theomonnom

That would make a lot of sense, I will try changing providers to Azure maybe and report If it fixes anything

hari01584 avatar Oct 07 '24 01:10 hari01584

EDIT: We do not receive this problem after switching to Azure STT

hari01584 avatar Feb 04 '25 07:02 hari01584