Voicemail detection plugin / extension similar to Pipecat's VoicemailDetector
We run LiveKit Agents in production, handling thousands of answered calls per day. However, 30% of those calls are answered by voicemail or by iOS 26 Call Screening bots. This is getting worse day by day as more people turn this feature on.
We need a reliable and repeatable way to implement these features across agents. Generic function calling based approaches just don't work in the real world.
Pipecat has released a VoicemailDetector plugin that uses parallel LLM processing to detect voicemail and gate the bot's output accordingly.
https://docs.pipecat.ai/server/utilities/extensions/voicemail
https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/extensions/voicemail/voicemail_detector.py
I would love to see a similar implementation for LiveKit Agents as well.
Some use cases:
- Normal voicemail (iOS / non-iOS) - should leave a message and end the call
- iOS 26 Call Screening (Truecaller and Google Pixel are also bringing this) - should explain why we are calling, but end the call after a timeout
- Telecom announcement repeating "User has put your call on hold" - should end the call after a timeout
Currently I am trying a custom implementation that overrides llm_node to do the detect() step and tts_node to do the gate() step, inspired by the Pipecat implementation.
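For reference, the detect/gate split can be sketched without any framework. This is a minimal sketch under stated assumptions, not Pipecat's or LiveKit's API: `Verdict`, `gate`, `run_turn`, `keyword_classifier`, and `fake_tts` are all hypothetical names, and the keyword matcher only stands in for a small, dedicated LLM classifier.

```python
import asyncio
from enum import Enum
from typing import AsyncIterator, Awaitable, Callable, List


class Verdict(Enum):
    HUMAN = "human"
    VOICEMAIL = "voicemail"


# Hypothetical classifier signature: transcript in, verdict out.
Classifier = Callable[[str], Awaitable[Verdict]]


async def gate(
    frames: AsyncIterator[bytes], verdict: Awaitable[Verdict]
) -> AsyncIterator[bytes]:
    """Release TTS frames only once the classifier has said HUMAN."""
    if (await verdict) is not Verdict.HUMAN:
        return  # voicemail: drop the normal reply entirely
    async for frame in frames:
        yield frame


async def run_turn(
    transcript: str, classify: Classifier, reply_frames: AsyncIterator[bytes]
) -> List[bytes]:
    """Start classification in the background and gate the reply on its result."""
    verdict = asyncio.ensure_future(classify(transcript))
    return [frame async for frame in gate(reply_frames, verdict)]


async def keyword_classifier(transcript: str) -> Verdict:
    # Stand-in for a dedicated classification prompt (assumption, demo only).
    hints = ("leave a message", "after the tone", "after the beep")
    text = transcript.lower()
    return Verdict.VOICEMAIL if any(h in text for h in hints) else Verdict.HUMAN


async def fake_tts() -> AsyncIterator[bytes]:
    # Stand-in for synthesized audio frames (assumption, demo only).
    for word in ("Hello,", "this", "is", "the", "agent."):
        yield word.encode()
```

The point of the split is that the classifier gets its own short prompt and runs alongside the main conversation, so the main agent prompt does not have to carry the voicemail logic.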
> Generic function calling based approaches just don't work in the real world.
Do you have examples of where it failed to trigger the function call? What LLM are you using?
The example you've linked to is also using an LLM to make the same decision. It's unclear that it'll be any more effective.
I have already shared the use cases in the issue. By the way, we build multilingual bots that work across 10 different Indian languages.
We run llama-3.3-70b for some bots and gpt-oss-120b for the newer ones.
Core Problems:
- The bot doesn't listen to incoming audio before speaking its first message, mainly because the first message is not interruptible. So it almost always misses the voicemail audio.
- Once a voicemail is detected, handling it is a complex orchestration. It's not just voicemail; there can also be telecom announcements such as "The person you are trying to reach has put your call on hold". In all of these cases, normal operation has to stop and the situation has to be handled in parallel. If a human comes on the line within a timeout, the normal pipeline should resume.
This requires custom code, and the bot prompts are already complex enough. Giving these additional responsibilities to a 70b model just doesn't cut it.
I'm not saying the Pipecat way is the only good way, but custom orchestration is required to handle these cases gracefully.
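To make the orchestration concrete, here is a rough sketch of the decision logic as a small state holder kept outside the LLM prompt. Everything in it is an assumption for illustration: `Party`, `Action`, `CallSupervisor`, and the 20-second timeout are hypothetical, and the detection result is expected to come from whatever classifier is in use.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Party(Enum):
    HUMAN = auto()
    VOICEMAIL = auto()
    CALL_SCREENING = auto()
    HOLD_ANNOUNCEMENT = auto()


class Action(Enum):
    RUN_PIPELINE = auto()
    LEAVE_MESSAGE_AND_HANG_UP = auto()
    EXPLAIN_AND_WAIT = auto()
    WAIT = auto()
    HANG_UP = auto()


@dataclass
class CallSupervisor:
    wait_timeout_s: float = 20.0  # assumed value, tune per deployment
    _waiting_since: Optional[float] = None

    def on_detection(self, party: Party, now: float) -> Action:
        """Map the latest detection to an action; `now` is a clock reading in seconds."""
        if party is Party.HUMAN:
            # A human came on the line: clear the timer and resume normal operation.
            self._waiting_since = None
            return Action.RUN_PIPELINE
        if party is Party.VOICEMAIL:
            return Action.LEAVE_MESSAGE_AND_HANG_UP
        # Call screening or hold announcement: wait for a human, but only so long.
        if self._waiting_since is None:
            self._waiting_since = now
            if party is Party.CALL_SCREENING:
                return Action.EXPLAIN_AND_WAIT
            return Action.WAIT
        if now - self._waiting_since >= self.wait_timeout_s:
            return Action.HANG_UP
        return Action.WAIT
```

Passing `now` in explicitly (for example from `time.monotonic()`) keeps the logic deterministic and easy to unit test, independent of the audio pipeline.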
@arpanpreneur could you share your implementation by any chance?