[BUG] Audio track missing in LiveKit RoomComposite recording after 60 seconds
Describe the bug
Context:
We are using LiveKit Cloud and telephony to build a Voice AI agent with the livekit/agents framework. We also use LiveKit Cloud's RoomComposite egress to record the conversations.
Bug: The voice agent receives the audio correctly in real time, but for one of the tracks the recording captures no audio after 60 seconds into the call. As evidence, we have an audio recording from our dialer. We can share both recordings via Slack DM; we can't share them here because they contain PII:
- LiveKit recording which does not capture one of the tracks correctly
- Recording from our dialer.
This has happened for 2 calls that we have noticed so far. It might be happening for other calls that we are unaware of.
LiveKit Cloud Project ID: 3tqm7ro6kbs
Call 1:
- Room ID: RM_SeDnfWigYLpH
- Egress ID: EG_GQsZwJEBCJLG
Call 2:
- Room ID: RM_mYJKBVh8Q6MN
- Egress ID: EG_JyXnNGxp5M8Y
RoomComposite Request body for both the calls:
api.RoomCompositeEgressRequest(
    room_name=room.name,
    audio_only=True,
    audio_mixing=DUAL_CHANNEL_AGENT,
    file_outputs=[
        api.EncodedFileOutput(
            file_type=api.EncodedFileType.OGG,
            filepath=file_path,
            disable_manifest=True,
            s3=api.S3Upload(
                bucket=s3_bucket,
                region=s3_region,
                session_token=session_token,
                access_key=access_key,
                secret=secret_key,
            ),
        )
    ],
    webhooks=webhooks,
)
Egress Version: Using LiveKit Cloud
Egress Request: Shared in the code above (the parameters we use for the RoomComposite request)
Additional context: Shared above
Logs: Details shared above
https://github.com/livekit/egress/issues/952 - I had raised a similar issue, which was closed without an answer.
Log analysis shows that the LiveKit server recorded relatively high RTP timestamp drift on the participant's track that disappeared from the recording. To keep streams properly synchronized, the egress service maintains a live recording window. The RTP timestamp skew received from SIP was large enough that the affected track fell out of the live recording window and never came back. The telephony folks are looking into it so that we can get to the root cause of the published track drifting away. We can try building a workaround inside egress that, after some time, detects that a track has fallen behind and forcefully pulls it back into the present window. That might affect inter-track synchronization a bit, but it will at least preserve the affected track in the recording.
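To make the mechanism described above concrete, here is a minimal sketch of a live recording window with a pull-forward fallback. It is not the egress implementation: `TrackClock`, `WINDOW_SECONDS`, `MAX_LATE_PACKETS`, and the 8 kHz clock rate are all hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass

CLOCK_RATE = 8_000      # Hz; assumed clock rate of a narrowband telephony codec
WINDOW_SECONDS = 1.0    # hypothetical width of the live recording window
MAX_LATE_PACKETS = 50   # hypothetical threshold before a track is re-anchored


@dataclass
class TrackClock:
    """Maps a track's RTP timestamps onto a shared recording timeline."""

    first_rtp_ts: int
    first_arrival: float
    offset: float = 0.0    # correction added when the track is pulled forward
    late_packets: int = 0  # consecutive packets that missed the window

    def media_time(self, rtp_ts: int) -> float:
        # Mask to 32 bits so a single RTP timestamp wraparound is handled.
        ticks = (rtp_ts - self.first_rtp_ts) & 0xFFFFFFFF
        return self.first_arrival + ticks / CLOCK_RATE + self.offset

    def accept(self, rtp_ts: int, now: float) -> bool:
        """Return True if the packet should be written to the recording."""
        lag = now - self.media_time(rtp_ts)
        if lag <= WINDOW_SECONDS:
            self.late_packets = 0
            return True
        # The packet's media time is older than the window allows.
        self.late_packets += 1
        if self.late_packets >= MAX_LATE_PACKETS:
            # Workaround: shift the track so this packet lands at "now".
            self.offset += lag
            self.late_packets = 0
            return True
        return False
```

Without the re-anchoring branch, a track whose lag keeps growing is rejected forever, which matches the symptom of a channel going silent partway through the file. With it, the track returns at the cost of a one-time jump relative to the other track.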
@milos-lk - is it possible to add the workaround you mentioned to resolve this issue? This is data loss and a high-priority issue for us. Appreciate your help here.
Hello @milos-lk - any update on this issue?
Hi @zaheerabbas-prodigal, it looks like the remote end is sending media from two different IPs and switching SSRCs. That is what causes this. It's not technically off spec, but it usually leads to issues, and LiveKit works better if that isn't the case. Any chance they can send media from the same IP during the session?
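For anyone who wants to check their own capture for this pattern, here is a rough sketch that flags the points where the source IP or SSRC of an RTP stream changes. It assumes you have already extracted `(source_ip, udp_payload)` pairs from a capture taken on your side; `parse_rtp_header` and `find_source_switches` are hypothetical helpers, not part of any LiveKit tooling. The only protocol fact relied on is the fixed 12-byte RTP header layout from RFC 3550.

```python
import struct


def parse_rtp_header(payload: bytes):
    """Return (sequence_number, timestamp, ssrc) from a raw RTP packet."""
    if len(payload) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    # Fixed header: V/P/X/CC, M/PT, sequence, timestamp, SSRC (network order).
    first, _second, seq, timestamp, ssrc = struct.unpack("!BBHII", payload[:12])
    if first >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    return seq, timestamp, ssrc


def find_source_switches(packets):
    """List the packets at which the (source IP, SSRC) pair changes.

    `packets` is an iterable of (source_ip, udp_payload) tuples in
    arrival order. Each result is (packet_index, previous, current).
    """
    switches = []
    previous = None
    for index, (src_ip, payload) in enumerate(packets):
        _seq, _timestamp, ssrc = parse_rtp_header(payload)
        current = (src_ip, ssrc)
        if previous is not None and current != previous:
            switches.append((index, previous, current))
        previous = current
    return switches
```

An empty result means the stream kept one source IP and one SSRC for the whole capture. Each entry otherwise gives the packet index where the switch happened, which can be matched against the time the track dropped out of the recording.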
Thanks for the response @nishadmusthafa 🙏
The above calls are from TCN dialer, and I’m not fully sure how their media routing is implemented. It’s quite possible they are sending media from different IP addresses, if the spec supports it.
- From your comment, the issue seems to be caused because the remote end is sending media from two different IPs and switching SSRCs during the session.
- In an earlier comment, Milos mentioned that there was a high RTP timestamp drift, which caused the track to fall out of the live recording window and never return.
Could you help clarify:
- Root cause: Is the missing consumer audio track primarily caused by media being received from multiple IPs, by the RTP timestamp drift, or by a combination of both?
- Is there any timeline or ETA on LiveKit’s side for fixing the root cause of this issue?
Even if handling multiple IPs isn’t supported right away, would it be possible to at least mitigate this on the recording side (e.g., by recovering or preserving the affected track instead of dropping it)? Even a partial workaround would be super helpful here, since this is a high-priority issue for us and results in data loss after 60 seconds for one of the tracks.
It is due to a combination of both.
We are looking at a workaround on the egress side. We do not have an ETA for that, but we will update as soon as we have something to try.
@boks1971 - Thanks for the response. Is it possible to share the SIP PCAP and RTP PCAP files for the above 2 calls? This will help me check with the TCN team when media is sent from two different IP addresses, and help us narrow down the cases in which this might cause issues.
@zaheerabbas-prodigal we don't have RTP PCAPs for privacy reasons. We have the SIP PCAP. However, I can't upload the pcap here. Are you on the open source Slack? I'm nishad-lk. I can share the pcaps there.
To clarify, @zaheerabbas-prodigal, we saw the problem happening through our logging.
Hello team - any update on this issue?
Also, I had one clarification I wanted to check:
For the 2nd issue around RTP drift (where the RTP data is slightly delayed) — is the delay being introduced because TCN is sending the RTP packets in a delayed manner, or is it something happening on the LiveKit side during processing?
Basically, I’m trying to understand where the delay originates — the source (TCN) or the receiving/processing end (LiveKit).
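One way to approach this question with data, sketched under assumptions: if arrival times and RTP timestamps can be logged at a given point (for example at the dialer's outbound side and again at the receiver), the slope of media time against wall-clock time shows how fast the stream's clock is running at that point. `estimate_clock_skew` is a hypothetical helper, and the 8 kHz default clock rate is an assumption about the codec.

```python
def estimate_clock_skew(samples, clock_rate=8_000):
    """Estimate seconds of media time delivered per second of wall clock.

    `samples` is an ordered list of (arrival_seconds, rtp_timestamp).
    A result of 1.0 means no skew; below 1.0 means the media clock is
    falling behind wall-clock time. Assumes at most one 32-bit RTP
    timestamp wraparound relative to the first sample.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    t0, ts0 = samples[0]
    xs = [t - t0 for t, _ in samples]
    ys = [((ts - ts0) & 0xFFFFFFFF) / clock_rate for _, ts in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("all samples share the same arrival time")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Least-squares slope of media time against arrival time.
    return cov_xy / var_x
```

If the slope is already below 1.0 in samples taken where the media leaves the source, the skew originates there; if it only appears in samples taken at the receiving end, it is being introduced in transit or processing. This is only a diagnostic idea and says nothing about how LiveKit measures drift internally.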
For the first issue ("remote end is sending media from two different IPs and switching SSRCs"):
Is it possible to check how frequently the media comes from two different IPs, please?
This would help us a lot in how we frame the discussion with TCN.
@nishadmusthafa - I have raised a support ticket as per our discussions on Slack.
@nishadmusthafa @milos-lk - We saw this happen again, this time for a non-TCN source. Can we please check the reason for the missing recording here? Is this a different issue from the one reported here?
The below call did NOT have user track recording after 30 secs into the call. The agent track is correctly recorded
Project ID - p_4sen7zyruwn
Room ID - RM_ttokX9qjFE9J
Egress ID - EG_VqCbnce99Uz2
The below call had missing recording for user track in between the call - the recording is missing between 30 sec and 4 mins. The agent recording is correctly recorded
Project ID - p_4sen7zyruwn
Room ID - RM_m8Xw4fQHM2dM
Egress ID - EG_wYKFkjshVAgS
@milos-lk - This has happened for one more call from TCN source. Is this the same issue here? The recording of user track stopped after 90 secs.
Project ID - p_3tqm7ro6kbs
Room ID - RM_Ziyf9K9MPciR
Egress ID - EG_3oFMRZrafQ8T
Egress workaround has been applied - egress will now detect tracks falling behind and pull them forward. While not great, it's the best effort to preserve data.
Thanks for the update Milos
Hello team, just checking in to see if there is any update on fixing this issue at the SIP side here.