mediasoup icon indicating copy to clipboard operation
mediasoup copied to clipboard

Fix handling of padding only packets

Open ggarber opened this issue 3 years ago • 21 comments

WebRTC uses padding-only packets for testing available bandwidth specially when using simulcast (to test if it is possible to enable a higher layer).

With the current handling in mediasoup those packets are: 1/ Forwarded to the consumers (wasting bandwidth) 2/ Included in the bitrate calculations of a layer. That means: a) The spatial layer is marked as active even if there is no media but only padding associated with it b) The bitrate of the spatial layer is estimated as higher than actually needed

Because of those issues there are cases where the SimulcastConsumer keeps sending the higher layer even if it only has padding bytes and that makes the video freeze in the receiver side because there are no new real frames received.

We have received different customer reports of video freezing for long periods that look associated to this issue.

With this PR: 1/ Padding-only packets are never forwarded 2/ Padding only packets are not included in bitrate counters and neither to mark a RtpStream as active

As far as I can tell this is only an issue if RTX is disabled but I didn't investigate that much.

ggarber avatar Mar 12 '22 21:03 ggarber

I'll fully review tomorrow, but I miss similar changes in SimpleConsumer and DirectConsumer for consistency.

Thx a looot Iñaki! I just addressed the other comments but what is a DirectConsumer and where is the code? I apologize for the probably stupid question.

ggarber avatar Mar 12 '22 21:03 ggarber

Just wondering why just not discard the packet on the transport itself. Any reason no to simply do that?

diff --git a/worker/src/RTC/Transport.cpp b/worker/src/RTC/Transport.cpp
index b6cf99740..52e6138b5 100644
--- a/worker/src/RTC/Transport.cpp
+++ b/worker/src/RTC/Transport.cpp
@@ -1685,6 +1685,14 @@ namespace RTC
                        return;
                }

+               // Ignore padding only packets as they won't be forwarded.
+               if (packet->GetPayloadLength() > 0)
+               {
+                       MS_DEBUG_DEV(rtp, "ignoring padding only packet [ssrc:%" PRIu32 "]", packet->GetSsrc());
+
+                       delete packet;
+               }
+
                // MS_DEBUG_DEV(
                //   "RTP packet received [ssrc:%" PRIu32 ", payloadType:%" PRIu8 ", producerId:%s]",
                //   packet->GetSsrc(),

jmillan avatar Mar 13 '22 06:03 jmillan

Just wondering why just not discard the packet on the transport itself. Any reason no to simply do that?

Auto response: In order to update the transmission counters in the RtpStreamRecv.

These packets should never reach to consumers, hence, should be ignored right after RtpStreamRecv processes them:

diff --git a/worker/src/RTC/Producer.cpp b/worker/src/RTC/Producer.cpp
index 603d9410b..ee3ddd3ff 100644
--- a/worker/src/RTC/Producer.cpp
+++ b/worker/src/RTC/Producer.cpp
@@ -657,6 +657,14 @@ namespace RTC
                        MS_ABORT("found stream does not match received packet");
                }

+               // Ignore padding only packets as they won't be forwarded.
+               if (packet->GetPayloadLength() > 0)
+               {
+                       MS_DEBUG_DEV(rtp, "ignoring padding only packet [ssrc:%" PRIu32 "]", packet->GetSsrc());
+
+                       return ReceiveRtpPacketResult::IGNORE;
+               }
+
                if (packet->IsKeyFrame())
                {
                        MS_DEBUG_TAG(
diff --git a/worker/src/RTC/Transport.cpp b/worker/src/RTC/Transport.cpp
index b6cf99740..323d4f25f 100644
--- a/worker/src/RTC/Transport.cpp
+++ b/worker/src/RTC/Transport.cpp
@@ -1706,6 +1706,8 @@ namespace RTC
                                // Tell the child class to remove this SSRC.
                                RecvStreamClosed(packet->GetSsrc());
                                break;
+                       case RTC::Producer::ReceiveRtpPacketResult::IGNORE:
+                               break;
                        default:;
                }

jmillan avatar Mar 13 '22 07:03 jmillan

Just wondering why just not discard the packet on the transport itself. Any reason no to simply do that?

Auto response: In order to update the transmission counters in the RtpStreamRecv.

These packets should never reach to consumers, hence, should be ignored right after RtpStreamRecv processes them:

diff --git a/worker/src/RTC/Producer.cpp b/worker/src/RTC/Producer.cpp
index 603d9410b..ee3ddd3ff 100644
--- a/worker/src/RTC/Producer.cpp
+++ b/worker/src/RTC/Producer.cpp
@@ -657,6 +657,14 @@ namespace RTC
                        MS_ABORT("found stream does not match received packet");
                }

+               // Ignore padding only packets as they won't be forwarded.
+               if (packet->GetPayloadLength() > 0)
+               {
+                       MS_DEBUG_DEV(rtp, "ignoring padding only packet [ssrc:%" PRIu32 "]", packet->GetSsrc());
+
+                       return ReceiveRtpPacketResult::IGNORE;
+               }
+
                if (packet->IsKeyFrame())
                {
                        MS_DEBUG_TAG(
diff --git a/worker/src/RTC/Transport.cpp b/worker/src/RTC/Transport.cpp
index b6cf99740..323d4f25f 100644
--- a/worker/src/RTC/Transport.cpp
+++ b/worker/src/RTC/Transport.cpp
@@ -1706,6 +1706,8 @@ namespace RTC
                                // Tell the child class to remove this SSRC.
                                RecvStreamClosed(packet->GetSsrc());
                                break;
+                       case RTC::Producer::ReceiveRtpPacketResult::IGNORE:
+                               break;
                        default:;
                }

That looks better but i was worried about the gaps in the sequence numbers that discarding the packet before the Consumer could create.

Is it ok if the consumer receives a packet with seqnum X and the next one is seqnum X+2 ?

ggarber avatar Mar 13 '22 10:03 ggarber

Is it ok if the consumer receives a packet with seqnum X and the next one is seqnum X+2 ?

I think so, it already happen with UDP sometimes anyway?

nazar-pc avatar Mar 13 '22 20:03 nazar-pc

That looks better but i was worried about the gaps in the sequence numbers that discarding the packet before the Consumer could create.

That's right @ggarber. We need to consider that in Consumers otherwise Endpoints may end up sending NACK requests that will never be satisfied. Also, those gaps may result also in frozen video which is not desirable.

@ggarber, what is the nature of those all-padding RTP packets? Does libwebrtc send them in a legit media stream?

jmillan avatar Mar 14 '22 14:03 jmillan

That looks better but i was worried about the gaps in the sequence numbers that discarding the packet before the Consumer could create.

That's right @ggarber. We need to consider that in Consumers otherwise Endpoints may end up sending NACK requests that will never be satisfied. Also, those gaps may result also in frozen video which is not desirable.

@ggarber, what is the nature of those all-padding RTP packets? Does libwebrtc send them in a legit media stream?

Yes, they are sent in a legit media stream. Libwebrtc uses them to test if it can start sending a higher spatial layer. For example if your bandwidth estimation is 300kbps and the lowest layer is 200kbps and to activate the next layer requires 350 kbps it will send some padding in the highest layer to try to get a higher bandwidth estimation. I think it is the same or very similar as the mediasoup probation but happening in the media ssrc instead of a different one.

ggarber avatar Mar 14 '22 18:03 ggarber

If we don't relay those payload-empty packets to consumers we are gonna generate NACKs and hence frozen video. Do I miss something?

ibc avatar Mar 14 '22 18:03 ibc

If we don't relay those payload-empty packets to consumers we are gonna generate NACKs and hence frozen video. Do I miss something?

They need to be filtered like when you filter temporal layers (no seqnum gaps). That's why I do the this->rtpSeqManager.Drop(packet->GetSequenceNumber()); that i'm assuming does that.

ggarber avatar Mar 15 '22 11:03 ggarber

And just to clarify it this is not a theoretical issue or just an optimization. We have several real users reports experiencing video freezing for more than 1 minute when simulcast is used, rtx is not enabled and their bandwidth estimation is ~300kbps. We think this is the reason (the higher layer stays active but sending only padding) as it matches the logs/metrics we get from those users.

Although a valid answer is "we don't care about scenarios with rtx disabled".

ggarber avatar Mar 15 '22 11:03 ggarber

If we don't relay those payload-empty packets to consumers we are gonna generate NACKs and hence frozen video. Do I miss something?

They need to be filtered like when you filter temporal layers (no seqnum gaps). That's why I do the this->rtpSeqManager.Drop(packet->GetSequenceNumber()); that i'm assuming does that.

Right. @jmillan may you please confirm this is ok?

ibc avatar Mar 15 '22 11:03 ibc

Regarding DirectConsumer: sorry., such a thing doesn't exist. I meant DirectTransport but it's unrelated here.

ibc avatar Mar 15 '22 11:03 ibc

make format missing in worker/ folder.

ibc avatar Mar 15 '22 11:03 ibc

They need to be filtered like when you filter temporal layers (no seqnum gaps). That's why I do the this->rtpSeqManager.Drop(packet->GetSequenceNumber()); that i'm assuming does that.

That's correct :+1:

jmillan avatar Mar 15 '22 15:03 jmillan

Please @ggarber, assign me for review after the requested changes are done.

jmillan avatar Mar 15 '22 15:03 jmillan

Please @ggarber, assign me for review after the requested changes are done.

@jmillan I think I don't have permissions to assign reviewers but it is ready for whenever you have time to review it. Thx a lot as usual.

ggarber avatar Mar 30 '22 20:03 ggarber

compilation is failing

jmillan avatar Mar 31 '22 06:03 jmillan

Re-reviewing this PR with @jmillan tomorrow.

ibc avatar Apr 20 '22 09:04 ibc

@ggarber, friendly ping.

jmillan avatar May 25 '22 10:05 jmillan

Hi, I'm seeing similar freezings still in production (even with RTX enabled) so it looks like the issue is bigger than what I thought and I'm back at trying to land it.

The new cases I've seen are apparently when the producer is resized and the number of spatial layers go from 2 to 1. I think in that case libwebrtc is also using padding some times for the disabled layer.

I think I have addressed all the pending comments. Let me know if something else is missing from me here. Thank you!

ggarber avatar Jan 25 '23 15:01 ggarber

I see a problem here. I may be wrong. We are dropping those packets on receipt but are those dropped packet account for Producer stream bitrate? Because, if so, when the XxxConsumer checks the current bitrate of the Producer stream (to see if it can forward this stream to the receiver due to BWE) the XxxxConsumer would get a bitrate value higher than it should be. As I said I may be wrong but worth checking it.

That may indeed be the cause for the issue @ggarber was seeing here https://github.com/versatica/mediasoup/pull/793#issuecomment-1403828446.

The only missing thing here is to not account padding-only packets for increasing RTP data counters nor for activating the streams.

This patch would do it @ggarber

diff --git a/worker/src/RTC/RtpStreamRecv.cpp b/worker/src/RTC/RtpStreamRecv.cpp
index ba9e50c6d..89c00e3b7 100644
--- a/worker/src/RTC/RtpStreamRecv.cpp
+++ b/worker/src/RTC/RtpStreamRecv.cpp
@@ -303,6 +303,13 @@ namespace RTC
                // Calculate Jitter.
                CalculateJitter(packet->GetTimestamp());

+               // Padding only packet, do not consider it for counter increase nor
+               // stream activation.
+               if (packet->GetPayloadLength() == 0)
+               {
+                       return true;
+               }
+
                // Increase transmission counter.
                this->transmissionCounter.Update(packet);

@@ -413,6 +420,13 @@ namespace RTC
                // NACKed packet.
                if (this->nackGenerator->ReceivePacket(packet, /*isRecovered*/ true))
                {
+                       // Padding only packet, do not consider it for counter increase nor
+                       // stream activation.
+                       if (packet->GetPayloadLength() == 0)
+                       {
+                               return true;
+                       }
+
                        // Mark the packet as repaired.
                        RTC::RtpStream::PacketRepaired(packet);

I believe this PR would be complete with this missing part.

jmillan avatar Jan 17 '24 16:01 jmillan