client-sdk-js icon indicating copy to clipboard operation
client-sdk-js copied to clipboard

Failed to execute 'setRemoteDescription' on 'RTCPeerConnection': The RTCPeerConnection's signalingState is 'closed'.

Open HermanBilous opened this issue 1 year ago • 17 comments

Describe the bug

I have this issue that I cannot reproduce at all, but get a lot of errors in sentry for. Could you please take a look and point in a direction to how to fix it. Here's the log I can extract from sentry.

Reproduction

N/A

Logs

[
  {
    "level": "warning",
    "data": {
      "arguments": [
        "websocket closed",
        {
          "code": 1006,
          "identity": "<user_id>",
          "reason": "",
          "room": "room_b",
          "roomSid": "room_sid_b",
          "state": 1,
          "wasClean": false
        }
      ]
    }
  },
  {
    "level": "warning",
    "data": {
      "arguments": [
        "signal disconnected",
        {
          "identity": "<user_id>",
          "room": "room_b",
          "roomSid": "room_sid_b"
        }
      ]
    }
  },
  {
    "level": "info",
    "data": {
      "arguments": [
        "resuming signal connection, attempt 0",
        {
          "identity": "<user_id>",
          "room": "room_b",
          "roomSid": "room_sid_b"
        }
      ]
    }
  },
  {
    "level": "info",
    "data": {
      "arguments": [
        "Resuming signal connection",
        {
          "identity": "<user_id>",
          "room": "room_b",
          "roomSid": "room_sid_b"
        }
      ]
    }
  },
  {
    "level": "error",
    "data": {
      "arguments": [
        "Failed to execute 'setConfiguration' on 'RTCPeerConnection': The RTCPeerConnection's signalingState is 'closed'.",
        {
          "error": "[DOMException]",
          "identity": "<user_id>",
          "room": "room_b",
          "roomSid": "room_sid_b"
        }
      ]
    }
  },
  {
    "level": "warning",
    "data": {
      "arguments": [
        "reconnect disconnected",
        {
          "identity": "<user_id>",
          "room": "room_b",
          "roomSid": "room_sid_b"
        }
      ]
    }
  },
  {
    "level": "error",
    "data": {
      "arguments": [
        "unable to set offer",
        {
          "fields": "[Object]",
          "identity": "<user_id>",
          "room": "room_b",
          "roomSid": "room_sid_b"
        }
      ]
    }
  }
]


### System Info

```shell
Browser: any, but mostly Chrome
OS: Windows / MacOS

Severity

annoyance

Additional Information

No response

HermanBilous avatar Feb 28 '24 10:02 HermanBilous

thanks for the report! which SDK version is this? Did it start popping up after a specific upgrade?

It looks like the PeerConnection went into a closed state without that triggering a reconnect (which it should). On a signal resume the SDK then tries to update the PeerConnection's config, but that fails as it is already closed. Any logs that happen before the one's you shared might be helpful here as well

lukasIO avatar Feb 28 '24 11:02 lukasIO

I'm using 2.0.3 on dev and 1.15.10 on prod. I can see this issue reported on both environments. We had our sentry disabled for some time, but I remember having this issue on older versions of sdk as well.

We also get the same amount of errors like: InvalidStateError: Failed to execute 'addIceCandidate' on 'RTCPeerConnection': The RTCPeerConnection's signalingState is 'closed'. reported as well.

I found this code:

      if (e instanceof UnexpectedConnectionState) {
        this.log.debug('received unrecoverable error', { ...this.logContext, error: e });
        // unrecoverable
        recoverable = false;
      } else if (!(e instanceof SignalReconnectError)) {
        // cannot resume
        this.fullReconnectOnNext = true;
      }

It is executed when the Failed to execute 'setRemoteDescription'... error occurs. Judging by the reported error, the code above receives e of type SignalReconnectError, so the error is considered recoverable.

I'll dig for more logs.

HermanBilous avatar Feb 28 '24 12:02 HermanBilous

Thanks for the insight, that means that the clients to perform a reconnect then after hitting that error?

lukasIO avatar Feb 28 '24 12:02 lukasIO

If it's not full reconnect, it will only do a signal reconnect. Maybe a pc reconnect at this point is better?

Nevertheless, I could not find code that will do anything when onSignalingStatechange is triggered. I mean, it calls PCTransportManager.updateState, but then in mentioned function nothing gets done about changes in pc.signalingState. Maybe that's why it never triggers a reconnect?

I looked into logs from other events, but they all have the same error pattern from original post in common.

HermanBilous avatar Feb 29 '24 07:02 HermanBilous

Yeah, you're totally right. This code path failing should not be treated as a SignalReconnectError and instead force a full reconnect. Thanks for digging into this!

lukasIO avatar Feb 29 '24 09:02 lukasIO

Thanks for the PR. I will update my project once it's released, and will report back if it fixed my problem.

HermanBilous avatar Feb 29 '24 13:02 HermanBilous

So, the changes did not help, I think they made the issue even worse, since now I get 3 unhandled exceptions instead of 1. I created #1063 to at least add a catch clause on those errors. This should at least make the errors handled and not be reported by sentry, which helps my case. This won't help with the issue itself, however.

HermanBilous avatar Mar 05 '24 15:03 HermanBilous

Hey, I think we should close this one for now.. I thought that the fix went into the latest release somehow, I even read the release notes, and I swear I saw that PR in there.

HermanBilous avatar Mar 06 '24 07:03 HermanBilous

were you able to test the changes in the latest release?

lukasIO avatar Mar 07 '24 11:03 lukasIO

yes, the error is still present. From error trace I can see that it's coming from callbacks added in setupSignalClientCallbacks()

HermanBilous avatar Mar 07 '24 11:03 HermanBilous

could you provide the exact error and stack trace of an error that was raised in the version released yesterday (v2.0.5) ?

lukasIO avatar Mar 07 '24 11:03 lukasIO

The exact stack trace looks like this:

Error: Failed to execute 'setRemoteDescription' on 'RTCPeerConnection': The RTCPeerConnection's signalingState is 'closed'.
  at ss.<anonymous>(/assets/lk-Bsa7hxT1.js:1:214013)
  at Generator.throw(<anonymous>)
  at c(/assets/lk-Bsa7hxT1.js:1:61326)

As you can see, not much of a stacktrace on this error, because ts wrapped it in a generator. If I look into code, I see this at where the error was thrown:

ject.assign(Object.assign({},this.logContext),{fields:o})),new Hn(s)}})}}function os(i,e,t){let n=0;i.rtp.some(r=>r.codec==="opus"?(n=r.payl

Which leads me to: PCTransport.setMungedSDP throw new NegotiationError(msg);.

Looking into livekit's console logs:

websocket closed {
code: 1006,
participant: pid_1,
pID: PA_QgjpAGj2QoVn,
reason: ,
room: rid,
roomID: RM_3jcnGDm7UKM7,
state: 1,
wasClean: false
}
signal disconnected {
participant: pid_1,
pID: PA_QgjpAGj2QoVn,
room: rid,
roomID: RM_3jcnGDm7UKM7
}
resuming signal connection, attempt 0 {
participant: pid_1,
pID: PA_QgjpAGj2QoVn,
room: rid,
roomID: RM_3jcnGDm7UKM7
}
Resuming signal connection {
participant: pid_1,
pID: PA_QgjpAGj2QoVn,
room: rid,
roomID: RM_3jcnGDm7UKM7
}
reconnect disconnected {
participant: pid_1,
pID: PA_QgjpAGj2QoVn,
room: rid,
roomID: RM_3jcnGDm7UKM7
}
unable to set offer {
fields: [Object],
participant: pid_1,
pID: PA_QgjpAGj2QoVn,
room: rid,
roomID: RM_3jcnGDm7UKM7
}

Then

Error: Failed to execute 'setRemoteDescription' on 'RTCPeerConnection': The RTCPeerConnection's signalingState is 'closed'.

HermanBilous avatar Mar 08 '24 15:03 HermanBilous

Right after error on setRemoteDescription I get

InvalidStateError: Failed to execute 'addIceCandidate' on 'RTCPeerConnection': The RTCPeerConnection's signalingState is 'closed'.

It originates in client.onTrickle

HermanBilous avatar Mar 08 '24 15:03 HermanBilous

Thanks! How do the client logs continue after the setRemoteDescriptionError? Just trying to make sure that the reconnect logic behaves as expected now.

lukasIO avatar Mar 08 '24 15:03 lukasIO

Hm, that's a good question. Unfortunately, we don't have tracking of what's happening after the error, I'm not even sure if this is possible with sentry. But, a good thing is that there are only 1 such error reported, not 3 of them consecutively, like I saw before.

HermanBilous avatar Mar 08 '24 15:03 HermanBilous

It would be great if we were to replicate this issue in an isolated manner. Any hints on what kind of client (OS + browser combination) runs into this most frequently or what might be causing this reconnect pattern in the first place?

lukasIO avatar Mar 08 '24 15:03 lukasIO

I think this is somehow connected to the server. E.g. room migrates, or lk server becomes unavailable for some reason and shutdowns abruptly. As for the browser, I see that it happens on chrome / edge, on windows, mac and even linux. The livekit-server we use is 1.5.0

HermanBilous avatar Mar 09 '24 07:03 HermanBilous

@HermanBilous I'll close this issue for now. We haven't heard any similar reports. If you're still seeing this issue with the latest client+server versions feel free to let me know and I'll reopen it.

lukasIO avatar Aug 26 '24 12:08 lukasIO

I've noticed this happen quite a bit ("livekit-client": "^2.8.0"). It seems occur when I move my laptop between home and office wifi or home and hotspot.

In saying that, I haven't been able to reproduce it by toggling between home wifi and hotspot.

mdelbo avatar Jan 23 '25 22:01 mdelbo

Hi @lukasIO,

I’m encountering this issue on the latest versions of livekit client & server. Logs show the same error pattern (setRemoteDescription failing due to the signaling state being closed).

Would it be possible to reopen this issue for further investigation?

Let me know if you need more details!

Error: Failed to execute 'setRemoteDescription' on 'RTCPeerConnection': The RTCPeerConnection's signalingState is 'closed'.
    at PCTransport.eval (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:15208:15)
    at Generator.throw (<anonymous>)
    at rejected (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:6713:65)
InvalidStateError: Failed to execute 'addIceCandidate' on 'RTCPeerConnection': The RTCPeerConnection's signalingState is 'closed'.
    at PCTransport.eval (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:14887:24)
    at Generator.next (<anonymous>)
    at eval (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:6715:71)
    at new Promise (<anonymous>)
    at __awaiter (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:6711:12)
    at PCTransport.addIceCandidate (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:14885:12)
    at PCTransportManager.eval (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:15506:31)
    at Generator.next (<anonymous>)
    at eval (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:6715:71)
    at new Promise (<anonymous>)
    at __awaiter (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:6711:12)
    at PCTransportManager.addIceCandidate (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:15502:12)
    at client.onTrickle (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:17267:22)
    at SignalClient.handleSignalResponse (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:13868:14)
    at SignalClient.eval (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:13605:16)
    at Generator.next (<anonymous>)
    at eval (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:6715:71)
    at new Promise (<anonymous>)
    at __awaiter (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:6711:12)
    at ws.onmessage (webpack-internal:///./node_modules/livekit-client/dist/livekit-client.esm.mjs:13549:35)

mdelbo avatar Jan 27 '25 22:01 mdelbo

For further investigation we'd need a way to (somewhat) reliably reproduce this issue.

If you have a way of reproducing it on latest client feel free to open a new issue with all the details, thanks!

lukasIO avatar Jan 28 '25 07:01 lukasIO