Segmentation fault from node-webrtc when accessing RTCPeerConnection.localDescription

Open dguenther opened this issue 4 years ago • 7 comments

What version of this package are you using?

  • v9.11.0

What operating system, Node.js, and npm version?

I haven't tested this in other environments.

  • macOS 11.3
  • Node.js v14.16.1
  • npm 6.14.12
  • wrtc 0.4.7

What happened?

When creating, connecting, and destroying several SimplePeer instances in Node.js, wrtc crashes with a segmentation fault:

PID 4228 received SIGSEGV for address: 0x0
0   segfault-handler.node               0x00000001046bdfb0 _ZL16segfault_handleriP9__siginfoPv + 304
1   libsystem_platform.dylib            0x00007fff203b3d7d _sigtramp + 29
2   ???                                 0x0000000200583232 0x0 + 8595714610
3   wrtc.node                           0x00000001151e4e86 _ZNK6webrtc22JsepSessionDescription8ToStringEPNSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEE + 38
4   wrtc.node                           0x0000000115028fe2 _ZN11node_webrtc9ConverterIPKN6webrtc27SessionDescriptionInterfaceENS_25RTCSessionDescriptionInitEE7ConvertES4_ + 82
5   wrtc.node                           0x0000000115029b60 _ZN11node_webrtc9ConverterINSt3__14pairIN4Napi3EnvEPKN6webrtc27SessionDescriptionInterfaceEEENS3_5ValueEE7ConvertES9_ + 48
6   wrtc.node                           0x000000011510416f _ZN11node_webrtc17RTCPeerConnection19GetLocalDescriptionERKN4Napi12CallbackInfoE + 111
7   wrtc.node                           0x0000000115119bdb _ZZN4Napi10ObjectWrapIN11node_webrtc17RTCPeerConnectionEE29InstanceGetterCallbackWrapperEP10napi_env__P20napi_callback_info__ENKUlvE_clEv + 139
8   wrtc.node                           0x0000000115119a9a _ZN4Napi10ObjectWrapIN11node_webrtc17RTCPeerConnectionEE29InstanceGetterCallbackWrapperEP10napi_env__P20napi_callback_info__ + 42
9   node                                0x000000010006b94a _ZN6v8impl12_GLOBAL__N_123FunctionCallbackWrapper6InvokeERKN2v820FunctionCallbackInfoINS2_5ValueEEE + 122
10  node                                0x0000000100a0bacd Builtins_CallApiCallback + 173
[1]    4227 segmentation fault  npm start

Reproduction case

I created a repository with a demo, and also pasted example code below. Unfortunately it's not deterministic, but when running it in 4 windows, it crashes before 1000 iterations in at least one of them.

https://github.com/dguenther/simple-peer-issue-demo

require('segfault-handler').registerHandler('segfault.log')
const SimplePeer = require('simple-peer')
const wrtc = require('wrtc')

const LOOP_TIME_MS = 70

function getRandomInt(min, max) {
  min = Math.ceil(min);
  max = Math.floor(max);
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

let iteration = 0

const initiators = []
const recipients = []

async function eventLoop() {
  console.log(`Iteration ${++iteration}`)
  
  while (initiators.length > 20) {
    const conn = initiators.splice(getRandomInt(0, initiators.length - 1), 1)[0]
    conn.destroy()
  }
  
  while (recipients.length > 20) {
    const conn = recipients.splice(getRandomInt(0, recipients.length - 1), 1)[0]
    conn.destroy()
  }
  
  for (let i = 0; i < 4; i++) {
    const recip = new SimplePeer({ initiator: false, wrtc })
    const init = new SimplePeer({ initiator: true, wrtc })

    recip.on('signal', (signal) => {
      if (!init.destroyed) init.signal(signal)
    })
    init.on('signal', (signal) => {
      if (!recip.destroyed) recip.signal(signal)
    })
    
    initiators.push(init)
    recipients.push(recip)
  }
  
  setTimeout(eventLoop, LOOP_TIME_MS)
}

eventLoop()

What did you expect to happen?

No crash 😄 Since ultimately it should be node-webrtc's responsibility to manage itself without crashing, I've created an issue here: https://github.com/node-webrtc/node-webrtc/issues/696

However, I noticed that removing this._pc.localDescription from this line fixes the crash:

https://github.com/feross/simple-peer/blob/d972548299a50f836ca91c36e39304ef0f9474b7/index.js#L618

Several WebRTC examples seem to pass localDescription to the other peer rather than passing the offer itself, so I wasn't sure if there was a reason for that, or if this is a viable workaround.
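The workaround above can be sketched in isolation. This is a minimal illustration, not simple-peer's actual code: `pickSignal` and `makeFakePc` are hypothetical names, and the fake object stands in for a real RTCPeerConnection so the snippet runs without wrtc. It only shows the difference between reading the `localDescription` getter (the call that appears in the stack trace as `GetLocalDescription`) and sending the offer object directly.

```javascript
// Hypothetical helper, not part of simple-peer: decide which description to
// send to the remote peer once the local offer has been created and applied.
function pickSignal (pc, offer, { readLocalDescription = true } = {}) {
  if (readLocalDescription) {
    // This getter is the native call that shows up in the crash trace.
    const local = pc.localDescription
    if (local) return local
  }
  // Workaround path: never touch the getter, send the offer as created.
  return offer
}

// Stand-in for a peer connection (assumption: only the getter matters here).
// It counts how often localDescription is read.
function makeFakePc (description) {
  let reads = 0
  return {
    get localDescription () {
      reads++
      return description
    },
    get reads () {
      return reads
    }
  }
}

const offer = { type: 'offer', sdp: 'placeholder offer sdp' }
const pc = makeFakePc({ type: 'offer', sdp: 'placeholder local sdp' })

// Workaround: the getter is never read, the original offer is sent.
const workaroundSignal = pickSignal(pc, offer, { readLocalDescription: false })

// Default behaviour: the getter is read once and its value is sent.
const defaultSignal = pickSignal(pc, offer)

console.log(workaroundSignal.sdp, defaultSignal.sdp, pc.reads)
```

One likely reason examples send `localDescription` instead of the offer is that the local description may have been updated with ICE candidates gathered after the offer was created, so the two objects are not guaranteed to be interchangeable when trickle ICE is disabled. That is an assumption worth checking before relying on the workaround.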

Are you willing to submit a pull request to fix this bug?

👍 Yep, if one is necessary.

dguenther avatar May 13 '21 23:05 dguenther

@dguenther This appears to be an issue with the wrtc library itself. I tried your workaround of removing this._pc.localDescription from line 618, but I still got the segmentation fault. Is node-webrtc even being maintained anymore? It has been 3 years since any update to the code. Is there an alternative library?

If it helps, I'm using Node v20.3.1 on a 2020 MacBook Pro running macOS 13.4.1 (22F82).

draeder avatar Jul 06 '23 14:07 draeder

I've been running node-datachannel successfully for a while, but I've only tested the data channels, not the media support.

Another option I've seen is werift-webrtc. When I tested it a while ago, the performance was not great relative to node-datachannel and node-webrtc, but judging by the GitHub issues, it looks like there have been some improvements since.

dguenther avatar Jul 06 '23 15:07 dguenther

@dguenther Perfect! werift appears to be a drop-in replacement for node-webrtc that works great with simple-peer. Many thanks!

draeder avatar Jul 06 '23 18:07 draeder

@dguenther Well, crud. werift-webrtc has the same segmentation fault issue as node-webrtc.

draeder avatar Jul 06 '23 22:07 draeder

Is there an issue for it? Offhand I wouldn't expect that, since as far as I know werift doesn't use any native modules.

dguenther avatar Jul 06 '23 23:07 dguenther

So, I think the error was in my own code, with both node-webrtc and werift-webrtc.

I am using simple-peer to create a partial mesh, and I think my partial mesh peer rebalancing logic created the problem. My partial mesh network has been up and running since last night with no errors or crashes in either Node or the browser. I am back to using node-webrtc.

My research into this bug suggests that it occurs when peers quickly disconnect/reconnect repeatedly. I think that's what was happening in my case.
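If rapid disconnect/reconnect cycles are the trigger, one possible mitigation is to stop peers from being destroyed immediately after they are created. The following is a minimal sketch under that assumption; `createChurnGuard`, `track`, and `destroyWhenSafe` are hypothetical names, not part of simple-peer or wrtc, and nothing in this thread verifies that it prevents the segfault. The clock and scheduler are injectable so the example runs with a fake peer and no timers.

```javascript
// Hypothetical helper: delay destroy() until a peer has existed for at least
// minLifetimeMs, to reduce rapid create/destroy churn.
function createChurnGuard ({
  minLifetimeMs = 1000,
  now = Date.now,
  schedule = setTimeout
} = {}) {
  // Creation time per peer; WeakMap so tracked peers can be garbage collected.
  const createdAt = new WeakMap()

  return {
    // Record when the peer was created and hand it back to the caller.
    track (peer) {
      createdAt.set(peer, now())
      return peer
    },

    // Destroy now if the peer is old enough (or untracked), otherwise
    // schedule destroy() for the remaining time. Returns the delay in ms.
    destroyWhenSafe (peer) {
      const age = now() - (createdAt.get(peer) ?? 0)
      const wait = Math.max(0, minLifetimeMs - age)
      if (wait === 0) {
        peer.destroy()
      } else {
        schedule(() => peer.destroy(), wait)
      }
      return wait
    }
  }
}

// Demo with a fake clock, a fake scheduler, and a fake peer object.
let fakeNow = 0
const scheduled = []
const guard = createChurnGuard({
  minLifetimeMs: 1000,
  now: () => fakeNow,
  schedule: (fn, ms) => scheduled.push({ fn, ms })
})

const fakePeer = {
  destroyed: false,
  destroy () {
    this.destroyed = true
  }
}

guard.track(fakePeer)
fakeNow = 300 // the peer is 300 ms old when a rebalance wants to drop it
const waitMs = guard.destroyWhenSafe(fakePeer)
```

With real peers, the same guard would wrap `new SimplePeer(...)` in `track` and replace direct `conn.destroy()` calls with `destroyWhenSafe(conn)`, using the default clock and `setTimeout`.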

draeder avatar Jul 07 '23 14:07 draeder

@dguenther Here's what I found out. I am building on webtorrent-hybrid, which uses node-webrtc v0.4.6. Not knowing that, I thought I still needed to pass in an instance of wrtc. Furthermore, I thought I needed to develop a partial-mesh networking protocol; I didn't realize that webtorrent-hybrid already handles that for me. The crash is a memory-addressing issue, and the conflicts I describe appear to have caused it.

draeder avatar Jul 10 '23 23:07 draeder