Does WebRTC need keepalive?
We need to further our understanding of the properties of WebRTC sessions. The specific question of this issue is: since Raiden messaging via WebRTC is non-continuous (unlike audio/video), do we need to add active keepalive messaging in order to keep NAT ports open?
My preliminary research suggests no, given an up-to-spec RTCPeerConnection implementation: https://hpbn.co/webrtc/#rtcpeerconnection
- [ ] Does our WebRTC implementation have a working keepalive?
- [ ] Test WebRTC through NAT
I would like to add an additional concern to this topic. Even though we probably do not need a keepalive to keep the connection open, there are cases where the connection is closed on one side while the other side doesn't notice.
There can be multiple reasons for this. One cause I have found so far is the Raiden node crashing.
The LC (or rather its JS WebRTC implementation) sometimes exhibits unexpected behavior where the channel appears to be open but messages are dropped (and CPU load sits at 100%).
A keepalive message would detect such failures and trigger the creation of a new channel.
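To illustrate the idea, here is a minimal sketch of an application-level keepalive over a data channel. All names (`KeepaliveMonitor`, `SendFn`, the `'ping'` payload) are hypothetical and not from the Raiden codebase; `send` stands in for `RTCDataChannel.send`:

```typescript
// Abstracts RTCDataChannel.send for the purpose of this sketch.
type SendFn = (msg: string) => void;

class KeepaliveMonitor {
  private awaitingPong = false;

  constructor(
    private send: SendFn,
    private onDead: () => void, // e.g. tear down and re-create the channel
  ) {}

  // Call periodically (e.g. via setInterval). If the previous ping was
  // never answered, the peer is considered gone even though the channel
  // still looks open.
  ping(): void {
    if (this.awaitingPong) {
      this.onDead();
      return;
    }
    this.awaitingPong = true;
    this.send('ping');
  }

  // Call from the channel's `onmessage` handler when a pong arrives.
  handlePong(): void {
    this.awaitingPong = false;
  }
}
```

The point is that the failure is detected by the absence of an answer, which covers both the crashed-node case and the "channel looks open but drops messages" case described above.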
At first sight it also seems that aiortc does not handle keepalive itself, see https://github.com/aiortc/aiortc/issues/225#issuecomment-555752962
Update: The error in the LC library does not seem to happen anymore. Apart from that, keepalive would have to be planned together with the LC as a feature, although I'm still not sure whether it is necessary. In a recent test a channel stayed open for 20 minutes without any messages being sent. In other tests, the connection broke (for reasons unknown).
@andrevmatos do you have any opinions on that?
I also see the ICE consent checks fail from time to time. This was also described in the linked issue, so it might be the case that our implementation, again, behaves differently from browser implementations.
I think WebRTC does have keepalive out of the box. We have not had issues with it, even with connections living for several minutes. We have occasionally seen connectionState become failed after some time, but that seems unrelated to keepalive and rather some internal connection issue, which was solved by closing and retrying the channel.
Do you remember what the failure looked like? I received "ICE consent check failed" a couple of times after a while.
We didn't get an error out of it. Only the connectionstatechange event was emitted when the state became failed (or similar), with no error raised, and we use that to identify the failure and tear down and retry the connection.
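The teardown-and-retry reaction described here can be sketched as follows. `PeerLike` is a minimal stand-in for `RTCPeerConnection`, and `watchConnection` is a hypothetical helper, not the actual LC code; the returned function plays the role of a `connectionstatechange` listener:

```typescript
// Minimal stand-in for the parts of RTCPeerConnection used here.
interface PeerLike {
  connectionState: string;
  close(): void;
}

function watchConnection(peer: PeerLike, retry: () => void): () => void {
  // To be registered as the `connectionstatechange` listener: no error is
  // ever thrown, the failure is visible only as a state transition.
  return () => {
    if (peer.connectionState === 'failed') {
      peer.close(); // teardown the broken connection
      retry();      // re-create the RTCPeerConnection from scratch
    }
  };
}
```

The design point matches the comment above: failure detection hinges on observing the state change event, not on catching an exception.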