NPE under load
I did some stress testing with WebSockets and occasionally get a NullPointerException:
edu_gemini_seqexec_web_server[ERROR] java.lang.NullPointerException
edu_gemini_seqexec_web_server[ERROR] at org.http4s.blaze.util.BufferTools$.go$3(BufferTools.scala:194)
edu_gemini_seqexec_web_server[ERROR] at org.http4s.blaze.util.BufferTools$.areDirectOrEmpty(BufferTools.scala:202)
edu_gemini_seqexec_web_server[ERROR] at org.http4s.blaze.channel.nio1.NIO1SocketServerGroup$SocketChannelHead.performWrite(NIO1SocketServerGroup.scala:221)
edu_gemini_seqexec_web_server[ERROR] at org.http4s.blaze.channel.nio1.NIO1HeadStage.writeReady(NIO1HeadStage.scala:102)
edu_gemini_seqexec_web_server[ERROR] at org.http4s.blaze.channel.nio1.SelectorLoop.run(SelectorLoop.scala:131)
I'm testing this with http4s version 0.16.x, running several simultaneous clients using artillery.
This seems likely considering the error rates seen in the TechEmpower benchmarks for both Http4s and Blaze. I had just never been able to reproduce it locally.
Maybe I can reproduce my test with a minimal example. Let me try to get that done soon.
I wonder if, at high load, write interests are leaking through even after a socket has unregistered its interest in the write operation. If so, https://github.com/http4s/blaze/blob/v0.12.6/core/src/main/scala/org/http4s/blaze/channel/nio1/NIO1HeadStage.scala#L101 would end up with a null.
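To make the hypothesis concrete, here is a minimal sketch of the suspected failure mode. It is not blaze's actual code: `SketchHeadStage`, `pending`, `queueWrite`, and both `writeReady*` methods are hypothetical names, and the sketch assumes the pending write buffers are held in a field that is reset to null once a write completes. Under that assumption, a stale write-ready event arriving after the reset dereferences null, which would match the trace above.

```scala
import java.nio.ByteBuffer

// Hypothetical stand-in for a head stage; NOT blaze's actual implementation.
final class SketchHeadStage {
  // Buffers for the in-flight write; null when no write is pending (assumption).
  private var pending: Array[ByteBuffer] = null

  def queueWrite(buffers: Array[ByteBuffer]): Unit = pending = buffers

  // Sums the remaining bytes; throws NullPointerException if `buffers` is null.
  private def totalRemaining(buffers: Array[ByteBuffer]): Int = {
    var i = 0
    var n = 0
    while (i < buffers.length) {
      n += buffers(i).remaining()
      i += 1
    }
    n
  }

  // Unguarded: assumes every write-ready event has a pending write behind it.
  def writeReadyUnguarded(): Int = {
    val n = totalRemaining(pending)
    pending = null
    n
  }

  // Guarded: treats a stale write-ready event as a no-op.
  def writeReadyGuarded(): Int =
    if (pending == null) 0
    else {
      val n = totalRemaining(pending)
      pending = null
      n
    }
}
```

If this is what happens, a null check of this kind would hide the symptom, but the underlying question would remain why the selector still reports write readiness for that key.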
I can reproduce it against http4s' BlazeWebSocketExample running from sbt, using artillery and the configuration file below, with the command:
artillery run wsstress.yml
Note that I see the NPE on the release-0.16.x branch but not on master.
wsstress.yml:
config:
  target: "ws://localhost:8080/http4s/ws"
  phases:
    -
      duration: 120
      arrivalRate: 50
      rampTo: 100
  ws:
    # Ignore SSL certificate errors
    # - useful in *development* with self-signed certs
    rejectUnauthorized: false
scenarios:
  -
    engine: "ws"
    flow:
      -
        send: "hello"
      -
        think: 1
      -
        send: "world"