Go uprobe attachment causes application segfault in certain scenarios
We've received a report that an application running filebeat v7.17.5 experiences segfaults on calls to crypto/tls.(*Conn).Write after deploying Pixie. These crashes disappear if Pixie is redeployed with Go TLS tracing disabled (via PX_STIRLING_DISABLE_GOLANG_TLS_TRACING, added in #1534). An example stack trace from an affected application is shown below:
2023-05-18T10:42:34.883Z INFO [publisher_pipeline_output] pipeline/output.go:151 Connection to failover(backoff(async(tcp://XXXXXX:443)),backoff(async(tcp://YYYYY:5044))) established
unexpected fault address 0xeb7dde57
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x2 addr=0xeb7dde57 pc=0x8ddf310]
goroutine 76 [running]:
runtime.throw({0xa6c6911, 0x5})
/usr/local/go/src/runtime/panic.go:992 +0x6a fp=0xc7bcd40 sp=0xc7bcd2c pc=0x8a8b4da
runtime.sigpanic()
/usr/local/go/src/runtime/signal_unix.go:825 +0x1e7 fp=0xc7bcd58 sp=0xc7bcd40 pc=0x8aa2387
crypto/tls.(*Conn).Write(0xc4eba00, {0xc77c800, 0x217, 0x3e2})
/usr/local/go/src/crypto/tls/conn.go:1107 fp=0xc7bcd5c sp=0xc7bcd58 pc=0x8ddf310
github.com/elastic/beats/v7/libbeat/common/transport.(*Client).Write(0xc440e60, {0xc77c800, 0x217, 0x3e2})
/go/src/github.com/elastic/beats/libbeat/common/transport/client.go:149 +0x57 fp=0xc7bcd80 sp=0xc7bcd5c pc=0x8f69b47
github.com/elastic/go-lumber/client/v2.(*Client).Send(0xc7ae280, {0xc7b24f0, 0x1, 0x1})
/go/pkg/mod/github.com/elastic/[email protected]/client/v2/client.go:146 +0x383 fp=0xc7bcdc4 sp=0xc7bcd80 pc=0x96340f3
github.com/elastic/go-lumber/client/v2.(*AsyncClient).Send(0xc7caac8, 0xc7b24f8, {0xc7b24f0, 0x1, 0x1})
/go/pkg/mod/github.com/elastic/[email protected]/client/v2/async.go:103 +0x3f fp=0xc7bcdf8 sp=0xc7bcdc4 pc=0x963364f
github.com/elastic/beats/v7/libbeat/outputs/logstash.(*asyncClient).sendEvents(0xc87f350, 0xc6051a0, {0xc64e000, 0x1, 0x800})
/go/src/github.com/elastic/beats/libbeat/outputs/logstash/async.go:222 +0x14b fp=0xc7bce20 sp=0xc7bcdf8 pc=0x9635b8b
github.com/elastic/beats/v7/libbeat/outputs/logstash.(*asyncClient).Publish(0xc87f350, {0xab0ab84, 0xc44a088}, {0xe4d995c0, 0xc7ae020})
/go/src/github.com/elastic/beats/libbeat/outputs/logstash/async.go:167 +0x24c fp=0xc7bcea4 sp=0xc7bce20 pc=0x96355dc
github.com/elastic/beats/v7/libbeat/outputs.(*backoffClient).Publish(0xc535fb0, {0xab0ab84, 0xc44a088}, {0xe4d995c0, 0xc7ae020})
/go/src/github.com/elastic/beats/libbeat/outputs/backoff.go:61 +0x4a fp=0xc7bcecc sp=0xc7bcea4 pc=0x92824da
github.com/elastic/beats/v7/libbeat/outputs.(*failoverClient).Publish(0xc712c80, {0xab0ab84, 0xc44a088}, {0xe4d995c0, 0xc7ae020})
/go/src/github.com/elastic/beats/libbeat/outputs/failover.go:100 +0x5d fp=0xc7bceec sp=0xc7bcecc pc=0x928288d
github.com/elastic/beats/v7/libbeat/publisher/pipeline.(*netClientWorker).publishBatch(0xc87f9b0, {0xe4d995c0, 0xc7ae020})
/go/src/github.com/elastic/beats/libbeat/publisher/pipeline/output.go:176 +0x1cc fp=0xc7bcf8c sp=0xc7bceec pc=0x93dbe1c
github.com/elastic/beats/v7/libbeat/publisher/pipeline.(*netClientWorker).run(0xc87f9b0)
/go/src/github.com/elastic/beats/libbeat/publisher/pipeline/output.go:161 +0xe9 fp=0xc7bcfe8 sp=0xc7bcf8c pc=0x93db969
github.com/elastic/beats/v7/libbeat/publisher/pipeline.makeClientWorker.func1()
/go/src/github.com/elastic/beats/libbeat/publisher/pipeline/output.go:79 +0x2a fp=0xc7bcff0 sp=0xc7bcfe8 pc=0x93db62a
runtime.goexit()
/usr/local/go/src/runtime/asm_386.s:1326 +0x1 fp=0xc7bcff4 sp=0xc7bcff0 pc=0x8abe811
created by github.com/elastic/beats/v7/libbeat/publisher/pipeline.makeClientWorker
/go/src/github.com/elastic/beats/libbeat/publisher/pipeline/output.go:79 +0x217
This indicates that there are certain applications where the Go uprobe attachment causes application crashes. At the moment, we don't understand which types of Go applications are susceptible to this. I attempted to reproduce the issue by deploying filebeat v7.17.10 (the latest v7.17.x release available upstream) but was unsuccessful.
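The stack trace suggests that this might be a 32-bit Go binary: the runtime.goexit frame resolves to the 386-specific assembly file asm_386.s, and the fp/sp/pc values all fit in 32 bits, as seen in the following lines: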
runtime.goexit()
/usr/local/go/src/runtime/asm_386.s:1326 +0x1 fp=0xc7bcff4 sp=0xc7bcff0 pc=0x8abe811
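For triaging similar reports, the architecture hint above could be extracted from a pasted traceback automatically. The following is only a rough sketch, not part of Pixie: the helper names `asmSuffixFromTraceback` and `is32BitAsmSuffix` are hypothetical, and the list of 32-bit suffixes is an assumption based on the Go runtime's asm_<arch>.s file naming and is not exhaustive.

```go
package main

import (
	"fmt"
	"regexp"
)

// asmFileRe matches the per-architecture runtime assembly file that shows up
// in Go tracebacks, e.g. ".../src/runtime/asm_386.s:1326".
var asmFileRe = regexp.MustCompile(`/runtime/asm_([a-z0-9]+)\.s:`)

// asmSuffixFromTraceback (hypothetical helper) returns the <arch> part of the
// first runtime/asm_<arch>.s frame in the traceback, or "" if none is found.
func asmSuffixFromTraceback(trace string) string {
	m := asmFileRe.FindStringSubmatch(trace)
	if m == nil {
		return ""
	}
	return m[1]
}

// is32BitAsmSuffix (hypothetical helper) reports whether the suffix belongs
// to a 32-bit port. Assumption: only 386, arm and mipsx are listed here.
func is32BitAsmSuffix(suffix string) bool {
	switch suffix {
	case "386", "arm", "mipsx":
		return true
	}
	return false
}

func main() {
	// The two lines quoted above from the reported traceback.
	trace := "runtime.goexit()\n" +
		"/usr/local/go/src/runtime/asm_386.s:1326 +0x1 fp=0xc7bcff4 sp=0xc7bcff0 pc=0x8abe811\n"

	suffix := asmSuffixFromTraceback(trace)
	fmt.Printf("arch=%s 32-bit=%v\n", suffix, is32BitAsmSuffix(suffix))
	// prints "arch=386 32-bit=true"
}
```

This only reads the traceback text. Confirming the architecture of the actual binary would still require inspecting the executable itself.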