More robust mechanism for getting strace events from gVisor
error: "Failed to parse openat args: AT_FDCWD /tmp/pip-install-ok6jlfy9/adafruit-circuitpython-busdevice_f20711ccac244452962292b5236c4d8f, 0x7f176abd3d40 , O_RDONLY|O_CLOEXEC, 0o0) = 0 (0x0) errno=2 (no such file or directory) (2.417µs"
Log entry:
{
"jsonPayload": {
"error": "Failed to parse openat args: AT_FDCWD /tmp/pip-install-ok6jlfy9/adafruit-circuitpython-busdevice_f20711ccac244452962292b5236c4d8f, 0x7f176abd3d40 , O_RDONLY|O_CLOEXEC, 0o0) = 0 (0x0) errno=2 (no such file or directory) (2.417µs",
"timestamp": "2022-05-03T21:57:55.238696724Z",
"caller": "strace/strace.go:264",
"message": "Failed to parse syscall"
},
"resource": {
"type": "k8s_container",
"labels": {
"namespace_name": "default",
"cluster_name": "analysis-cluster",
"project_id": "ossf-malware-analysis",
"pod_name": "workers-set-28",
"location": "us-central1-c",
"container_name": "worker"
}
},
"timestamp": "2022-05-03T21:57:55.238935409Z",
"severity": "WARNING",
"labels": {
},
"logName": "projects/ossf-malware-analysis/logs/stderr",
"sourceLocation": {
"file": "/src/internal/strace/strace.go",
"line": "264",
"function": "github.com/ossf/package-analysis/internal/strace.Parse"
},
"receiveTimestamp": "2022-05-03T21:57:56.544994861Z"
}
There are a few issue it seems:
- empty paths are not handled correctly.
- paths with commas are not handled correctly.
Another slightly different example demonstrating the comma issue:
{
"insertId": "dr8h7nh1k8azcrot",
"jsonPayload": {
"timestamp": "2022-05-10T20:33:07.991624188Z",
"caller": "strace/strace.go:264",
"message": "Failed to parse syscall",
"error": "Failed to parse openat args: AT_FDCWD /app, 0x517d230 /app/node_modules/@videojs/http-streaming/node_modules/@videojs/vhs-utils/test/fixtures/formats/webm/av01.0.00M.08.0.110,vorbis.webm, O_WRONLY|O_CLOEXEC|O_CREAT|O_TRUNC, 0o666) = 18 (0x12) (2.911638ms"
},
"resource": {
"type": "k8s_container",
"labels": {
"location": "us-central1-c",
"namespace_name": "default",
"container_name": "worker",
"cluster_name": "analysis-cluster",
"pod_name": "workers-set-32",
"project_id": "ossf-malware-analysis"
}
},
"timestamp": "2022-05-10T20:33:07.991874992Z",
"severity": "WARNING",
"labels": {
"compute.googleapis.com/resource_name": "gke-analysis-cluster-default-pool-d94f0570-lqmn",
"k8s-pod/app": "workers",
"k8s-pod/statefulset_kubernetes_io/pod-name": "workers-set-32",
"k8s-pod/controller-revision-hash": "workers-set-5464844d7d"
},
"logName": "projects/ossf-malware-analysis/logs/stderr",
"sourceLocation": {
"file": "/src/internal/strace/strace.go",
"line": "264",
"function": "github.com/ossf/package-analysis/internal/strace.Parse"
},
"receiveTimestamp": "2022-05-10T20:33:08.885073572Z"
}
The strace parsing from GVisor needs much more improvement to avoid these parsing errors.
The plan is to look into GVisor's support for connecting to the sentry via a UNIX socket (see: https://github.com/google/gvisor/blob/master/pkg/sentry/seccheck/checkers/remote/remote.go)
I figured out possible a way to do this. The runsc events command supports the ability to -stream debug events.
By passing -strace and -strace-event to the runsc binary at start the strace events will be sent to a unix socket that can be connected to.
I was able to successfully get this working by manually invoking podman and runsc.
Given that we currently start the container before execing the analysis means we can easily start the strace stream before the analysis.
That definitely seems promising! Is there documentation on the format here?
I haven't found any documentation, but the code isn't terribly hard to follow (see https://github.com/google/gvisor/blob/master/pkg/eventchannel/processor.go and https://github.com/google/gvisor/blob/master/pkg/eventchannel/event.go).
It's a binary stream consisting of a binary.Uvarint() message length followed by an Any message from google.golang.org/protobuf/types/known/anypb.
The Any message encapsulates a DebugEvent from https://github.com/google/gvisor/blob/master/pkg/eventchannel/event.proto, and the Text field holds an instance of the Strace message from https://github.com/google/gvisor/blob/master/pkg/sentry/strace/strace.proto.
For reference, here is the output from runsc events -stream from /bin/bash -c "while true; do echo 'hi' ; done":
...
2022-08-04 08:49:43.044082455 +0000 UTC m=+5.865181584
gvisor.Strace {
process:"bash" function:"write" args:"0x1 host:[2]" args:"..." args:"0x3" exit:{return:"0x3" elapsed_ns:10208}}
2022-08-04 08:49:43.044128928 +0000 UTC m=+5.865228069
gvisor.Strace {
process:"bash" function:"write" args:"0x1 host:[2]" args:"" args:"0x3" enter:{}}
2022-08-04 08:49:43.044198949 +0000 UTC m=+5.865298085
gvisor.Strace {
process:"bash" function:"write" args:"0x1 host:[2]" args:"..." args:"0x3" exit:{return:"0x3" elapsed_ns:8108}}
...