Omit SIGKILL event in centos7 with kernel 3.10.0
I think sysdig may be omitting some events on CentOS 7 with kernel 3.10.0.
Runtime Information
System: CentOS Linux release 7.9.2009 (Core)
Kernel: 3.10.0-1160.el7.x86_64
Sysdig: 0.35.1
Docker:
Client: Docker Engine - Community
 Version: 20.10.23
 API version: 1.40
 Go version: go1.18.10
 Git commit: 7155243
 Built: Thu Jan 19 17:36:21 2023
 OS/Arch: linux/amd64
 Context: default
 Experimental: true
Server: Docker Engine - Community
 Engine:
  Version: 19.03.15
  API version: 1.40 (minimum version 1.12)
  Go version: go1.13.15
  Git commit: 99e3ed8919
  Built: Sat Jan 30 03:16:33 2021
  OS/Arch: linux/amd64
  Experimental: true
 containerd:
  Version: 1.6.15
  GitCommit: 5b842e528e99d4d4c1686467debf2bd4b88ecd86
 runc:
  Version: 1.0.3
  GitCommit: v1.0.3-0-gf46b6ba
 docker-init:
  Version: 0.18.0
  GitCommit: fec3683
Problem
I want to detect SIGKILL signal events inside a container, so I ran the command below on the host machine.
sysdig evt.type=kill and evt.arg.sig=SIGKILL
Below is the output.
kill inside container
Then I executed a kill command inside a k8s Docker container running on this host machine.
The kill command's system call event does not appear in the first picture.
kill inside host machine
I still cannot see the kill command's event.
Conclusion
It seems that sysdig does not work reliably on CentOS 7 with kernel 3.10.0. Is there a way to make it work?
Hi @liuyaqiu! I just tried but I cannot reproduce the issue with Docker. Could you give it a try with Docker only? Does the host have any particular configuration? Is the host under high load?
The host machine has 256 CPU cores and the load average is about 60. I don't know whether that is too high.
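One rough way to judge whether a load average is high is to divide it by the number of CPU cores. The sketch below is only an illustration: `load_per_core` is a hypothetical helper, and the figures 60 and 256 are the ones reported above.

```python
import os


def load_per_core(load_avg: float, cpu_count: int) -> float:
    """Return the load average normalized by the number of CPU cores."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_avg / cpu_count


# Figures reported in this thread: load average of about 60 on 256 cores.
print(f"reported host: {load_per_core(60, 256):.2f} per core")  # prints 0.23

# The same calculation for the machine this script runs on (Unix only).
one_minute, _, _ = os.getloadavg()
print(f"this host: {load_per_core(one_minute, os.cpu_count() or 1):.2f} per core")
```

A value well below 1.0 suggests the CPUs are not saturated. Load average does not measure the system call rate directly, though, so a moderately loaded host can still generate events faster than a capture tool consumes them.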
Does this issue happen all the time?
Differences from the previous situation
- The host machine's load average is lower than before.
- The sysdig package has been removed from the host machine.
- sysdig is running inside a Docker container (using the official image sysdig/sysdig:0.31.5).
Now the host machine load average is low:
I can now see the SIGKILL events promptly, and sysdig works well. I wrote a Python script that sleeps and then kills itself:
import os
import signal
import time
print("My PID is:", os.getpid())
# Sleep for 10 seconds
time.sleep(10)
# Kill self with SIGKILL
os.kill(os.getpid(), signal.SIGKILL)
I run this script inside another container.
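A variant that may be easier to check is sketched below, assuming that a parent killing a child is an acceptable stand-in for the self-kill above. The sender survives the signal, so it can confirm that the child really died from SIGKILL. `spawn_and_kill` is a hypothetical helper name, not part of any tool mentioned in this thread.

```python
import signal
import subprocess
import sys


def spawn_and_kill() -> int:
    """Start a sleeping child, send it SIGKILL, and return its exit status."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    print("Child PID is:", child.pid)
    # Sends SIGKILL to the child, which issues a kill system call.
    child.send_signal(signal.SIGKILL)
    # On POSIX, a child terminated by signal N reports exit status -N.
    return child.wait()


if __name__ == "__main__":
    status = spawn_and_kill()
    print("Child exit status:", status)
```

If the returned status equals `-signal.SIGKILL` but no matching event appears in the capture, the signal was delivered and the event was most likely missed by the capture rather than never generated.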
sysdig also runs inside a container, and the sysdig package has been removed from the host machine.
sysdig evt.type=kill and evt.arg.sig=SIGKILL
The sender PID is in the host machine's root namespace; the receiver PID is in the container's namespace.
@therealbobo Everything seems to be working well now. Thanks for your help. I will keep an eye on this to see whether the problem reproduces.
The only thing I could think of (other than a bug in the drivers) is that sysdig is dropping events, probably due to the syscall buffer being too small. Currently sysdig doesn't support a variable size ring buffer. I'll work on it. Please ping me if the problem shows up under low load. Thank you for bringing this to our attention! @liuyaqiu 😄
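To illustrate why a fixed-size buffer can lose events under load, here is a toy model. It is a conceptual sketch only, not sysdig's driver code: `FixedRingBuffer` and `simulate` are hypothetical names, and the model assumes that a new event is discarded whenever the buffer is full.

```python
from collections import deque


class FixedRingBuffer:
    """Toy fixed-capacity event buffer that drops new events when full."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.events = deque()
        self.dropped = 0

    def push(self, event: str) -> bool:
        """Store an event, or count it as dropped if there is no room."""
        if len(self.events) >= self.capacity:
            self.dropped += 1
            return False
        self.events.append(event)
        return True

    def pop(self):
        """Return the oldest event, or None if the buffer is empty."""
        return self.events.popleft() if self.events else None


def simulate(capacity, produced_per_tick, consumed_per_tick, ticks):
    """Return (events seen by the consumer, events dropped)."""
    buf = FixedRingBuffer(capacity)
    seen = 0
    counter = 0
    for _ in range(ticks):
        # Producer side: the kernel emitting system call events.
        for _ in range(produced_per_tick):
            buf.push(f"evt-{counter}")
            counter += 1
        # Consumer side: the capture tool draining the buffer.
        for _ in range(consumed_per_tick):
            if buf.pop() is not None:
                seen += 1
    return seen, buf.dropped


print(simulate(capacity=8, produced_per_tick=4, consumed_per_tick=4, ticks=10))
# (40, 0): the consumer keeps up, nothing is dropped
print(simulate(capacity=8, produced_per_tick=12, consumed_per_tick=4, ticks=10))
# (40, 76): the producer outpaces the consumer, most events are lost
```

In the second run any individual event, such as a single kill call, has a good chance of being among the dropped ones. That would be consistent with the events going missing on the busy host and showing up once the load went down.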