Fix process leakage in container delete when sharing a pid namespace
If container b joins container a's pid namespace, deleting container b may leak processes that belong to container b.
For example:
Use busybox image as rootfs, with start arg ["sleep", "100000"]:
For container a: use a new pid namespace without a path: "namespaces": [{"type": "pid"}
root@demo:/opt/busybox# ../runc/runc run -d a
root@demo:/opt/busybox# ../runc/runc list
ID PID STATUS BUNDLE CREATED OWNER
a 3162 running /opt/busybox 2019-04-08T11:34:02.449638033Z root
root@demo:/opt/busybox# ../runc/runc ps a
UID PID PPID C STIME TTY TIME CMD
root 3162 1 0 19:34 ? 00:00:00 sleep 100000
For container b: join container a's pid namespace by path: "namespaces": [{"type": "pid", "path": "/proc/3162/ns/pid"}
root@demo:/opt/busybox# ../runc/runc run -d b
root@demo:/opt/busybox# ../runc/runc list
ID PID STATUS BUNDLE CREATED OWNER
a 3162 running /opt/busybox 2019-04-08T11:34:02.449638033Z root
b 3581 running /opt/busybox 2019-04-08T11:35:05.241568752Z root
root@demo:/opt/busybox# ../runc/runc exec -d b sleep 20000
root@demo:/opt/busybox# ../runc/runc ps b
UID PID PPID C STIME TTY TIME CMD
root 3581 1 0 19:35 ? 00:00:00 sleep 100000
root 6256 1 0 19:42 ? 00:00:00 sleep 20000
root@demo:/opt/busybox# ../runc/runc kill b 9
root@demo:/opt/busybox# ../runc/runc list
ID PID STATUS BUNDLE CREATED OWNER
a 3162 running /opt/busybox 2019-04-08T11:34:02.449638033Z root
b 0 stopped /opt/busybox 2019-04-08T11:35:05.241568752Z root
root@demo:/opt/busybox# ../runc/runc delete b
root@demo:/opt/busybox# ps -ef | grep 6256
root 6256 1 0 19:42 ? 00:00:00 sleep 20000
And container b's cgroup path is not deleted.
Signed-off-by: Lifubang [email protected]
There are 2 ways to fix this problem:
- use signalAllProcesses when destroying the container;
- throw the error to the runc delete method.
For this PR I chose the first way; if the maintainers think the second way is better, I'll change the code.
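To illustrate the idea behind the first option, here is a rough standalone sketch, not the code in this PR. It assumes the leftover processes can be found by reading the container's cgroup.procs file (one pid per line) and signals each of them. The names parsePids and killCgroupProcs are hypothetical, and the real signalAllProcesses helper in runc likely does more (for example guarding against processes forking while the signals are sent), which is omitted here.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"syscall"
)

// parsePids parses the contents of a cgroup.procs file,
// which lists one pid per line. (Hypothetical helper.)
func parsePids(data string) ([]int, error) {
	var pids []int
	for _, field := range strings.Fields(data) {
		pid, err := strconv.Atoi(field)
		if err != nil {
			return nil, fmt.Errorf("invalid pid %q: %w", field, err)
		}
		pids = append(pids, pid)
	}
	return pids, nil
}

// killCgroupProcs sends sig to every pid listed in procsFile.
// (Hypothetical helper; assumes procsFile is a cgroup.procs file.)
func killCgroupProcs(procsFile string, sig syscall.Signal) error {
	data, err := os.ReadFile(procsFile)
	if err != nil {
		return err
	}
	pids, err := parsePids(string(data))
	if err != nil {
		return err
	}
	for _, pid := range pids {
		// ESRCH means the process already exited, which is fine here.
		if err := syscall.Kill(pid, sig); err != nil && err != syscall.ESRCH {
			return fmt.Errorf("kill %d: %w", pid, err)
		}
	}
	return nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: killprocs <path to cgroup.procs>")
		os.Exit(2)
	}
	if err := killCgroupProcs(os.Args[1], syscall.SIGKILL); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

In the repro above, pointing something like this at container b's cgroup.procs before removing the cgroup would also reach the exec'd sleep 20000 (pid 6256), which killing only the init process of b does not, because b's init is not the init of the shared pid namespace.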
@lifubang can you please amend the commit message with the info about the repro (i.e. what you have in this PR description)?
Otherwise LGTM
@cyphar @AkihiroSuda PTAL
can you please amend the commit message with the info about the repro
updated
Seems to be supported here. This should be closed. cc @kolyshkin
Yes, this was independently fixed by https://github.com/opencontainers/runc/pull/2085