conmon leaves main container process in zombie state
I observed some strange behavior. We run Prometheus on our Kubernetes cluster, and it usually gets stuck when Kubernetes restarts the container.
CRI-O logs:
Mar 27 22:35:38 k8s-sys-w1 crio[1364]: time="2024-03-27 22:35:38.600423205+08:00" level=info msg="Stopping container: 03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6 (timeout: 600s)" id=c60a252b-7eb3-4337-96aa-398fb115db16 name=/runtime.v1.RuntimeService/StopContainer
Mar 27 22:45:38 k8s-sys-w1 crio[1364]: time="2024-03-27 22:45:38.615546831+08:00" level=warning msg="Stopping container 03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6 with stop signal timed out: timeout reached after 600 seconds waiting for container process to exit" id=c60a252b-7eb3-4337-96aa-398fb115db16 name=/runtime.v1.RuntimeService/StopContainer
Mar 27 22:47:38 k8s-sys-w1 crio[1364]: time="2024-03-27 22:47:38.945321438+08:00" level=info msg="Stopping container: 03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6 (timeout: 600s)" id=eb4ce11f-99d2-40d4-8048-68a9f918a943 name=/runtime.v1.RuntimeService/StopContainer
Mar 27 22:57:38 k8s-sys-w1 crio[1364]: time="2024-03-27 22:57:38.959805291+08:00" level=warning msg="Stopping container 03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6 with stop signal timed out: timeout reached after 600 seconds waiting for container process to exit" id=eb4ce11f-99d2-40d4-8048-68a9f918a943 name=/runtime.v1.RuntimeService/StopContainer
The prometheus process should be PID 1 in the container's PID namespace, so after it dies the whole namespace should be torn down by the kernel.
However, conmon leaves the prometheus process in the zombie state, and the container gets stuck.
$ sudo pstree -plTS 1859180
conmon(1859180)───prometheus(1859182,pid)
$ ps ufS 1859180 1859182
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1859180 0.0 0.0 70152 2284 ? Ss Mar27 0:00 /usr/bin/conmon <omitted...>
ansible 1859182 0.0 0.0 0 0 ? Zsl Mar27 0:00 \_ [prometheus] <defunct>
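To make the expectation above concrete, here is a minimal C sketch of the PID-namespace semantics I am referring to (illustrative only, not conmon's code; it needs root because CLONE_NEWPID requires CAP_SYS_ADMIN): the child becomes PID 1 of a new PID namespace, the kernel tears the namespace down when that PID 1 exits, and the parent in the outer namespace still has to reap it with waitpid(), otherwise it stays a zombie like the one above.
/* Minimal sketch (not conmon's code) of the PID-namespace behavior described above.
 * Needs root: CLONE_NEWPID requires CAP_SYS_ADMIN. The child becomes PID 1 of a
 * new PID namespace; when it exits the kernel tears the namespace down, but the
 * parent must still reap it with waitpid() or the child stays a zombie. */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

static char child_stack[1024 * 1024];

static int child_main(void *arg)
{
    (void)arg;
    printf("inside the new PID namespace, my pid = %d\n", getpid()); /* prints 1 */
    sleep(1);
    return 0; /* PID 1 exits -> kernel kills everything left in the namespace */
}

int main(void)
{
    pid_t pid = clone(child_main, child_stack + sizeof(child_stack),
                      CLONE_NEWPID | SIGCHLD, NULL);
    if (pid < 0) {
        perror("clone");
        return 1;
    }

    int status;
    /* Until this waitpid() runs, the dead child sits in the "Z" state,
     * which is exactly where the prometheus process above is stuck. */
    if (waitpid(pid, &status, 0) == pid)
        printf("reaped pid %d, status 0x%x\n", pid, status);
    return 0;
}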
I used strace to see what conmon was doing, while sending some SIGCHLD signals from another terminal.
$ sudo strace -p 1859180
strace: Process 1859180 attached
restart_syscall(<... resuming interrupted restart_syscall ...>) = 1
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
read(17, "\21\0\0\0\0\0\0\0\0\0\0\0\\\352(\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128) = 128
wait4(-1, 0x7ffec7059b70, WNOHANG, NULL) = 0
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
poll([{fd=5, events=POLLIN}, {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}], 7, -1) = 1 ([{fd=5, revents=POLLIN}])
read(5, "\2\0\0\0\0\0\0\0", 16) = 8
poll([{fd=5, events=POLLIN}, {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}], 7, -1) = 1 ([{fd=17, revents=POLLIN}])
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
read(17, "\21\0\0\0\0\0\0\0\0\0\0\0>\3)\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128) = 128
wait4(-1, 0x7ffec7059b70, WNOHANG, NULL) = 0
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
poll([{fd=5, events=POLLIN}, {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}], 7, -1) = 1 ([{fd=5, revents=POLLIN}])
read(5, "\2\0\0\0\0\0\0\0", 16) = 8
poll([{fd=5, events=POLLIN}, {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}], 7, -1
Although conmon did call wait4(), the kernel returned 0 (meaning no child was available to be reaped).
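For reference, below is a minimal sketch (illustrative only, not conmon's actual source) of the signalfd + wait4(WNOHANG) reaping pattern visible in the trace: SIGCHLD is blocked and delivered through a signalfd, and each notification triggers a non-blocking wait4(). If that wait4() keeps returning 0 even though the child is already a zombie, as it does in my trace, the zombie is never collected until another SIGCHLD notification arrives.
/* Sketch of the reaping pattern seen in the strace above (not conmon's code):
 * SIGCHLD arrives through a signalfd (the 128-byte read(17, ...) lines) and
 * each notification is followed by a non-blocking wait4(-1, ..., WNOHANG). */
#define _GNU_SOURCE
#include <poll.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/signalfd.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &mask, NULL);       /* deliver SIGCHLD only via signalfd */

    int sfd = signalfd(-1, &mask, SFD_CLOEXEC);

    if (fork() == 0) {                         /* stand-in for the container process */
        sleep(1);
        _exit(0);
    }

    for (;;) {
        struct pollfd pfd = { .fd = sfd, .events = POLLIN };
        poll(&pfd, 1, -1);

        struct signalfd_siginfo si;            /* 128 bytes, matching the trace */
        if (read(sfd, &si, sizeof(si)) != sizeof(si))
            continue;

        int status;
        pid_t pid = wait4(-1, &status, WNOHANG, NULL);
        /* In my trace this call kept returning 0 even though the child was
         * already a zombie, so nothing was reaped until a later SIGCHLD. */
        if (pid > 0) {
            printf("reaped %d, status 0x%x\n", pid, status);
            break;
        }
    }
    return 0;
}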
Some system information:
$ crun --version
crun version 1.9.2
commit: 35274d346d2e9ffeacb22cc11590b0266a23d634
rundir: /run/user/17247/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
$ conmon --version
conmon version 2.1.8
commit: 97585789fa36b1cfaf71f2d23b8add1a41f95f50
$ crictl -v
crictl version v1.26.0
$ uname -a
Linux k8s-sys-w1 4.18.0-526.el8.x86_64 #1 SMP Sat Nov 18 00:54:11 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Can you show the pod spec that encounters this? I also wonder if you could try https://github.com/cri-o/cri-o/pull/7910, which I think will at least stop cri-o from spitting out that error.
The pod spec of the Prometheus instance:
apiVersion: v1
kind: Pod
metadata:
annotations:
kubectl.kubernetes.io/default-container: prometheus
generateName: prometheus-k8s-
labels:
app.kubernetes.io/component: prometheus
app.kubernetes.io/instance: k8s
app.kubernetes.io/managed-by: prometheus-operator
app.kubernetes.io/name: prometheus
app.kubernetes.io/part-of: kube-prometheus
app.kubernetes.io/version: 2.46.0
controller-revision-hash: prometheus-k8s-6d767649c4
operator.prometheus.io/name: k8s
operator.prometheus.io/shard: "0"
prometheus: k8s
statefulset.kubernetes.io/pod-name: prometheus-k8s-0
name: prometheus-k8s-0
namespace: monitoring
spec:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- podAffinityTerm:
labelSelector:
matchLabels:
app.kubernetes.io/component: prometheus
app.kubernetes.io/instance: k8s
app.kubernetes.io/name: prometheus
app.kubernetes.io/part-of: kube-prometheus
namespaces:
- monitoring
topologyKey: kubernetes.io/hostname
weight: 100
automountServiceAccountToken: true
containers:
- args:
- --web.console.templates=/etc/prometheus/consoles
- --web.console.libraries=/etc/prometheus/console_libraries
- --config.file=/etc/prometheus/config_out/prometheus.env.yaml
- --web.enable-lifecycle
- --web.route-prefix=/
- --storage.tsdb.retention.time=30d
- --storage.tsdb.retention.size=100GiB
- --storage.tsdb.path=/prometheus
- --storage.tsdb.wal-compression
- --web.config.file=/etc/prometheus/web_config/web-config.yaml
- --storage.tsdb.max-block-duration=2h
- --storage.tsdb.min-block-duration=2h
image: quay.io/prometheus/prometheus:v2.46.0
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 6
httpGet:
path: /-/healthy
port: web
scheme: HTTP
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 3
name: prometheus
ports:
- containerPort: 9090
name: web
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /-/ready
port: web
scheme: HTTP
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 3
resources:
requests:
memory: 400Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
startupProbe:
failureThreshold: 60
httpGet:
path: /-/ready
port: web
scheme: HTTP
periodSeconds: 15
successThreshold: 1
timeoutSeconds: 3
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /etc/prometheus/config_out
name: config-out
readOnly: true
- mountPath: /etc/prometheus/certs
name: tls-assets
readOnly: true
- mountPath: /prometheus
name: prometheus-k8s-db
subPath: prometheus-db
- mountPath: /etc/prometheus/rules/prometheus-k8s-rulefiles-0
name: prometheus-k8s-rulefiles-0
- mountPath: /etc/prometheus/web_config/web-config.yaml
name: web-config
readOnly: true
subPath: web-config.yaml
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-8fjp4
readOnly: true
- args:
- --listen-address=:8080
- --reload-url=http://localhost:9090/-/reload
- --config-file=/etc/prometheus/config/prometheus.yaml.gz
- --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
- --watched-dir=/etc/prometheus/rules/prometheus-k8s-rulefiles-0
command:
- /bin/prometheus-config-reloader
env:
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: SHARD
value: "0"
image: quay.io/prometheus-operator/prometheus-config-reloader:v0.67.1
imagePullPolicy: IfNotPresent
name: config-reloader
ports:
- containerPort: 8080
name: reloader-web
protocol: TCP
resources:
limits:
cpu: 10m
memory: 50Mi
requests:
cpu: 10m
memory: 50Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /etc/prometheus/config
name: config
- mountPath: /etc/prometheus/config_out
name: config-out
- mountPath: /etc/prometheus/rules/prometheus-k8s-rulefiles-0
name: prometheus-k8s-rulefiles-0
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-8fjp4
readOnly: true
- args:
- sidecar
- --prometheus.url=http://localhost:9090/
- '--prometheus.http-client={"tls_config": {"insecure_skip_verify":true}}'
- --grpc-address=:10901
- --http-address=:10902
- --objstore.config=$(OBJSTORE_CONFIG)
- --tsdb.path=/prometheus
env:
- name: OBJSTORE_CONFIG
valueFrom:
secretKeyRef:
key: bucket.yaml
name: thanos-sidecar-objectstorage
image: quay.io/thanos/thanos:v0.31.0
imagePullPolicy: IfNotPresent
name: thanos-sidecar
ports:
- containerPort: 10902
name: http
protocol: TCP
- containerPort: 10901
name: grpc
protocol: TCP
resources: {}
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /prometheus
name: prometheus-k8s-db
subPath: prometheus-db
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-8fjp4
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
hostname: prometheus-k8s-0
initContainers:
- args:
- --watch-interval=0
- --listen-address=:8080
- --config-file=/etc/prometheus/config/prometheus.yaml.gz
- --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
- --watched-dir=/etc/prometheus/rules/prometheus-k8s-rulefiles-0
command:
- /bin/prometheus-config-reloader
env:
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: SHARD
value: "0"
image: quay.io/prometheus-operator/prometheus-config-reloader:v0.67.1
imagePullPolicy: IfNotPresent
name: init-config-reloader
ports:
- containerPort: 8080
name: reloader-web
protocol: TCP
resources:
limits:
cpu: 10m
memory: 50Mi
requests:
cpu: 10m
memory: 50Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /etc/prometheus/config
name: config
- mountPath: /etc/prometheus/config_out
name: config-out
- mountPath: /etc/prometheus/rules/prometheus-k8s-rulefiles-0
name: prometheus-k8s-rulefiles-0
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-8fjp4
readOnly: true
nodeSelector:
kubernetes.io/os: linux
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 2000
runAsNonRoot: true
runAsUser: 1000
serviceAccount: prometheus-k8s
serviceAccountName: prometheus-k8s
subdomain: prometheus-operated
terminationGracePeriodSeconds: 600
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- name: prometheus-k8s-db
persistentVolumeClaim:
claimName: prometheus-k8s-db-prometheus-k8s-0
- name: config
secret:
defaultMode: 420
secretName: prometheus-k8s
- name: tls-assets
projected:
defaultMode: 420
sources:
- secret:
name: prometheus-k8s-tls-assets-0
- emptyDir:
medium: Memory
name: config-out
- configMap:
defaultMode: 420
name: prometheus-k8s-rulefiles-0
name: prometheus-k8s-rulefiles-0
- name: web-config
secret:
defaultMode: 420
secretName: prometheus-k8s-web-config
- name: kube-api-access-8fjp4
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
The Prometheus container gets restarted when it fails the liveness probe (this is caused by performance issues on our side). Sometimes it gets stuck in the strange state described above, which leaves the pod unready and triggers our alerts. This had happened a few times before I reported this issue.
I found that the stuck prometheus process was finally reaped by conmon last night.
time="2024-03-29 17:32:41.616478684+08:00" level=info msg="Removed container 03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6: monitoring/prometheus-k8s-0/prometheus" id=648d7266-9fdd-4961-8f04-71e47ebd791f name=/runtime.v1.RuntimeService/RemoveContainer
And the strace I left running in tmux recorded what conmon did:
[continued from the strace above...] ) = 2 ([{fd=8, revents=POLLHUP}, {fd=10, revents=POLLHUP}])
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
close(8) = 0
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
close(10) = 0
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
poll([{fd=5, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}], 5, -1) = 1 ([{fd=5, revents=POLLIN}])
read(5, "\6\0\0\0\0\0\0\0", 16) = 8
poll([{fd=5, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}], 5, -1) = 1 ([{fd=17, revents=POLLIN}])
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
read(17, "\21\0\0\0\0\0\0\0\2\0\0\0n^\34\0\350\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128) = 128
wait4(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL}], WNOHANG, NULL) = 1859182
write(2, "[conmon:i]: container 1859182 ex"..., 53) = 53
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 8
connect(8, {sa_family=AF_UNIX, sun_path="/dev/log"}, 110) = 0
sendto(8, "<14>Mar 29 18:16:48 conmon: conm"..., 107, MSG_NOSIGNAL, NULL, 0) = 107
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
futex(0x55c164181bf0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
wait4(-1, 0x7ffec7059b70, WNOHANG, NULL) = -1 ECHILD (No child processes)
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
futex(0x55c164181bf0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
fsync(6) = 0
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
close(17) = 0
close(4) = 0
close(26) = 0
openat(AT_FDCWD, "/var/lib/containers/storage/overlay-containers/03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6/userdata/exit.IUZIL2", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 ENOENT (No such file or directory)
futex(0x7ff5ce103f78, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0x7ff5ce103f78, FUTEX_WAKE_PRIVATE, 2147483647) = 0
write(2, "[conmon:e]: Failed to write 137 "..., 244) = 244
sendto(8, "<11>Mar 29 18:16:48 conmon: conm"..., 298, MSG_NOSIGNAL, NULL, 0) = 298
unlink("/var/run/crio/03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6") = 0
wait4(-1, NULL, WNOHANG, NULL) = -1 ECHILD (No child processes)
exit_group(1) = ?
+++ exited with 1 +++
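For context on the 137 in the "Failed to write 137 ..." line: the wait4() above returned a status with WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL, and the conventional container exit code for a signal-terminated process is 128 + signal, i.e. 128 + 9 = 137. A small illustrative sketch (my own code, not conmon's):
/* Sketch (illustrative only): where the 137 comes from. A child killed by
 * SIGKILL yields WIFSIGNALED(status) with WTERMSIG(status) == SIGKILL, and
 * the conventional exit code is 128 + signal = 128 + 9 = 137. */
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        pause();                     /* child waits to be killed */
        _exit(0);
    }

    kill(pid, SIGKILL);

    int status;
    waitpid(pid, &status, 0);        /* analogous to the wait4() that finally returned a pid */

    if (WIFSIGNALED(status))
        printf("exit code: %d\n", 128 + WTERMSIG(status));   /* prints 137 */
    return 0;
}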
P.S. I forgot to provide the lsof output for the conmon process earlier.
$ sudo lsof -p 1859180
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
conmon 1859180 root cwd DIR 0,24 200 65836199 /run/containers/storage/overlay-containers/03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6/userdata
conmon 1859180 root rtd DIR 253,0 244 128 /
conmon 1859180 root txt REG 253,0 160080 521925 /usr/bin/conmon
conmon 1859180 root mem REG 253,0 543304 33565316 /usr/lib64/libpcre2-8.so.0.7.1
conmon 1859180 root mem REG 253,0 37240 33572838 /usr/lib64/libffi.so.6.0.2
conmon 1859180 root mem REG 253,0 145984 33571896 /usr/lib64/libgpg-error.so.0.24.2
conmon 1859180 root mem REG 253,0 172104 33565321 /usr/lib64/libselinux.so.1
conmon 1859180 root mem REG 253,0 33480 33572814 /usr/lib64/libuuid.so.1.3.0
conmon 1859180 root mem REG 253,0 343624 33857491 /usr/lib64/libblkid.so.1.1.0
conmon 1859180 root mem REG 253,0 1541856 33572802 /usr/lib64/libgmp.so.10.3.2
conmon 1859180 root mem REG 253,0 197728 33836053 /usr/lib64/libhogweed.so.4.5
conmon 1859180 root mem REG 253,0 239360 33836055 /usr/lib64/libnettle.so.6.5
conmon 1859180 root mem REG 253,0 78816 33835982 /usr/lib64/libtasn1.so.6.5.5
conmon 1859180 root mem REG 253,0 19128 34072407 /usr/lib64/libdl-2.28.so
conmon 1859180 root mem REG 253,0 1805368 33572015 /usr/lib64/libunistring.so.2.1.0
conmon 1859180 root mem REG 253,0 165624 33572822 /usr/lib64/libidn2.so.0.3.6
conmon 1859180 root mem REG 253,0 1246520 33572842 /usr/lib64/libp11-kit.so.0.3.0
conmon 1859180 root mem REG 253,0 1187312 33572817 /usr/lib64/libgcrypt.so.20.2.5
conmon 1859180 root mem REG 253,0 371384 33571984 /usr/lib64/libmount.so.1.1.0
conmon 1859180 root mem REG 253,0 33752 33571882 /usr/lib64/libcap.so.2.48
conmon 1859180 root mem REG 253,0 119760 33835813 /usr/lib64/liblz4.so.1.8.3
conmon 1859180 root mem REG 253,0 162192 33571872 /usr/lib64/liblzma.so.5.2.4
conmon 1859180 root mem REG 253,0 42744 34072419 /usr/lib64/librt-2.28.so
conmon 1859180 root mem REG 253,0 149984 34072415 /usr/lib64/libpthread-2.28.so
conmon 1859180 root mem REG 253,0 464936 33835930 /usr/lib64/libpcre.so.1.2.10
conmon 1859180 root mem REG 253,0 2051656 33839644 /usr/lib64/libgnutls.so.30.28.2
conmon 1859180 root mem REG 253,0 2089936 33566000 /usr/lib64/libc-2.28.so
conmon 1859180 root mem REG 253,0 99672 33591756 /usr/lib64/libgcc_s-8-20210514.so.1
conmon 1859180 root mem REG 253,0 1387696 33857508 /usr/lib64/libsystemd.so.0.23.0
conmon 1859180 root mem REG 253,0 1172024 33839626 /usr/lib64/libglib-2.0.so.0.5600.4
conmon 1859180 root mem REG 253,0 1062344 33565985 /usr/lib64/ld-2.28.so
conmon 1859180 root mem REG 253,0 26998 2987 /usr/lib64/gconv/gconv-modules.cache
conmon 1859180 root 0r CHR 1,3 0t0 1027 /dev/null
conmon 1859180 root 1w CHR 1,3 0t0 1027 /dev/null
conmon 1859180 root 2w CHR 1,3 0t0 1027 /dev/null
conmon 1859180 root 3u unix 0xffff912ba5390d80 0t0 65834737 type=STREAM
conmon 1859180 root 4r CHR 1,3 0t0 1027 /dev/null
conmon 1859180 root 5u a_inode 0,14 0 10538 [eventfd]
conmon 1859180 root 6w REG 253,0 5702 102307849 /var/log/pods/monitoring_prometheus-k8s-0_198479f2-b882-4adb-aacb-02e642699f88/prometheus/1.log (deleted)
conmon 1859180 root 7w CHR 1,3 0t0 1027 /dev/null
conmon 1859180 root 8r FIFO 0,13 0t0 65834742 pipe
conmon 1859180 root 10r FIFO 0,13 0t0 65834743 pipe
conmon 1859180 root 11r REG 0,32 0 218643 /sys/fs/cgroup/memory/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod198479f2_b882_4adb_aacb_02e642699f88.slice/crio-03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6.scope/container/memory.oom_control
conmon 1859180 root 12r FIFO 0,24 0t0 65834750 /run/containers/storage/overlay-containers/03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6/userdata/ctl
conmon 1859180 root 13u unix 0xffff912ba5395a00 0t0 65834745 /proc/self/fd/12/attach type=SEQPACKET
conmon 1859180 root 14w FIFO 0,24 0t0 65834750 /run/containers/storage/overlay-containers/03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6/userdata/ctl
conmon 1859180 root 15r FIFO 0,24 0t0 65834751 /run/containers/storage/overlay-containers/03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6/userdata/winsz
conmon 1859180 root 16w FIFO 0,24 0t0 65834751 /run/containers/storage/overlay-containers/03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6/userdata/winsz
conmon 1859180 root 17u a_inode 0,14 0 10538 [signalfd]
conmon 1859180 root 18u a_inode 0,14 0 10538 [eventfd]
conmon 1859180 root 26u DIR 0,25 360 12491 /sys/fs/cgroup
@Leo1003, thank you for the detailed report! I am sorry you are having issues.
When the Prometheus process gets "stuck", did you notice what state the process was in? Was it perhaps in the "D" (uninterruptible sleep) state? Could some underlying storage issue be causing the process to wait on slow I/O completion?
There should be some logs that conmon itself sends to syslog with the prefix "conmon:"; check /var/log/syslog or /var/log/messages, perhaps there is something of note there.
Also, is there anything of note in the "dmesg" output?
The underlying storage for Prometheus is NFS, which can lead to the "D" (uninterruptible sleep) state if there are network issues. In this event, however, the process was in the "Z" (zombie) state, and there were no kernel messages when it happened.
I think this is likely a kernel bug, but I don't know whether it is specific to the RHEL kernel or also present upstream.
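For completeness, this is how the state can be double-checked beyond the ps STAT column: the third field of /proc/<pid>/stat is "Z" for a zombie and "D" for uninterruptible sleep. A tiny sketch (illustrative only; the PID is the one from the report above):
/* Sketch: read the state field of /proc/<pid>/stat to tell "Z" (zombie)
 * from "D" (uninterruptible sleep). The PID below is the example from
 * this report; substitute the container's PID. */
#include <stdio.h>

int main(void)
{
    const char *path = "/proc/1859182/stat";  /* example PID from the report */
    FILE *f = fopen(path, "r");
    if (!f) {
        perror("fopen");
        return 1;
    }

    int pid;
    char comm[64];
    char state;
    /* Format: pid (comm) state ... -- this simple parse is enough for
     * typical process names and illustration purposes. */
    if (fscanf(f, "%d (%63[^)]) %c", &pid, comm, &state) == 3)
        printf("pid %d (%s) state=%c\n", pid, comm, state);
    fclose(f);
    return 0;
}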