conmon leaves main container process in zombie state
I observed some strange behavior. We run Prometheus on our Kubernetes cluster, and it usually gets stuck when Kubernetes restarts the container.
CRI-O logs:
Mar 27 22:35:38 k8s-sys-w1 crio[1364]: time="2024-03-27 22:35:38.600423205+08:00" level=info msg="Stopping container: 03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6 (timeout: 600s)" id=c60a252b-7eb3-4337-96aa-398fb115db16 name=/runtime.v1.RuntimeService/StopContainer
Mar 27 22:45:38 k8s-sys-w1 crio[1364]: time="2024-03-27 22:45:38.615546831+08:00" level=warning msg="Stopping container 03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6 with stop signal timed out: timeout reached after 600 seconds waiting for container process to exit" id=c60a252b-7eb3-4337-96aa-398fb115db16 name=/runtime.v1.RuntimeService/StopContainer
Mar 27 22:47:38 k8s-sys-w1 crio[1364]: time="2024-03-27 22:47:38.945321438+08:00" level=info msg="Stopping container: 03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6 (timeout: 600s)" id=eb4ce11f-99d2-40d4-8048-68a9f918a943 name=/runtime.v1.RuntimeService/StopContainer
Mar 27 22:57:38 k8s-sys-w1 crio[1364]: time="2024-03-27 22:57:38.959805291+08:00" level=warning msg="Stopping container 03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6 with stop signal timed out: timeout reached after 600 seconds waiting for container process to exit" id=eb4ce11f-99d2-40d4-8048-68a9f918a943 name=/runtime.v1.RuntimeService/StopContainer
The prometheus process should be PID 1 in the container's PID namespace, so after it dies the whole namespace should be torn down by the kernel.
However, conmon leaves the prometheus process in the zombie state, and the container gets stuck.
$ sudo pstree -plTS 1859180
conmon(1859180)───prometheus(1859182,pid)
$ ps ufS 1859180 1859182
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1859180 0.0 0.0 70152 2284 ? Ss Mar27 0:00 /usr/bin/conmon <omitted...>
ansible 1859182 0.0 0.0 0 0 ? Zsl Mar27 0:00 \_ [prometheus] <defunct>
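To make the expectation above concrete, here is a minimal C sketch of the PID-namespace semantics I am referring to (illustrative only, not conmon's code; it needs root because CLONE_NEWPID requires CAP_SYS_ADMIN): the child becomes PID 1 of a new PID namespace, the kernel tears the namespace down when that PID 1 exits, and the parent in the outer namespace still has to reap it with waitpid(), otherwise it stays a zombie like the one above.
/* Minimal sketch (not conmon's code) of the PID-namespace behavior described above.
 * Needs root: CLONE_NEWPID requires CAP_SYS_ADMIN. The child becomes PID 1 of a
 * new PID namespace; when it exits the kernel tears the namespace down, but the
 * parent must still reap it with waitpid() or the child stays a zombie. */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

static char child_stack[1024 * 1024];

static int child_main(void *arg)
{
    (void)arg;
    printf("inside the new PID namespace, my pid = %d\n", getpid()); /* prints 1 */
    sleep(1);
    return 0; /* PID 1 exits -> kernel kills everything left in the namespace */
}

int main(void)
{
    pid_t pid = clone(child_main, child_stack + sizeof(child_stack),
                      CLONE_NEWPID | SIGCHLD, NULL);
    if (pid < 0) {
        perror("clone");
        return 1;
    }

    int status;
    /* Until this waitpid() runs, the dead child sits in the "Z" state,
     * which is exactly where the prometheus process above is stuck. */
    if (waitpid(pid, &status, 0) == pid)
        printf("reaped pid %d, status 0x%x\n", pid, status);
    return 0;
}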
I used strace to see what conmon was doing, while sending some SIGCHLD signals from another terminal.
$ sudo strace -p 1859180
strace: Process 1859180 attached
restart_syscall(<... resuming interrupted restart_syscall ...>) = 1
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
read(17, "\21\0\0\0\0\0\0\0\0\0\0\0\\\352(\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128) = 128
wait4(-1, 0x7ffec7059b70, WNOHANG, NULL) = 0
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
poll([{fd=5, events=POLLIN}, {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}], 7, -1) = 1 ([{fd=5, revents=POLLIN}])
read(5, "\2\0\0\0\0\0\0\0", 16) = 8
poll([{fd=5, events=POLLIN}, {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}], 7, -1) = 1 ([{fd=17, revents=POLLIN}])
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
read(17, "\21\0\0\0\0\0\0\0\0\0\0\0>\3)\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128) = 128
wait4(-1, 0x7ffec7059b70, WNOHANG, NULL) = 0
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
poll([{fd=5, events=POLLIN}, {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}], 7, -1) = 1 ([{fd=5, revents=POLLIN}])
read(5, "\2\0\0\0\0\0\0\0", 16) = 8
poll([{fd=5, events=POLLIN}, {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}], 7, -1
Although conmon did call wait4(), the kernel returned 0 (meaning no child was available to be reaped).
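For reference, below is a minimal sketch (illustrative only, not conmon's actual source) of the signalfd + wait4(WNOHANG) reaping pattern visible in the trace: SIGCHLD is blocked and delivered through a signalfd, and each notification triggers a non-blocking wait4(). If that wait4() keeps returning 0 even though the child is already a zombie, as it does in my trace, the zombie is never collected until another SIGCHLD notification arrives.
/* Sketch of the reaping pattern seen in the strace above (not conmon's code):
 * SIGCHLD arrives through a signalfd (the 128-byte read(17, ...) lines) and
 * each notification is followed by a non-blocking wait4(-1, ..., WNOHANG). */
#define _GNU_SOURCE
#include <poll.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/signalfd.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &mask, NULL);       /* deliver SIGCHLD only via signalfd */

    int sfd = signalfd(-1, &mask, SFD_CLOEXEC);

    if (fork() == 0) {                         /* stand-in for the container process */
        sleep(1);
        _exit(0);
    }

    for (;;) {
        struct pollfd pfd = { .fd = sfd, .events = POLLIN };
        poll(&pfd, 1, -1);

        struct signalfd_siginfo si;            /* 128 bytes, matching the trace */
        if (read(sfd, &si, sizeof(si)) != sizeof(si))
            continue;

        int status;
        pid_t pid = wait4(-1, &status, WNOHANG, NULL);
        /* In my trace this call kept returning 0 even though the child was
         * already a zombie, so nothing was reaped until a later SIGCHLD. */
        if (pid > 0) {
            printf("reaped %d, status 0x%x\n", pid, status);
            break;
        }
    }
    return 0;
}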
Some system information:
$ crun --version
crun version 1.9.2
commit: 35274d346d2e9ffeacb22cc11590b0266a23d634
rundir: /run/user/17247/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
$ conmon --version
conmon version 2.1.8
commit: 97585789fa36b1cfaf71f2d23b8add1a41f95f50
$ crictl -v
crictl version v1.26.0
$ uname -a
Linux k8s-sys-w1 4.18.0-526.el8.x86_64 #1 SMP Sat Nov 18 00:54:11 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Can you show the pod spec that encounters this? I also wonder if you could try https://github.com/cri-o/cri-o/pull/7910, which I think will at least stop cri-o from spitting out that error.
The pod spec of the Prometheus instance:
apiVersion: v1
kind: Pod
metadata:
annotations:
kubectl.kubernetes.io/default-container: prometheus
generateName: prometheus-k8s-
labels:
app.kubernetes.io/component: prometheus
app.kubernetes.io/instance: k8s
app.kubernetes.io/managed-by: prometheus-operator
app.kubernetes.io/name: prometheus
app.kubernetes.io/part-of: kube-prometheus
app.kubernetes.io/version: 2.46.0
controller-revision-hash: prometheus-k8s-6d767649c4
operator.prometheus.io/name: k8s
operator.prometheus.io/shard: "0"
prometheus: k8s
statefulset.kubernetes.io/pod-name: prometheus-k8s-0
name: prometheus-k8s-0
namespace: monitoring
spec:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- podAffinityTerm:
labelSelector:
matchLabels:
app.kubernetes.io/component: prometheus
app.kubernetes.io/instance: k8s
app.kubernetes.io/name: prometheus
app.kubernetes.io/part-of: kube-prometheus
namespaces:
- monitoring
topologyKey: kubernetes.io/hostname
weight: 100
automountServiceAccountToken: true
containers:
- args:
- --web.console.templates=/etc/prometheus/consoles
- --web.console.libraries=/etc/prometheus/console_libraries
- --config.file=/etc/prometheus/config_out/prometheus.env.yaml
- --web.enable-lifecycle
- --web.route-prefix=/
- --storage.tsdb.retention.time=30d
- --storage.tsdb.retention.size=100GiB
- --storage.tsdb.path=/prometheus
- --storage.tsdb.wal-compression
- --web.config.file=/etc/prometheus/web_config/web-config.yaml
- --storage.tsdb.max-block-duration=2h
- --storage.tsdb.min-block-duration=2h
image: quay.io/prometheus/prometheus:v2.46.0
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 6
httpGet:
path: /-/healthy
port: web
scheme: HTTP
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 3
name: prometheus
ports:
- containerPort: 9090
name: web
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /-/ready
port: web
scheme: HTTP
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 3
resources:
requests:
memory: 400Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
startupProbe:
failureThreshold: 60
httpGet:
path: /-/ready
port: web
scheme: HTTP
periodSeconds: 15
successThreshold: 1
timeoutSeconds: 3
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /etc/prometheus/config_out
name: config-out
readOnly: true
- mountPath: /etc/prometheus/certs
name: tls-assets
readOnly: true
- mountPath: /prometheus
name: prometheus-k8s-db
subPath: prometheus-db
- mountPath: /etc/prometheus/rules/prometheus-k8s-rulefiles-0
name: prometheus-k8s-rulefiles-0
- mountPath: /etc/prometheus/web_config/web-config.yaml
name: web-config
readOnly: true
subPath: web-config.yaml
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-8fjp4
readOnly: true
- args:
- --listen-address=:8080
- --reload-url=http://localhost:9090/-/reload
- --config-file=/etc/prometheus/config/prometheus.yaml.gz
- --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
- --watched-dir=/etc/prometheus/rules/prometheus-k8s-rulefiles-0
command:
- /bin/prometheus-config-reloader
env:
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: SHARD
value: "0"
image: quay.io/prometheus-operator/prometheus-config-reloader:v0.67.1
imagePullPolicy: IfNotPresent
name: config-reloader
ports:
- containerPort: 8080
name: reloader-web
protocol: TCP
resources:
limits:
cpu: 10m
memory: 50Mi
requests:
cpu: 10m
memory: 50Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /etc/prometheus/config
name: config
- mountPath: /etc/prometheus/config_out
name: config-out
- mountPath: /etc/prometheus/rules/prometheus-k8s-rulefiles-0
name: prometheus-k8s-rulefiles-0
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-8fjp4
readOnly: true
- args:
- sidecar
- --prometheus.url=http://localhost:9090/
- '--prometheus.http-client={"tls_config": {"insecure_skip_verify":true}}'
- --grpc-address=:10901
- --http-address=:10902
- --objstore.config=$(OBJSTORE_CONFIG)
- --tsdb.path=/prometheus
env:
- name: OBJSTORE_CONFIG
valueFrom:
secretKeyRef:
key: bucket.yaml
name: thanos-sidecar-objectstorage
image: quay.io/thanos/thanos:v0.31.0
imagePullPolicy: IfNotPresent
name: thanos-sidecar
ports:
- containerPort: 10902
name: http
protocol: TCP
- containerPort: 10901
name: grpc
protocol: TCP
resources: {}
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /prometheus
name: prometheus-k8s-db
subPath: prometheus-db
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-8fjp4
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
hostname: prometheus-k8s-0
initContainers:
- args:
- --watch-interval=0
- --listen-address=:8080
- --config-file=/etc/prometheus/config/prometheus.yaml.gz
- --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
- --watched-dir=/etc/prometheus/rules/prometheus-k8s-rulefiles-0
command:
- /bin/prometheus-config-reloader
env:
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: SHARD
value: "0"
image: quay.io/prometheus-operator/prometheus-config-reloader:v0.67.1
imagePullPolicy: IfNotPresent
name: init-config-reloader
ports:
- containerPort: 8080
name: reloader-web
protocol: TCP
resources:
limits:
cpu: 10m
memory: 50Mi
requests:
cpu: 10m
memory: 50Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /etc/prometheus/config
name: config
- mountPath: /etc/prometheus/config_out
name: config-out
- mountPath: /etc/prometheus/rules/prometheus-k8s-rulefiles-0
name: prometheus-k8s-rulefiles-0
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-8fjp4
readOnly: true
nodeSelector:
kubernetes.io/os: linux
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 2000
runAsNonRoot: true
runAsUser: 1000
serviceAccount: prometheus-k8s
serviceAccountName: prometheus-k8s
subdomain: prometheus-operated
terminationGracePeriodSeconds: 600
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- name: prometheus-k8s-db
persistentVolumeClaim:
claimName: prometheus-k8s-db-prometheus-k8s-0
- name: config
secret:
defaultMode: 420
secretName: prometheus-k8s
- name: tls-assets
projected:
defaultMode: 420
sources:
- secret:
name: prometheus-k8s-tls-assets-0
- emptyDir:
medium: Memory
name: config-out
- configMap:
defaultMode: 420
name: prometheus-k8s-rulefiles-0
name: prometheus-k8s-rulefiles-0
- name: web-config
secret:
defaultMode: 420
secretName: prometheus-k8s-web-config
- name: kube-api-access-8fjp4
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
The Prometheus container gets restarted when it fails the liveness probe (this is caused by performance issues on our side). Sometimes it gets stuck in the strange state described above, which leaves the pod unready and triggers our alerts. This had happened a few times before I reported this issue.
I found that the stuck prometheus process was finally reaped by conmon last night.
time="2024-03-29 17:32:41.616478684+08:00" level=info msg="Removed container 03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6: monitoring/prometheus-k8s-0/prometheus" id=648d7266-9fdd-4961-8f04-71e47ebd791f name=/runtime.v1.RuntimeService/RemoveContainer
And the strace I left running in tmux recorded what conmon did:
[continued from the strace above...] ) = 2 ([{fd=8, revents=POLLHUP}, {fd=10, revents=POLLHUP}])
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
close(8) = 0
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
close(10) = 0
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
poll([{fd=5, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}], 5, -1) = 1 ([{fd=5, revents=POLLIN}])
read(5, "\6\0\0\0\0\0\0\0", 16) = 8
poll([{fd=5, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}], 5, -1) = 1 ([{fd=17, revents=POLLIN}])
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
read(17, "\21\0\0\0\0\0\0\0\2\0\0\0n^\34\0\350\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128) = 128
wait4(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL}], WNOHANG, NULL) = 1859182
write(2, "[conmon:i]: container 1859182 ex"..., 53) = 53
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 8
connect(8, {sa_family=AF_UNIX, sun_path="/dev/log"}, 110) = 0
sendto(8, "<14>Mar 29 18:16:48 conmon: conm"..., 107, MSG_NOSIGNAL, NULL, 0) = 107
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
futex(0x55c164181bf0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
wait4(-1, 0x7ffec7059b70, WNOHANG, NULL) = -1 ECHILD (No child processes)
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
futex(0x55c164181bf0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
fsync(6) = 0
write(5, "\1\0\0\0\0\0\0\0", 8) = 8
close(17) = 0
close(4) = 0
close(26) = 0
openat(AT_FDCWD, "/var/lib/containers/storage/overlay-containers/03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6/userdata/exit.IUZIL2", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 ENOENT (No such file or directory)
futex(0x7ff5ce103f78, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0x7ff5ce103f78, FUTEX_WAKE_PRIVATE, 2147483647) = 0
write(2, "[conmon:e]: Failed to write 137 "..., 244) = 244
sendto(8, "<11>Mar 29 18:16:48 conmon: conm"..., 298, MSG_NOSIGNAL, NULL, 0) = 298
unlink("/var/run/crio/03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6") = 0
wait4(-1, NULL, WNOHANG, NULL) = -1 ECHILD (No child processes)
exit_group(1) = ?
+++ exited with 1 +++
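For context on the 137 in the "Failed to write 137 ..." line: the wait4() above returned a status with WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL, and the conventional container exit code for a signal-terminated process is 128 + signal, i.e. 128 + 9 = 137. A small illustrative sketch (my own code, not conmon's):
/* Sketch (illustrative only): where the 137 comes from. A child killed by
 * SIGKILL yields WIFSIGNALED(status) with WTERMSIG(status) == SIGKILL, and
 * the conventional exit code is 128 + signal = 128 + 9 = 137. */
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        pause();                     /* child waits to be killed */
        _exit(0);
    }

    kill(pid, SIGKILL);

    int status;
    waitpid(pid, &status, 0);        /* analogous to the wait4() that finally returned a pid */

    if (WIFSIGNALED(status))
        printf("exit code: %d\n", 128 + WTERMSIG(status));   /* prints 137 */
    return 0;
}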
P.S. I forgot to provide the lsof output for the conmon process earlier.
$ sudo lsof -p 1859180
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
conmon 1859180 root cwd DIR 0,24 200 65836199 /run/containers/storage/overlay-containers/03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6/userdata
conmon 1859180 root rtd DIR 253,0 244 128 /
conmon 1859180 root txt REG 253,0 160080 521925 /usr/bin/conmon
conmon 1859180 root mem REG 253,0 543304 33565316 /usr/lib64/libpcre2-8.so.0.7.1
conmon 1859180 root mem REG 253,0 37240 33572838 /usr/lib64/libffi.so.6.0.2
conmon 1859180 root mem REG 253,0 145984 33571896 /usr/lib64/libgpg-error.so.0.24.2
conmon 1859180 root mem REG 253,0 172104 33565321 /usr/lib64/libselinux.so.1
conmon 1859180 root mem REG 253,0 33480 33572814 /usr/lib64/libuuid.so.1.3.0
conmon 1859180 root mem REG 253,0 343624 33857491 /usr/lib64/libblkid.so.1.1.0
conmon 1859180 root mem REG 253,0 1541856 33572802 /usr/lib64/libgmp.so.10.3.2
conmon 1859180 root mem REG 253,0 197728 33836053 /usr/lib64/libhogweed.so.4.5
conmon 1859180 root mem REG 253,0 239360 33836055 /usr/lib64/libnettle.so.6.5
conmon 1859180 root mem REG 253,0 78816 33835982 /usr/lib64/libtasn1.so.6.5.5
conmon 1859180 root mem REG 253,0 19128 34072407 /usr/lib64/libdl-2.28.so
conmon 1859180 root mem REG 253,0 1805368 33572015 /usr/lib64/libunistring.so.2.1.0
conmon 1859180 root mem REG 253,0 165624 33572822 /usr/lib64/libidn2.so.0.3.6
conmon 1859180 root mem REG 253,0 1246520 33572842 /usr/lib64/libp11-kit.so.0.3.0
conmon 1859180 root mem REG 253,0 1187312 33572817 /usr/lib64/libgcrypt.so.20.2.5
conmon 1859180 root mem REG 253,0 371384 33571984 /usr/lib64/libmount.so.1.1.0
conmon 1859180 root mem REG 253,0 33752 33571882 /usr/lib64/libcap.so.2.48
conmon 1859180 root mem REG 253,0 119760 33835813 /usr/lib64/liblz4.so.1.8.3
conmon 1859180 root mem REG 253,0 162192 33571872 /usr/lib64/liblzma.so.5.2.4
conmon 1859180 root mem REG 253,0 42744 34072419 /usr/lib64/librt-2.28.so
conmon 1859180 root mem REG 253,0 149984 34072415 /usr/lib64/libpthread-2.28.so
conmon 1859180 root mem REG 253,0 464936 33835930 /usr/lib64/libpcre.so.1.2.10
conmon 1859180 root mem REG 253,0 2051656 33839644 /usr/lib64/libgnutls.so.30.28.2
conmon 1859180 root mem REG 253,0 2089936 33566000 /usr/lib64/libc-2.28.so
conmon 1859180 root mem REG 253,0 99672 33591756 /usr/lib64/libgcc_s-8-20210514.so.1
conmon 1859180 root mem REG 253,0 1387696 33857508 /usr/lib64/libsystemd.so.0.23.0
conmon 1859180 root mem REG 253,0 1172024 33839626 /usr/lib64/libglib-2.0.so.0.5600.4
conmon 1859180 root mem REG 253,0 1062344 33565985 /usr/lib64/ld-2.28.so
conmon 1859180 root mem REG 253,0 26998 2987 /usr/lib64/gconv/gconv-modules.cache
conmon 1859180 root 0r CHR 1,3 0t0 1027 /dev/null
conmon 1859180 root 1w CHR 1,3 0t0 1027 /dev/null
conmon 1859180 root 2w CHR 1,3 0t0 1027 /dev/null
conmon 1859180 root 3u unix 0xffff912ba5390d80 0t0 65834737 type=STREAM
conmon 1859180 root 4r CHR 1,3 0t0 1027 /dev/null
conmon 1859180 root 5u a_inode 0,14 0 10538 [eventfd]
conmon 1859180 root 6w REG 253,0 5702 102307849 /var/log/pods/monitoring_prometheus-k8s-0_198479f2-b882-4adb-aacb-02e642699f88/prometheus/1.log (deleted)
conmon 1859180 root 7w CHR 1,3 0t0 1027 /dev/null
conmon 1859180 root 8r FIFO 0,13 0t0 65834742 pipe
conmon 1859180 root 10r FIFO 0,13 0t0 65834743 pipe
conmon 1859180 root 11r REG 0,32 0 218643 /sys/fs/cgroup/memory/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod198479f2_b882_4adb_aacb_02e642699f88.slice/crio-03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6.scope/container/memory.oom_control
conmon 1859180 root 12r FIFO 0,24 0t0 65834750 /run/containers/storage/overlay-containers/03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6/userdata/ctl
conmon 1859180 root 13u unix 0xffff912ba5395a00 0t0 65834745 /proc/self/fd/12/attach type=SEQPACKET
conmon 1859180 root 14w FIFO 0,24 0t0 65834750 /run/containers/storage/overlay-containers/03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6/userdata/ctl
conmon 1859180 root 15r FIFO 0,24 0t0 65834751 /run/containers/storage/overlay-containers/03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6/userdata/winsz
conmon 1859180 root 16w FIFO 0,24 0t0 65834751 /run/containers/storage/overlay-containers/03d1a6981d1eaa40c200f4555cd41f66e2567fc7bd7573dc979af67e22ce3bc6/userdata/winsz
conmon 1859180 root 17u a_inode 0,14 0 10538 [signalfd]
conmon 1859180 root 18u a_inode 0,14 0 10538 [eventfd]
conmon 1859180 root 26u DIR 0,25 360 12491 /sys/fs/cgroup
@Leo1003, thank you for the detailed report! I am sorry you are having issues.
When the Prometheus process gets "stuck", did you notice what state the process was in? Was it perhaps in the "D" (uninterruptible sleep) state? Could some underlying storage issue be causing the process to wait on slow I/O completion?
There should be some logs that conmon itself sends to syslog with the prefix "conmon:"; check /var/log/syslog or /var/log/messages, perhaps there is something of note there.
Also, is there anything of note in the "dmesg" output?
The underlying storage for Prometheus is NFS, which can lead to the "D" (uninterruptible sleep) state if there are network issues. In this event, however, the process was in the "Z" (zombie) state, and there were no kernel messages when it happened.
I think this is likely a kernel bug, but I don't know whether it is specific to the RHEL kernel or also present upstream.
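For completeness, this is how the state can be double-checked beyond the ps STAT column: the third field of /proc/<pid>/stat is "Z" for a zombie and "D" for uninterruptible sleep. A tiny sketch (illustrative only; the PID is the one from the report above):
/* Sketch: read the state field of /proc/<pid>/stat to tell "Z" (zombie)
 * from "D" (uninterruptible sleep). The PID below is the example from
 * this report; substitute the container's PID. */
#include <stdio.h>

int main(void)
{
    const char *path = "/proc/1859182/stat";  /* example PID from the report */
    FILE *f = fopen(path, "r");
    if (!f) {
        perror("fopen");
        return 1;
    }

    int pid;
    char comm[64];
    char state;
    /* Format: pid (comm) state ... -- this simple parse is enough for
     * typical process names and illustration purposes. */
    if (fscanf(f, "%d (%63[^)]) %c", &pid, comm, &state) == 3)
        printf("pid %d (%s) state=%c\n", pid, comm, state);
    fclose(f);
    return 0;
}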