
Podman Stats - CPU Percent displays Average CPU when requesting a single output (no stream)

Open jlemangarin opened this issue 1 year ago • 4 comments

Issue Description

Steps to reproduce the issue

  1. Request stats for a container without streaming (podman stats my-container --no-stream)
  2. Request the same stats in stream mode (podman stats my-container)

Describe the results you received

With the --no-stream option, or on the first display in stream mode, CPU% shows the average CPU %.

In stream mode, after the first refresh, CPU% shows the current real usage.

Example:

First display / no stream:

ID            NAME                         CPU %   MEM USAGE / LIMIT  MEM %  NET IO             BLOCK IO           PIDS  CPU TIME      AVG CPU %
6953b251bc9c  gofast-ng-jitsi-videobridge  0.58%   393.7MB / 8.051GB  4.89%  173.8MB / 128.4MB  49.68MB / 232.6MB  69    10m2.724575s  0.58%

In stream mode, after the first refresh:

ID            NAME                         CPU %   MEM USAGE / LIMIT  MEM %  NET IO             BLOCK IO           PIDS  CPU TIME      AVG CPU %
6953b251bc9c  gofast-ng-jitsi-videobridge  11.90%  393.8MB / 8.051GB  4.89%  176MB / 129.6MB    49.68MB / 232.6MB  69    10m3.321834s  0.59%

This makes our current monitoring, which relies on the --no-stream option, unreliable.

Describe the results you expected

With the --no-stream option (or on the first display), show the current CPU usage.

podman info output

host:
  arch: amd64
  buildahVersion: 1.37.6
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.12-1.el9.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: eb379dceb7efebd9a9d6b3349a57424d83483065'
  cpuUtilization:
    idlePercent: 97.96
    systemPercent: 0.93
    userPercent: 1.1
  cpus: 4
  databaseBackend: sqlite
  distribution:
    distribution: almalinux
    version: "9.5"
  eventLogger: journald
  freeLocks: 2025
  hostname: gofast-demo-comm.ceo-vision.com
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.14.0-503.31.1.el9_5.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 1008455680
  memTotal: 8050638848
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.2-1.el9_5.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.12.2
    package: netavark-1.12.2-1.el9.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.12.2
  ociRuntime:
    name: crun
    package: crun-1.16.1-1.el9.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.16.1
      commit: afa829ca0122bd5e1d67f1f38e6cc348027e3c32
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240806.gee36266-7.el9_5.x86_64
    version: |
      pasta 0^20240806.gee36266-7.el9_5.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: false
    path: /run/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.3.1-1.el9.x86_64
    version: |-
      slirp4netns version 1.3.1
      commit: e5e368c4f5db6ae75c2fce786e31eef9da6bf236
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.2
  swapFree: 8523522048
  swapTotal: 8589930496
  uptime: 149h 49m 16.00s (Approximately 6.21 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 14
    paused: 0
    running: 12
    stopped: 2
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 91200946176
  graphRootUsed: 18238193664
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 14
  runRoot: /run/containers/storage
  transientStore: false
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 5.2.2
  Built: 1738640782
  BuiltTime: Tue Feb  4 04:46:22 2025
  GitCommit: ""
  GoVersion: go1.22.9 (Red Hat 1.22.9-2.el9_5)
  Os: linux
  OsArch: linux/amd64
  Version: 5.2.2

Podman in a container

No

Privileged Or Rootless

None

Upstream Latest Release

No

Additional environment details

VM Running on Almalinux, latest podman package

Additional information

jlemangarin avatar Mar 27 '25 17:03 jlemangarin

That is the expected behavior; there is really no such thing as an average CPU usage (edit: I mean average CPU at a single point in time). The percentage is CPU time divided by real time. When you request a single stat, we do not have a long enough window to gather an accurate percentage. So we compare against the time from container start until now; every refresh after that uses the interval time, which makes it much more accurate.

The only option would be to make the --no-stream option block for 1s, which clearly seems undesirable and a waste of time.
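The two calculations described above can be sketched as follows. This is an illustration, not Podman's code: the function name `cpu_percent` is made up, the 5 s gap between the two sample lines in the report is an assumption, and the container uptime is a hypothetical value chosen only to show the effect.

```python
def cpu_percent(cpu_seconds: float, wall_seconds: float) -> float:
    """CPU time consumed in a window divided by the window's real duration."""
    if wall_seconds <= 0:
        raise ValueError("window must have a positive duration")
    return 100.0 * cpu_seconds / wall_seconds


# CPU TIME values from the two sample lines in the report.
first_cpu = 10 * 60 + 2.724575    # 10m2.724575s
second_cpu = 10 * 60 + 3.321834   # 10m3.321834s

# ASSUMPTION: the two lines were taken about 5 s apart.
interval_pct = cpu_percent(second_cpu - first_cpu, 5.0)

# ASSUMPTION: hypothetical container uptime in seconds, for illustration only.
uptime = 104_000.0
since_start_pct = cpu_percent(first_cpu, uptime)

print(f"since start: {since_start_pct:.2f}%  last interval: {interval_pct:.2f}%")
```

Under these assumptions the since-start figure stays well below 1% while the interval figure is close to 12%, which is the same kind of gap the report shows between the first display and later refreshes.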

Luap99 avatar Mar 27 '25 18:03 Luap99

@Luap99 Thank you for the explanations. Yes, I understand how this is calculated and the technical difficulties involved.

But we still show wrong data where a proper estimate of "current" CPU usage is expected. Would it be a good idea to:

  • Display -- on the first iteration of the stream
  • Not display cpu_percent with the --no-stream option, as this is not a calculable stat
  • Offer a blocking option (X seconds) to get the calculated cpu_percent in --no-stream mode (because I still have no way to include this stat, the only really useful one about CPU usage, in my monitoring)

Also, do you know how AVG CPU % is calculated if we have no earlier data when we start requesting the stats?

EDIT: One last thing came to mind after reading your explanations: does the calculation take the number of CPUs into account? If the calculation is cpu_time/real_time, it should be divided by the number of CPUs, right?

Thank you in advance !

jlemangarin avatar Mar 28 '25 09:03 jlemangarin

> Display -- on the first iteration of the stream
> Not display cpu_percent with the --no-stream option, as this is not a calculable stat

That could break whoever is depending on the existing data.

> Offer a blocking option (X seconds) to get the calculated cpu_percent in --no-stream mode (because I still have no way to include this stat, the only really useful one about CPU usage, in my monitoring)

That means the command takes significantly longer. We might need to block for 1s to get an accurate result.

> Also, do you know how AVG CPU % is calculated if we have no earlier data when we start requesting the stats?

podman stats takes its data from the cgroup that is configured for each container. The cgroup already tracks the total CPU time since the container was started, and we also know when the container was started, so we can calculate the percentage from that. AVG CPU is always that value; CPU is that value on the first tick, and after that it is the percentage between each tick.
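As a rough sketch of where that total comes from: on a cgroup v2 host (as reported in the podman info above), each cgroup exposes a `cpu.stat` file whose `usage_usec` line holds the accumulated CPU time in microseconds. The helper name `usage_seconds` and the sample text below are illustrative assumptions, not output from the reporter's host and not Podman's implementation.

```python
def usage_seconds(cpu_stat_text: str) -> float:
    """Return the total CPU time in seconds from cgroup v2 cpu.stat content."""
    for line in cpu_stat_text.splitlines():
        key, _, value = line.partition(" ")
        if key == "usage_usec":
            return int(value) / 1_000_000
    raise ValueError("usage_usec not found in cpu.stat content")


# ASSUMPTION: made-up cpu.stat content, for illustration only.
sample = "usage_usec 602724575\nuser_usec 400000000\nsystem_usec 202724575\n"

total_cpu = usage_seconds(sample)
print(f"total CPU time since the cgroup was created: {total_cpu:.6f}s")
```

Dividing this total by the time elapsed since the container started gives the since-start average; dividing the difference between two reads by the time between them gives an interval value.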

I do agree it is not the best behavior but I am not sure we can just change it either.

> EDIT: One last thing came to mind after reading your explanations: does the calculation take the number of CPUs into account? If the calculation is cpu_time/real_time, it should be divided by the number of CPUs, right?

No, the number of cores doesn't matter here; the percentage is based on a single core. If the container fully uses two cores, you get 200%. That is how CPU percentages are commonly reported on Linux.
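For monitoring that prefers a share of the whole host, the single-core percentage can be rescaled on the consumer side. A minimal sketch, assuming the caller knows the CPU count (4 on the reporter's host per the podman info above); `share_of_host` is a hypothetical helper, and Podman itself reports the unscaled value.

```python
def share_of_host(single_core_percent: float, cpu_count: int) -> float:
    """Rescale a per-core percentage (100% = one full core) to a share of all CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return single_core_percent / cpu_count


# Two fully busy cores on a 4-CPU host: 200% per-core scale, half of the host.
print(share_of_host(200.0, 4))  # 50.0
```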

Luap99 avatar Mar 28 '25 10:03 Luap99

> > Display -- on the first iteration of the stream
> > Not display cpu_percent with the --no-stream option, as this is not a calculable stat
>
> That could break whoever is depending on the existing data.

Relying on cpu_percent the way I did would be a misunderstanding, because the data are:

[PROD] [root@gofast-comm ceosupport]# podman stats gofast-ng-jitsi-videobridge --no-stream --format json
[
 {
  "id": "2f09d8af4f69",
  "name": "gofast-ng-jitsi-videobridge",
  "cpu_time": "15.461081s",
  "cpu_percent": "0.63%",
  "avg_cpu": "0.63%",
  "mem_usage": "365.5MB / 16.49GB",
  "mem_percent": "2.22%",
  "net_io": "2.239MB / 2.055MB",
  "block_io": "1.286MB / 9.966MB",
  "pids": "57"
 }
]

But yes, I agree it would be better not to break existing usage.

> > Offer a blocking option (X seconds) to get the calculated cpu_percent in --no-stream mode (because I still have no way to include this stat, the only really useful one about CPU usage, in my monitoring)
>
> That means the command takes significantly longer. We might need to block for 1s to get an accurate result.

Yes, that's true, but it would be acceptable behind a parameter if this is the only way to get usable cpu_percent data for a monitoring system. Can you think of another way to get this information programmatically?
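One possible client-side workaround, sketched under assumptions rather than confirmed by the maintainers: take two --no-stream samples, read the cumulative `cpu_time` field shown in the JSON above, and divide the difference by the elapsed time. The sketch assumes `cpu_time` is always formatted like the values seen in this thread (for example "15.461081s" or "10m2.724575s"); all function names are hypothetical.

```python
import json
import re
import subprocess
import time

_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0,
                 "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9}
# "ms" is listed before "m" so that "500ms" is not read as "500m" + "s".
_PART = re.compile(r"(\d+(?:\.\d+)?)(h|ms|us|µs|ns|m|s)")


def parse_duration(text: str) -> float:
    """Convert a duration string such as '10m2.724575s' to seconds."""
    parts = _PART.findall(text)
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(num) * _UNIT_SECONDS[unit] for num, unit in parts)


def read_cpu_seconds(container: str) -> float:
    """Read cumulative CPU time via the command shown earlier in this thread."""
    out = subprocess.run(
        ["podman", "stats", container, "--no-stream", "--format", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_duration(json.loads(out)[0]["cpu_time"])


def sampled_cpu_percent(container, window=1.0, read=read_cpu_seconds,
                        clock=time.monotonic, sleep=time.sleep):
    """Per-core CPU percent over `window` seconds, from two cumulative samples."""
    t0, c0 = clock(), read(container)
    sleep(window)
    t1, c1 = clock(), read(container)
    if t1 <= t0:
        raise ValueError("clock did not advance between samples")
    return 100.0 * (c1 - c0) / (t1 - t0)
```

A monitoring job could then call `sampled_cpu_percent("gofast-ng-jitsi-videobridge", window=1.0)`. The `read`, `clock`, and `sleep` parameters exist only so the arithmetic can be exercised without a running container; the trade-off is the same one raised above, since the call blocks for the length of the window.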

jlemangarin avatar Mar 28 '25 19:03 jlemangarin

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] avatar Apr 28 '25 00:04 github-actions[bot]