Podman Stats - CPU Percent displays Average CPU when requesting a single output (no stream)
Issue Description
Describe your issue
Steps to reproduce the issue
- Ask for stats without stream about a container (podman stats my-container --no-stream)
- Ask for the same stats running the stream (podman stats my-container)
Describe the results you received
With the --no-stream option, or on the first display in stream mode, CPU% shows the AVG CPU % value.
In stream mode, after the first refresh, CPU% shows the actual current usage.
Example:
First display / no stream:
6953b251bc9c gofast-ng-jitsi-videobridge 0.58% 393.7MB / 8.051GB 4.89% 173.8MB / 128.4MB 49.68MB / 232.6MB 69 10m2.724575s 0.58%
In stream mode, after the first refresh:
6953b251bc9c gofast-ng-jitsi-videobridge 11.90% 393.8MB / 8.051GB 4.89% 176MB / 129.6MB 49.68MB / 232.6MB 69 10m3.321834s 0.59%
This makes our current monitoring, which relies on the --no-stream option, unreliable.
Describe the results you expected
With the --no-stream option (or on the first display), show the current CPU usage.
podman info output
host:
arch: amd64
buildahVersion: 1.37.6
cgroupControllers:
- cpuset
- cpu
- io
- memory
- hugetlb
- pids
- rdma
- misc
cgroupManager: systemd
cgroupVersion: v2
conmon:
package: conmon-2.1.12-1.el9.x86_64
path: /usr/bin/conmon
version: 'conmon version 2.1.12, commit: eb379dceb7efebd9a9d6b3349a57424d83483065'
cpuUtilization:
idlePercent: 97.96
systemPercent: 0.93
userPercent: 1.1
cpus: 4
databaseBackend: sqlite
distribution:
distribution: almalinux
version: "9.5"
eventLogger: journald
freeLocks: 2025
hostname: gofast-demo-comm.ceo-vision.com
idMappings:
gidmap: null
uidmap: null
kernel: 5.14.0-503.31.1.el9_5.x86_64
linkmode: dynamic
logDriver: journald
memFree: 1008455680
memTotal: 8050638848
networkBackend: netavark
networkBackendInfo:
backend: netavark
dns:
package: aardvark-dns-1.12.2-1.el9_5.x86_64
path: /usr/libexec/podman/aardvark-dns
version: aardvark-dns 1.12.2
package: netavark-1.12.2-1.el9.x86_64
path: /usr/libexec/podman/netavark
version: netavark 1.12.2
ociRuntime:
name: crun
package: crun-1.16.1-1.el9.x86_64
path: /usr/bin/crun
version: |-
crun version 1.16.1
commit: afa829ca0122bd5e1d67f1f38e6cc348027e3c32
rundir: /run/user/1000/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
os: linux
pasta:
executable: /usr/bin/pasta
package: passt-0^20240806.gee36266-7.el9_5.x86_64
version: |
pasta 0^20240806.gee36266-7.el9_5.x86_64
Copyright Red Hat
GNU General Public License, version 2 or later
<https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
remoteSocket:
exists: false
path: /run/podman/podman.sock
rootlessNetworkCmd: pasta
security:
apparmorEnabled: false
capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
rootless: false
seccompEnabled: true
seccompProfilePath: /usr/share/containers/seccomp.json
selinuxEnabled: true
serviceIsRemote: false
slirp4netns:
executable: /usr/bin/slirp4netns
package: slirp4netns-1.3.1-1.el9.x86_64
version: |-
slirp4netns version 1.3.1
commit: e5e368c4f5db6ae75c2fce786e31eef9da6bf236
libslirp: 4.4.0
SLIRP_CONFIG_VERSION_MAX: 3
libseccomp: 2.5.2
swapFree: 8523522048
swapTotal: 8589930496
uptime: 149h 49m 16.00s (Approximately 6.21 days)
variant: ""
plugins:
authorization: null
log:
- k8s-file
- none
- passthrough
- journald
network:
- bridge
- macvlan
- ipvlan
volume:
- local
registries:
search:
- registry.access.redhat.com
- registry.redhat.io
- docker.io
store:
configFile: /etc/containers/storage.conf
containerStore:
number: 14
paused: 0
running: 12
stopped: 2
graphDriverName: overlay
graphOptions:
overlay.mountopt: nodev,metacopy=on
graphRoot: /var/lib/containers/storage
graphRootAllocated: 91200946176
graphRootUsed: 18238193664
graphStatus:
Backing Filesystem: xfs
Native Overlay Diff: "false"
Supports d_type: "true"
Supports shifting: "false"
Supports volatile: "true"
Using metacopy: "true"
imageCopyTmpDir: /var/tmp
imageStore:
number: 14
runRoot: /run/containers/storage
transientStore: false
volumePath: /var/lib/containers/storage/volumes
version:
APIVersion: 5.2.2
Built: 1738640782
BuiltTime: Tue Feb 4 04:46:22 2025
GitCommit: ""
GoVersion: go1.22.9 (Red Hat 1.22.9-2.el9_5)
Os: linux
OsArch: linux/amd64
Version: 5.2.2
Podman in a container
No
Privileged Or Rootless
None
Upstream Latest Release
No
Additional environment details
VM Running on Almalinux, latest podman package
Additional information
That is the expected behavior; there is really no such thing as an average CPU usage (edit: I mean average CPU at a single point in time).
The percentage is computed as CPU time / real time. When you request a single stat, the sampling window is too short to gather an accurate percentage. So instead we compare against the time from container start until now; every later refresh uses the interval time, which makes it much more accurate.
The only option would be to make the --no-stream option block for 1s, which clearly seems undesirable and a waste of time.
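To make the two calculations concrete, here is a minimal sketch that applies "CPU time / real time" to the CPU time values from the sample rows in this issue. The function name is illustrative, not podman's. Two values are assumptions: the 5-second gap between the two rows, and the container uptime, which is a hypothetical number picked so that the first row's 0.58% comes out.

```python
def cpu_percent(cpu_seconds: float, wall_seconds: float) -> float:
    """CPU time divided by elapsed real time, as a percentage of one core."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return 100.0 * cpu_seconds / wall_seconds


# CPU time values copied from the two sample rows in the issue.
first_cpu = 10 * 60 + 2.724575   # 10m2.724575s
second_cpu = 10 * 60 + 3.321834  # 10m3.321834s

# Assumption: the two rows were taken 5 seconds apart.
interval = 5.0
between_ticks = cpu_percent(second_cpu - first_cpu, interval)

# Assumption: hypothetical uptime (about 28.9 hours), not taken from the issue.
uptime = 103_900.0
since_start = cpu_percent(first_cpu, uptime)

print(f"since container start: {since_start:.2f}%")
print(f"between two ticks:     {between_ticks:.2f}%")
```

Under these assumptions the first figure lands on the reported 0.58% and the second lands close to the reported 11.90%, which shows why the first display looks like AVG CPU %: about 10 minutes of CPU time are spread over the whole lifetime of the container.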
@Luap99 Thank you for the explanations. Yes, I understand how this is calculated and the technical difficulties involved.
But we still get wrong data where there should be a proper estimate of "current" CPU usage. Would it be a good idea to:
- Display -- at the first iteration of the stream
- Not display cpu_percent with the --no-stream option, as this is not a calculable stat
- Have a blocking X-second option to get the calculated cpu_percent in --no-stream mode (because I still have no way to include this very useful, and only, CPU usage stat in my monitoring)
Also, do you know how AVG CPU % is calculated if we don't have data from before we start requesting the stats?
EDIT: One last thing came to mind after reading your explanations: does the calculation take the number of CPUs into account? If the calculation is cpu_time/real_time, it should be divided by the number of CPUs, right?
Thank you in advance !
- Display -- at the first iteration of the stream
- Not display cpu_percent with the --no-stream option, as this is not a calculable stat
That could break whoever is depending on the existing data.
Have a blocking X-second option to get the calculated cpu_percent in --no-stream mode (because I still have no way to include this very useful, and only, CPU usage stat in my monitoring)
That means the command takes significantly longer. We might need to block for 1s to get an accurate result.
Also, do you know how AVG CPU % is calculated if we don't have data from before we start requesting the stats?
podman stats takes its data from the cgroup that is configured for each container. The cgroup already tracks the total CPU time since the container was started, and we also know when the container was started, so we can calculate the percentage from that. AVG CPU will always be that value; CPU will be that value on the first tick, and after that it will be the percentage between each tick.
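As an illustration of the cumulative counter described above: on a cgroup v2 host (as in the podman info output of this issue), the kernel exposes total CPU time in the cgroup's `cpu.stat` file under the `usage_usec` key. The sketch below only parses such text; the function name and the sample text are hypothetical, the `usage_usec` value is chosen to match the 10m2.724575s row from the report, and the user/system split is made up. How podman itself reads this file is not shown here.

```python
def parse_usage_seconds(cpu_stat_text: str) -> float:
    """Return the cumulative CPU time in seconds from cgroup v2 cpu.stat text."""
    for line in cpu_stat_text.splitlines():
        key, _, value = line.partition(" ")
        if key == "usage_usec":
            return int(value) / 1_000_000
    raise ValueError("usage_usec not found in cpu.stat text")


# Hypothetical cpu.stat content; only usage_usec mirrors a value from the issue.
sample_cpu_stat = (
    "usage_usec 602724575\n"
    "user_usec 400000000\n"
    "system_usec 202724575\n"
)

total_cpu_seconds = parse_usage_seconds(sample_cpu_stat)
print(total_cpu_seconds)
```

Because the counter only ever grows, a single read gives a lifetime total; a "current" percentage needs two reads and the time between them.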
I do agree it is not the best behavior but I am not sure we can just change it either.
EDIT: One last thing came to mind after reading your explanations: does the calculation take the number of CPUs into account? If the calculation is cpu_time/real_time, it should be divided by the number of CPUs, right?
No, the cores don't matter here; the percentage is based on a single core. If the container maxes out two cores, you get 200%. That is how CPU percentages are commonly reported on Linux.
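For a monitoring system that prefers a share of total host capacity, the per-core percentage can be divided by the CPU count on the consumer side. This is a small sketch with a hypothetical helper name; the CPU count of 4 comes from the podman info output in this issue.

```python
def share_of_host(percent_of_one_core: float, num_cpus: int) -> float:
    """Convert a per-core CPU percentage into a percentage of the whole host."""
    if num_cpus < 1:
        raise ValueError("num_cpus must be at least 1")
    return percent_of_one_core / num_cpus


HOST_CPUS = 4  # "cpus: 4" in the podman info output

# Two fully used cores on a 4-CPU host.
print(share_of_host(200.0, HOST_CPUS))

# The streamed value from the report.
print(share_of_host(11.90, HOST_CPUS))
```

This leaves the value reported by podman untouched and only changes how it is interpreted downstream.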
- Display -- at the first iteration of the stream
- Not display cpu_percent with the --no-stream option, as this is not a calculable stat
That could break whoever is depending on the existing data.
It would be a misunderstanding to use cpu_percent the way I did, because the data are:
[PROD] [root@gofast-comm ceosupport]# podman stats gofast-ng-jitsi-videobridge --no-stream --format json
[
{
"id": "2f09d8af4f69",
"name": "gofast-ng-jitsi-videobridge",
"cpu_time": "15.461081s",
"cpu_percent": "0.63%",
"avg_cpu": "0.63%",
"mem_usage": "365.5MB / 16.49GB",
"mem_percent": "2.22%",
"net_io": "2.239MB / 2.055MB",
"block_io": "1.286MB / 9.966MB",
"pids": "57"
}
]
But yes, I agree it would be better not to break existing usage.
Have a blocking X-second option to get the calculated cpu_percent in --no-stream mode (because I still have no way to include this very useful, and only, CPU usage stat in my monitoring)
That means the command takes significantly longer. We might need to block for 1s to get an accurate result.
Yes, that's true, but it is acceptable behind a parameter if this is the only way to get usable cpu_percent data for a monitoring system. Can you think of another way to get this information programmatically?
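One possible workaround, sketched here as an assumption rather than a supported feature: a monitoring script can take two `--no-stream` samples itself and compute the percentage from the difference in `cpu_time`. The command line and the `cpu_time` field are the ones shown in the JSON output above; the helper names are hypothetical, and the duration parser assumes `cpu_time` is always formatted like the values seen in this issue (for example `15.461081s` or `10m2.724575s`).

```python
import json
import re
import subprocess
import time

_UNITS = {"h": 3600.0, "m": 60.0, "s": 1.0,
          "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9}
_PART = re.compile(r"(\d+(?:\.\d+)?)(h|ms|us|µs|ns|m|s)")


def parse_duration(text: str) -> float:
    """Parse a duration string such as '10m2.724575s' into seconds."""
    parts = _PART.findall(text)
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return sum(float(num) * _UNITS[unit] for num, unit in parts)


def cpu_time_seconds(stats_json: str) -> float:
    """Extract cpu_time (in seconds) from the JSON printed for one container."""
    return parse_duration(json.loads(stats_json)[0]["cpu_time"])


def sample(name: str) -> tuple[float, float]:
    """Run podman once and return (monotonic timestamp, cumulative CPU seconds)."""
    result = subprocess.run(
        ["podman", "stats", name, "--no-stream", "--format", "json"],
        check=True, capture_output=True, text=True,
    )
    return time.monotonic(), cpu_time_seconds(result.stdout)


def current_cpu_percent(name: str, wait: float = 1.0) -> float:
    """Block for roughly `wait` seconds and return CPU% over that window."""
    t0, cpu0 = sample(name)
    time.sleep(wait)
    t1, cpu1 = sample(name)
    return 100.0 * (cpu1 - cpu0) / (t1 - t0)
```

Calling `current_cpu_percent("gofast-ng-jitsi-videobridge", wait=5.0)` would block for about five seconds plus the runtime of two podman invocations. The timestamps are taken after each command returns, so the result is an approximation whose error shrinks as `wait` grows.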
A friendly reminder that this issue had no activity for 30 days.