Pods crash when pinning to and collecting from specific CPUs
Hi team,
We are getting the message below on a few of our nodes when we deploy this as a pod. We are not sure why it marks the cores as offline, nor why the pods crash on startup.
Any help on this would be much appreciated.
```
pthread_setaffinity_np for core 5 failed with code 22
```
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: intel-metrics-exporter
  namespace: ns-sy-rcp-observability
  labels:
    k8s-app: intel-metrics-exporter
spec:
  selector:
    matchLabels:
      name: intel-metrics-exporter
  template:
    metadata:
      labels:
        name: intel-metrics-exporter
    spec:
      nodeSelector:
        kubernetes.io/os: linux
      hostNetwork: true
      containers:
        - name: intel-metrics-exporter
          image: intel-metrics-exporter:1.1.1
          ports:
            - containerPort: 9738
              hostPort: 9738
          resources:
            limits:
              memory: 500Mi
            requests:
              cpu: 1000m
              memory: 50Mi
          securityContext:
            privileged: true
```
Regards, Arun Raj. R
@rdementi: Can you help us with this issue?
Hi, pcm requires all cores to be online. Are you running pcm in a restricted cgroup? If you remove the restriction, it might work.
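For context, the restriction can be seen from inside the container: the kernel reports an allowed-CPU mask that excludes the cores reserved for other workloads. A minimal standalone sketch (not part of pcm) that prints this mask:

```cpp
// check_affinity.cpp -- print the CPUs this process is allowed to run on.
// Inside a cpuset-restricted cgroup this list is smaller than the machine's
// full core list; the cores missing here are the ones pcm cannot reach.
// (Equivalent quick check: grep Cpus_allowed_list /proc/self/status)
#include <sched.h>
#include <cstdio>

int main() {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) {
        std::perror("sched_getaffinity");
        return 1;
    }
    std::printf("allowed CPUs:");
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &set))
            std::printf(" %d", cpu);
    std::printf("\n");
    return 0;
}
```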
@rdementi:
Yes, you are correct. We are using --topology-manager-policy=restricted mode. This is a platform requirement for our applications, which need dedicated CPUs, so we cannot remove it.
May I know why that is a blocker for PCM? How should we handle this case? Any way forward would be appreciated.
Even the working nodes fail for the one or two CPUs on which dedicated/Guaranteed pods are running.
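(For reference, the setup described above corresponds roughly to the following kubelet settings; this is a sketch using the standard KubeletConfiguration fields, and the exact cluster configuration may differ:)

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# The static CPU manager gives Guaranteed pods exclusive cores; the
# restricted topology manager policy enforces aligned allocation.
# Together they shrink the cpuset available to all other pods on the node.
cpuManagerPolicy: static
topologyManagerPolicy: restricted
```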
pcm needs to read CPU topology information from each core. It changes its affinity to every core in turn to execute the cpuid instruction on that core, which reads the core's topology info.
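A simplified sketch of that pattern (an illustration, not pcm's actual code): pin the current thread to each core in turn and execute cpuid there; a core outside the allowed cpuset makes pthread_setaffinity_np fail, which is the error seen in the logs above.

```cpp
#include <pthread.h>
#include <sched.h>
#include <cpuid.h>   // __get_cpuid_count (GCC/Clang intrinsic header)
#include <cstdio>

// Visit one logical core and execute cpuid leaf 0xB (x2APIC topology) on it.
// A core outside the allowed cpuset makes pthread_setaffinity_np fail with
// EINVAL (22) -- exactly the error seen in the pod logs.
static void probe_core(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    int rc = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    if (rc != 0) {
        std::fprintf(stderr, "core %d not reachable, error %d\n", core, rc);
        return;
    }
    unsigned eax, ebx, ecx, edx;
    // Leaf 0xB, subleaf 0 (SMT level): EDX = x2APIC id of this logical CPU,
    // EBX[15:0] = number of logical processors at the SMT level.
    __get_cpuid_count(0xB, 0, &eax, &ebx, &ecx, &edx);
    std::printf("core %d: x2APIC id %u, logical CPUs at SMT level: %u\n",
                core, edx, ebx & 0xFFFF);
}

int main() {
    for (int core = 0; core < 8; ++core)  // probe the first 8 cores as a demo
        probe_core(core);
    return 0;
}
```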
@rdementi: If I understood correctly:
PCM needs to change its affinity to every core so it can execute the cpuid instruction on each one.
Since an application is already scheduled on the core, the OS / cgroup / app is not letting PCM be scheduled there. Is that understanding correct?
Can we get the meaning of error 22? We would like to dig deep into the problem to reach closure.
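(For reference on the error-22 question: on Linux, 22 is EINVAL, "Invalid argument". The pthread_setaffinity_np man page documents EINVAL for an affinity mask that contains no processors the thread is currently permitted to run on, e.g. due to cpuset restrictions, which matches the Guaranteed-pod setup described above. A quick check of the mapping:)

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>

int main() {
    // pthread_setaffinity_np returns the error number directly;
    // on Linux, 22 is EINVAL ("Invalid argument").
    std::printf("22 == EINVAL? %s\n", 22 == EINVAL ? "yes" : "no");
    std::printf("strerror(22): %s\n", std::strerror(22));
    return 0;
}
```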
Could you please share the full pcm output till the crash?
@rdementi :
```
$ kubectl logs intel-metrics-exporter-mznkj -n ns-sy-rcp-observability

===== Processor information =====
Linux arch_perfmon flag  : yes
Hybrid processor         : no
IBRS and IBPB supported  : yes
STIBP supported          : yes
Spec arch caps supported : yes
Max CPUID level          : 27
CPU family               : 6
CPU model number         : 106
ERROR: pthread_setaffinity_np for core 64 failed with code 22
Marking core 64 offline
PCM ERROR. Exception ERROR: Core: thread_id 1 cannot be larger than 1.
```
```cpp
// From pcm's topology code: the exception above is thrown when a core
// reports a hyper-thread id that is not below the detected threads-per-core.
void addHyperThreadInfo( int32 osID, TopologyEntry te ) {
    if ( te.thread_id >= MAX_THREADS_PER_CORE ) {
        std::stringstream ss;
        ss << "ERROR: Core: thread_id " << te.thread_id
           << " cannot be larger than " << MAX_THREADS_PER_CORE << ".\n";
        throw std::runtime_error( ss.str() );
    }
    // ...
```
It comes from here. I guess the number of threads per core is somehow calculated wrongly and set to 1 instead of 2. This is not a part without Hyper-Threading, is it?
Thanks for the output. This is because the HT sibling of core 0 is marked offline. I have a fix now, which will be pushed soon.
Actually, no core is offline. All the cores are online at our end.
Core 64, the sibling of core 0, is serving a Guaranteed pod at our end. Can you suggest how to get past this case? It is a predominant problem for us.
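(A note on the online/offline confusion: pcm's "Marking core N offline" is its internal bookkeeping for cores it could not pin to; the kernel's own view can be checked independently. A minimal sketch:)

```cpp
#include <fstream>
#include <iostream>
#include <string>

int main() {
    // The kernel's authoritative list of online CPUs (e.g. "0-127").
    // If all cores appear here, the "offline" cores in pcm's output are
    // only unreachable from pcm's cgroup, not actually offline.
    std::ifstream f("/sys/devices/system/cpu/online");
    std::string online;
    std::getline(f, online);
    std::cout << "online CPUs: " << online << "\n";
    return 0;
}
```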
I think the upcoming patch will help you.
This patch has now landed in the master branch.
@rdementi: Is there a guide for building an image from the master branch?
The image is built and uploaded automatically on every commit to Docker Hub and GHCR: https://github.com/intel/pcm/pkgs/container/pcm. Could you pull the latest?
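Applied to the DaemonSet above, that means pointing the container image at the GHCR package, roughly as follows (the ghcr.io/intel/pcm:latest reference is inferred from the linked package page; pinning a digest or release tag instead of :latest is advisable):

```yaml
      containers:
        - name: intel-metrics-exporter
          # Assumed image reference from the linked GHCR package; pin a
          # specific tag or digest in production rather than :latest.
          image: ghcr.io/intel/pcm:latest
```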
@rdementi: The pod crashes have stopped. We can see that all the pods are up and healthy.
However, the Guaranteed CPUs are still not being monitored. Can you suggest a way forward for this, or should I open another ticket for it?
```
===== Processor information =====
Linux arch_perfmon flag  : yes
Hybrid processor         : no
IBRS and IBPB supported  : yes
STIBP supported          : yes
Spec arch caps supported : yes
Max CPUID level          : 27
CPU family               : 6
CPU model number         : 106
ERROR: pthread_setaffinity_np for core 2 failed with code 22
Marking core 2 offline
ERROR: pthread_setaffinity_np for core 6 failed with code 22
Marking core 6 offline
ERROR: pthread_setaffinity_np for core 66 failed with code 22
Marking core 66 offline
ERROR: pthread_setaffinity_np for core 70 failed with code 22
Marking core 70 offline
Number of logical cores: 128
Number of online logical cores: 124
Threads (logical cores) per physical core: 2 (maybe imprecise due to core offlining/hybrid CPU)
Offlined cores: 2 6 66 70
Num sockets: 2
Physical cores per socket: 32 (maybe imprecise due to core offlining/hybrid CPU)
Last level cache slices per socket: 32
Core PMU (perfmon) version: 5
Number of core PMU generic (programmable) counters: 8
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 4
Width of fixed counters: 48 bits
Nominal core frequency: 2200000000 Hz
IBRS enabled in the kernel : yes
STIBP enabled in the kernel : no
The processor is not susceptible to Rogue Data Cache Load: yes
The processor supports enhanced IBRS : yes
Package thermal spec power: 185 Watt; Package minimum power: 84 Watt; Package maximum power: 652 Watt;
INFO: Linux perf interface to program uncore PMUs is NOT present
Socket 0: 4 memory controllers detected with total number of 8 channels. 3 UPI ports detected. 4 M2M (mesh to memory)/B2CMI blocks detected. 0 HBM M2M blocks detected. 0 EDC/HBM channels detected. 0 Home Agents detected. 3 M3UPI/B2UPI blocks detected.
Socket 1: 4 memory controllers detected with total number of 8 channels. 3 UPI ports detected. 4 M2M (mesh to memory)/B2CMI blocks detected. 0 HBM M2M blocks detected. 0 EDC/HBM channels detected. 0 Home Agents detected. 3 M3UPI/B2UPI blocks detected.
Socket 0: 1 PCU units detected. 6 IIO units detected. 6 IRP units detected. 32 CHA/CBO units detected. 0 MDF units detected. 1 UBOX units detected. 0 CXL units detected. 0 PCIE_GEN5x16 units detected. 0 PCIE_GEN5x8 units detected.
Socket 1: 1 PCU units detected. 6 IIO units detected. 6 IRP units detected. 32 CHA/CBO units detected. 0 MDF units detected. 1 UBOX units detected. 0 CXL units detected. 0 PCIE_GEN5x16 units detected. 0 PCIE_GEN5x8 units detected.
Initializing RMIDs
Disabling NMI watchdog since it consumes one hw-PMU counter. To keep NMI watchdog set environment variable PCM_KEEP_NMI_WATCHDOG=1 (this reduces the core metrics set)
Closed perf event handles
Trying to use Linux perf events...
Usage of Linux perf events is disabled through PCM_NO_PERF environment variable. Using direct PMU programming...
Installed Linux kernel perf does not support hardware top-down level-1 counters. Using direct PMU programming instead.
Socket 0
Max UPI link 0 speed: 25.1 GBytes/second (11.2 GT/second)
Max UPI link 1 speed: 25.1 GBytes/second (11.2 GT/second)
Max UPI link 2 speed: 25.1 GBytes/second (11.2 GT/second)
Socket 1
Max UPI link 0 speed: 25.1 GBytes/second (11.2 GT/second)
Max UPI link 1 speed: 25.1 GBytes/second (11.2 GT/second)
Max UPI link 2 speed: 25.1 GBytes/second (11.2 GT/second)
Starting plain HTTP server on http://localhost:9738/
```