edge-manageability-framework icon indicating copy to clipboard operation
edge-manageability-framework copied to clipboard

[BUG-NOK] Orchestrator - Alerts - Host alerts like "Host CPU Temperature Exceeds Threshold" does not generate.

Open ss5829 opened this issue 5 months ago • 4 comments

Bug Description

Under Settings -> Alerts -> Host CPU Temperature Exceeds Threshold

in the web-ui it is configured as below (with value 0)

Image

for a particular host these are the Grafana details

Image Image

but still the "Host CPU Temperature Exceeds" Alerts are not generated

values 1, 5, 10 were tried. no corresponding Alerts were generated.

System Setup

Open Edge Orchestrator - v3.1.3

Reproducible Steps

please refer Bug Description

Root Cause Analysis

No response

ss5829 avatar Nov 27 '25 08:11 ss5829

please note that other Host related Alerts like High Disk Usage, High CPU usage, Network Utilization are not generated even when the thresholds are set to the lowest possible values.

This has been noted for lenovo laptops as well which were onboarded in different AWS instances.

other Deployment package related errors were observed as usual.

ss5829 avatar Dec 01 '25 06:12 ss5829

@ss5829 could you share logs from the alerting-monitor pod on the orchestrator from the time when the alerts are created in the web-ui please?

cjnolan avatar Dec 01 '25 11:12 cjnolan

@ss5829 I can see some errors related to authentication and token refresh in the alert-monitor logs, could you check if the service is able to retrieve a valid token and that it can update it when expired please?

cjnolan avatar Dec 02 '25 16:12 cjnolan

@cjnolan , following token was obtained from alerting-monitor-alertmanager-0 pod location /var/run/secrets/kubernetes.io/serviceaccount/token

the expiry date is shown as DEC 03rd 2026

Image

the same observation was made for pod alerting-monitor-6c86d9c485-8rxhf

ss5829 avatar Dec 03 '25 10:12 ss5829