icinga-powershell-framework

Powershell Framework Failing to Query \Processor Information(*)\% Processor Utility Counter

Open WilHatesComputers opened this issue 1 year ago • 9 comments

I've outlined the problem thoroughly here. I fear the verbosity of my testing and details may be why the post got no attention, so I'll be as brief as I can.

Environment (this is a little mixed across hosts):

agent:     2.14.0, 2.14.3
framework: 1.11.0, 1.12.3
plugins:   1.11.0, 1.12.0
service:   1.2.0, 1.2.0

When using Invoke-IcingaCheckCPU on numerous computers, we're intermittently seeing this error:

Plugin Output
[UNKNOWN]: Icinga Invalid Input Error was thrown: PerformanceCounter: \Processor Information(*)\% Processor Utility
A plugin failed to fetch Performance Counter information. Please ensure the counter is written properly and available on your system.

This is the result 95% of the time, with an occasional valid result. Anything within Icinga that queries this counter fails (Invoke-IcingaCheckCPU, Invoke-IcingaCheckPerfCounter, etc.). I think I have tracked the cause down to the point where Icinga builds the list of counters available on the system (line 95 of New-IcingaPerformanceCounter.psm1). For whatever reason, "Processor Information" isn't present in [System.Diagnostics.PerformanceCounterCategory], while "Processor" is:

PS C:\WINDOWS\system32> [Diagnostics.PerformanceCounterCategory]('Processor Information') | fl
CategoryName : Processor Information
CategoryHelp :
CategoryType :
MachineName  : .
PS C:\WINDOWS\system32> [Diagnostics.PerformanceCounterCategory]('Processor') | fl
CategoryName : Processor
CategoryHelp : The Processor performance object consists of counters that measure aspects of processor activity. The processor is the part of the computer that performs arithmetic and logical computations, initiates operations on peripherals, and runs the threads of processes.  A computer can have multiple processors. The processor object represents each processor as an instance of the object.
CategoryType : MultiInstance
MachineName  : .

Despite this, the counter can be queried with WMIC and Get-Counter without fail. I just can't get any Icinga cmdlets to work.
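
To make that comparison repeatable on an affected host, the two views can be checked side by side. This is only a minimal sketch, not Icinga code: `Split-CounterPath` and `Compare-CounterVisibility` are hypothetical helper names, and it assumes Windows PowerShell 5.1, where both `Get-Counter` and `System.Diagnostics.PerformanceCounterCategory` are available.

```powershell
# Hypothetical helper (not part of Icinga for Windows): splits a counter path
# such as '\Processor Information(*)\% Processor Utility' into its parts.
function Split-CounterPath {
    param([Parameter(Mandatory)][string]$Path)

    $pattern = '^\\(?<Category>[^\\(]+)(\((?<Instance>[^)]*)\))?\\(?<Counter>.+)$'
    if ($Path -notmatch $pattern) {
        throw "Not a valid counter path: $Path"
    }

    [pscustomobject]@{
        Category = $Matches['Category']
        Instance = $Matches['Instance']   # $null when the path has no instance
        Counter  = $Matches['Counter']
    }
}

# Hypothetical helper: reports whether the .NET class and Get-Counter agree
# that the counter is available. Windows only.
function Compare-CounterVisibility {
    param([Parameter(Mandatory)][string]$Path)

    $parts = Split-CounterPath -Path $Path

    # View 1: the .NET class queried in the output above.
    $dotNetSeesCategory = [System.Diagnostics.PerformanceCounterCategory]::Exists($parts.Category)

    # View 2: Get-Counter, which the issue reports as still working.
    $samples = Get-Counter -Counter $Path -ErrorAction SilentlyContinue

    [pscustomobject]@{
        Path               = $Path
        DotNetSeesCategory = $dotNetSeesCategory
        GetCounterWorks    = ($null -ne $samples)
    }
}
```

Running `Compare-CounterVisibility -Path '\Processor Information(*)\% Processor Utility'` on a broken host and on a healthy one should show whether the two views really disagree. A result with `DotNetSeesCategory` false and `GetCounterWorks` true would match the state described here.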

I have a very loose understanding of things this far under the hood. My investigation has led me to the conclusion that this is some sort of .NET API object and PowerShell just has direct access to it. I can't think of what could be happening to this counter that would remove it from PerformanceCounterCategory. I'd think there would be considerable traction behind an issue like this if it affected everyone.

I opened this issue here instead of on the plugins project as it looks like the Framework is responsible for this output, not the plugin.

Even if this isn't an Icinga issue, I'd really appreciate it if you could point me in the right direction. Many of the affected hosts have this issue almost out of the box (we install ScreenConnect, NinjaRMM, the Icinga Agent, NSCP, and a couple of other productivity apps).

WilHatesComputers avatar Dec 17 '24 18:12 WilHatesComputers

I may be reaching but this issue report on the plugins project seems similar to mine, and even references the same "_Total" counter instance which is what Invoke-IcingaCheckCPU queries.

However, this user is looking to ignore the missing counter's UNKNOWN status using the -IgnoreEmptyChecks option.

I also wanted to add that I have tried rebuilding counters with lodctr, running updates, rebooting, and reinstalling the agent, framework, and plugins. The only one of these that makes an impact is rebooting, which only seems to resolve the issue for what I believe to be a single query.

WilHatesComputers avatar Dec 17 '24 18:12 WilHatesComputers

Thank you for the issue. This can happen from time to time sadly, as for some reason Windows will "break" the performance counter objects. In most cases, a reboot of the system resolves this issue.

Based on previous issues of this kind, it mostly happens when software is installed or uninstalled and, in one way or another, interacts with the performance counter library to add or remove content there.

Is the SCOM agent by any chance installed on the affected machines?

It is a known cause of such issues: IWKB000016

LordHepipud avatar Jan 21 '25 16:01 LordHepipud

I have seen a reboot regularly fix the problem, albeit sometimes for less than 10 minutes. I'm hoping this isn't the prescribed fix; we want to migrate entirely from our old Nagios setup using check_nrpe to Icinga, using the agent.

No SCOM-Agents installed. Is there something I can look for in other software that may be causing the same problems as the SCOM-Agent does? I think the closest we get is Intune (some devices, not all), NinjaRMM, and BitDefender Gravity Zone.

Test-IcingaInterceptCounter cmdlet on an affected host:

PS C:\Windows\system32> Test-IcingaInterceptCounter
[Notice]: Testing for Microsoft SCOM Intercept Counters
[Passed]: Entry "HKLM:\SYSTEM\CurrentControlSet\Services\Intercept CSM Filters\Performance" is not present on the system
[Passed]: Entry "HKLM:\SYSTEM\CurrentControlSet\Services\Intercept Injector\Performance" is not present on the system
[Passed]: Entry "HKLM:\SYSTEM\CurrentControlSet\Services\Intercept SyncAction Processing\Performance" is not present on the system
[Passed]: Entry "HKLM:\SYSTEM\CurrentControlSet\Services\InterceptCountersManager\Performance" is not present on the system
[Passed]: Entry "HKLM:\SYSTEM\CurrentControlSet\Services\InterceptCountersManager\Performance" is not present on the system
[Passed]: Entry "HKLM:\SYSTEM\CurrentControlSet\Services\Backup Exec\Performance" is not present on the system
[Passed]: There are either no intercept counters installed on your system or they are disabled. Monitoring of Performance Counters should work fine

It's a really frustrating and peculiar issue. The only counter category I have ever seen affected is "Processor Information." I have seen it recover on its own occasionally. I'll try to monitor this more closely and see if I notice anything.

On the one host I have access to right now, the Event Viewer shows:

The Open procedure for service ".NETFramework" in DLL "C:\Windows\system32\mscoree.dll" failed with error code Access is denied.. Performance data for this service will not be available.

That led me to IWKB000008, which gives me a little hope that enabling the Internal API Forwarder will resolve the problem from a monitoring standpoint. I have already tried the JEA Profile without success. I'll try the API Forwarder when I get the chance over the next couple of days and update.

WilHatesComputers avatar Jan 22 '25 00:01 WilHatesComputers

@WilHatesComputers @LordHepipud From time to time we see the same issue on some Windows servers. Workaround: simply restart both local icinga* services, first "icingapowershell", then "icinga2". Without further debugging, it looks like it is provoked by temporarily high CPU usage or a similar lack of OS resources (only a guess).
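
The restart order described in this workaround could be scripted roughly as follows. This is a sketch under assumptions: `Restart-ServicesInOrder` is a hypothetical helper, the service names are the ones quoted above and may differ on other installations, and the `-RestartAction` parameter exists only so the ordering can be tried without touching real services.

```powershell
# Hypothetical helper: restarts services one at a time, in the given order.
# The default action uses -ErrorAction Stop so that a failed restart raises
# a terminating error instead of silently moving on to the next service.
function Restart-ServicesInOrder {
    param(
        [Parameter(Mandatory)][string[]]$Name,
        [scriptblock]$RestartAction = {
            param($ServiceName)
            Restart-Service -Name $ServiceName -ErrorAction Stop
        }
    )

    foreach ($serviceName in $Name) {
        Write-Host "Restarting $serviceName"
        & $RestartAction $serviceName
    }
}
```

From an elevated prompt this would be called as `Restart-ServicesInOrder -Name 'icingapowershell', 'icinga2'`.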

tectumopticum avatar Jan 27 '25 13:01 tectumopticum

Does this error still occur with v1.13? We made a small fix for fetching Performance Counters, which might help resolve this issue.

LordHepipud avatar Apr 22 '25 13:04 LordHepipud

I'm updating an affected device right now. I leave for vacation tomorrow and will return on May 1. I'll try to take a peek at the history on the host during that time and I'll bump this with the results.

Xility-Wil avatar Apr 22 '25 19:04 Xility-Wil

Thank you very much, enjoy your vacation!

LordHepipud avatar Apr 23 '25 13:04 LordHepipud

Hello, we currently have the same problem with this check. We are using version 1.13.3 of the PowerShell Framework, and we found that the problem gets fixed temporarily by opening perfmon in Windows; after maybe 2-3 minutes the service works again. We can't really find a trigger for the problem, as it mostly happens on some hosts after installing or reinstalling the agent and sometimes just randomly.

MoniDevAlex avatar Aug 29 '25 11:08 MoniDevAlex

Hi @LordHepipud,

I'm not exactly sure whether this is the same cause as in the original post, but I have had the same issue for the last month. Especially on some Citrix management hosts, the CPU counters were quite often unavailable, mostly after the agent was reinstalled by a pipeline run, which always triggers a reinstall to keep the environment up to date.

I had the opportunity to troubleshoot one of those servers and narrowed it down: I could provoke this behaviour simply by restarting the Icinga PowerShell service. A restart can either resolve the failure to fetch the Processor Information counter category or trigger it.

While debugging, I noticed an error in the Event Viewer right after restarting the service that did not appear when the check worked fine:

Failed to query Icinga check over internal REST-Api check handler

A service check could not be executed by using the internal REST-Api check handler. The check either ran into a timeout or could not be processed. Maybe the check was not registered to be allowed for being executed. Further details can be found below.

Icinga for Windows exception report:

Exception Message: Unable to connect to the remote server

Invocation Name: Invoke-WebRequest

Command Origin: Internal

Script Line Number: 3233

Exact Position: At C:\Program Files\WindowsPowerShell\Modules\icinga-powershell-framework\cache\framework_cache.psm1:3233 char:22 ... ApiResult = Invoke-WebRequest -Method POST -UseBasicParsing -Uri ([st ... ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

StackTrace:
at System.Net.HttpWebRequest.GetRequestStream(TransportContext& context)
at System.Net.HttpWebRequest.GetRequestStream()
at Microsoft.PowerShell.Commands.WebRequestPSCmdlet.SetRequestContent(WebRequest request, Byte[] content)
at Microsoft.PowerShell.Commands.WebRequestPSCmdlet.ProcessRecord()
at System.Management.Automation.CommandProcessor.ProcessRecord()

Call Stack:

Command                          Arguments
Get-IcingaExceptionString        {ExceptionObject=Unable to connect to the remote server}
Write-IcingaEventMessage         {Namespace=Framework, EventId=1553, ExceptionObject=Unable to connect to the remote...
Invoke-IcingaInternalServiceCall {Command=Invoke-IcingaCheckNLA, Arguments=System.Collections.Hashtable}
Exit-IcingaExecutePlugin         {Command=Invoke-IcingaCheckNLA, -Profile, DomainAuthenticated, -Verbosity, 2, -NICs...
<ScriptBlock>                    {}

Object details: Invoke-IcingaCheckNLA

As background daemons we have the REST API and the check daemon active. I read the error as a kind of race condition during startup, which seemed likely because it was fairly random whether the issue appeared.

To address that, I added a startup delay for the background daemon in the config of the PowerShell Framework. I haven't seen the event log entry since, and the check seems reasonably reliable for now:

Register-IcingaBackgroundDaemon -Command 'Start-IcingaServiceCheckDaemon' -Arguments @{'-StartupDelay' = 15};
Register-IcingaServiceCheck -CheckCommand 'Invoke-IcingaCheckCpu' -Interval 300 -TimeIndexes 5m, 15m
Register-IcingaBackgroundDaemon -Command 'Start-IcingaWindowsRESTApi' -Arguments @{ '-Port' = 5667; }
Add-IcingaRESTApiCommand -Command 'Invoke-IcingaCheck*' -Endpoint 'apichecks'

"BackgroundDaemon": {
    "StartupDelay": 15,
    "EnabledDaemons": {
        "Start-IcingaServiceCheckDaemon": {
            "Command": "Start-IcingaServiceCheckDaemon",
            "Arguments": {
                "-StartupDelay": 15
            }
        },
        "Start-IcingaWindowsRESTApi": {
            "Command": "Start-IcingaWindowsRESTApi",
            "Arguments": {
                "-Port": 5667
            }
        }
    },
    "RegisteredServices": {
        "527*": {
            "CheckCommand": "Invoke-IcingaCheckCPU",
            "Arguments": null,
            "Interval": 300,
            "TimeIndexes": [ "5m", "15m" ]
        },
        "169*": {
            "CheckCommand": "Invoke-IcingaCheckCpu",
            "Arguments": null,
            "Interval": 300,
            "TimeIndexes": [ "5m", "15m" ]
        }
    }
},

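One way to test the race-condition reading directly would be to measure whether the REST API port accepts connections shortly after a service restart. The following is a minimal sketch only: `Wait-TcpPort` is a hypothetical helper, not an Icinga cmdlet. Port 5667 comes from the configuration above, while the idea that the API is reachable on the loopback address is my assumption.

```powershell
# Hypothetical helper: polls a TCP port until it accepts a connection or
# the timeout expires. Returns $true if a connection succeeded.
function Wait-TcpPort {
    param(
        [string]$HostName = '127.0.0.1',   # assumption: API reachable on loopback
        [Parameter(Mandatory)][int]$Port,
        [int]$TimeoutSeconds = 30,
        [int]$PollMilliseconds = 500
    )

    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    do {
        $client = New-Object System.Net.Sockets.TcpClient
        try {
            # Throws if nothing is listening yet.
            $client.Connect($HostName, $Port)
            return $true
        } catch {
            Start-Sleep -Milliseconds $PollMilliseconds
        } finally {
            $client.Close()
        }
    } while ((Get-Date) -lt $deadline)

    return $false
}
```

Calling `Wait-TcpPort -Port 5667 -TimeoutSeconds 60` right after restarting the "icingapowershell" service would show whether the listener comes up later than the first scheduled check, which is what a startup delay is meant to work around.
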
bfenda avatar Nov 10 '25 13:11 bfenda