`apcsmart` lost communication with UPS results in intense syslog flood
Hi,
This is the second time I have hit this issue: NUT lost communication with the UPS (connected via a USB/serial cable), and the NUT tools and syslog started eating all 4 cores (the CPU quickly reached 78 °C). It produces a huge log file (my poor SD card...), growing by about 4500 lines per second! The log entries look like this:
Jun 17 22:00:08 iotgwpc2 apcsmart[1285]: Warning: excessive comm failures, limiting error reporting
Jun 17 22:00:08 iotgwpc2 apcsmart[1285]: Communications with UPS lost: serial port write error: 1379(smartmode): Input/output error
Jun 17 22:00:08 iotgwpc2 apcsmart[1285]: message repeated 9 times: [ Communications with UPS lost: serial port write error: 1379(smartmode): Input/output error]
The USB/serial adapter is only a temporary solution; later the UPS will be connected directly to the onboard UART. Still, this is an insane amount and rate of error messages. Is this a bug, or is there an option to limit these error messages?
Orange PI PC2 - Armbian 4.19.38-sunxi64 #5.86 SMP
Network UPS Tools - UPS driver controller 2.7.4
/Tomi
Hitting the same behaviour now, any progress on this?
I am not aware of anyone addressing this specifically, so probably fair to say it is a bug, and probably it is still present. Tested PRs for throttling the message emission (maybe slower backoff to retry connecting?) are welcome.
Just had the same happen to me on a Raspberry Pi. Filled my 250GB SSD which subsequently made the Home Assistant database get corrupted. No way of catching it that quickly since it happened while I was sleeping. I'm not happy about this at all.
Any solution or workaround to this? I've just disabled nut for now.
Was that also with the `apcsmart` driver? A solution in NUT could probably be to throttle how often it sends the error message (or to add a config toggle to that effect, e.g. report a disconnect only once, or once every N minutes).
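To make the throttling idea concrete, here is a minimal sketch of the logic in Python. It is not NUT code (the drivers are written in C), and the `MessageThrottle` class, its method names and the 300-second default are all hypothetical, chosen only for illustration. It lets a given message through at most once per interval and counts what it suppressed in between.

```python
import time


class MessageThrottle:
    """Let each distinct message through at most once per `interval` seconds.

    Hypothetical helper for illustration only; not part of NUT.
    """

    def __init__(self, interval=300.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock      # injectable so the logic can be tested
        self.last_emit = {}     # message -> time it was last emitted
        self.suppressed = {}    # message -> count dropped since last emit

    def should_emit(self, message):
        """Return True if the message may be logged now, else count it."""
        now = self.clock()
        last = self.last_emit.get(message)
        if last is None or now - last >= self.interval:
            self.last_emit[message] = now
            return True
        self.suppressed[message] = self.suppressed.get(message, 0) + 1
        return False

    def take_suppressed(self, message):
        """Return and reset the number of suppressed repeats."""
        return self.suppressed.pop(message, 0)
```

A caller would check `should_emit()` before each log call and, when it returns true, could append the `take_suppressed()` count to the message so the log still shows how many repeats were dropped.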
With HA involved, the practical solution would also depend on getting a modern NUT running there instead of the older package (see the wiki for a contributed article about custom-building a container).
Another vector could be to configure your syslog daemon's log rotation and/or its throttling of repeated messages (that would help storage at least, if not CPU stress).
Finally, try to figure out the nature of disconnects and how to cause a reconnect or driver restart - PRs welcome. This would be an actual fix :)
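As a rough illustration of the "slower backoff to retry connecting" idea, here is a sketch in Python of a retry loop with capped exponential backoff. It is not how `apcsmart` behaves today. `try_connect` is a hypothetical callable that returns true once the port is usable again, and all the default values are assumptions.

```python
import time


def backoff_delays(initial=1.0, factor=2.0, maximum=300.0):
    """Yield retry delays that grow geometrically up to `maximum` seconds."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


def reconnect_with_backoff(try_connect, max_attempts=10, sleep=time.sleep,
                           initial=1.0, factor=2.0, maximum=300.0):
    """Call `try_connect` until it succeeds or attempts run out.

    Returns the number of the attempt that succeeded, or None on failure.
    """
    delays = backoff_delays(initial, factor, maximum)
    for attempt in range(1, max_attempts + 1):
        if try_connect():
            return attempt
        if attempt < max_attempts:
            # Wait progressively longer instead of retrying in a tight loop.
            sleep(next(delays))
    return None
```

With waits of this kind between attempts, a dead port would produce a handful of retries (and log lines) per hour rather than thousands per second.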
Jim