Epyc Ion 1000VA issues with some parameters
I just bought the UPS named in the title and have just completed the NUT installation/configuration on Debian trixie. This UPS uses the CPS usbhid-ups driver, but in NUT 2.8.1 (Debian package) the decoding of some parameters is questionable. I have only just started looking at the documentation, sources, etc., so I'm not yet up to speed with the work that might be needed to propose fixes. In the meantime I have a question: given that this is a "generic" firmware decoded by an equally "generic" driver, I was thinking that writing a specific driver for this model might be a good idea. The only problem I see is that I have not found a way to tell this model from other brands' models:
ups.mfr: 1
ups.model: 1000
ups.productid: 0601
ups.serial:
ups.vendorid: 0764
Is there a way, that maybe I am missing, to tell this brand/model from others?
output.voltage decoding is wrong:
0.056146 [D4] Entering libusb_get_report
0.058121 [D3] Report[get]: (3 bytes) => 12 a6 08
0.058131 [D5] PhyMax = 0, PhyMin = 0, LogMax = 255, LogMin = 0
0.058136 [D5] Unit = 00f0d121, UnitExp = 6
0.058139 [D5] Exponent = -1
0.058143 [D5] hid_lookup_path: 00840004 -> UPS
0.058146 [D5] hid_lookup_path: 0084001c -> Output
0.058149 [D5] hid_lookup_path: 00840030 -> Voltage
0.058154 [D1] Path: UPS.Output.Voltage, Type: Feature, ReportID: 0x12, Offset: 0, Size: 16, Value: 16.6
It's clear that the value 0x08a6 decodes to 2214, which is ten times the correct voltage of 221.4. In fact, input.voltage carries the same raw value, but there it is decoded properly:
0.048149 [D4] Entering libusb_get_report
0.050120 [D3] Report[get]: (3 bytes) => 0f a6 08
0.050131 [D5] PhyMax = 0, PhyMin = 0, LogMax = 65535, LogMin = 0
0.050134 [D5] Unit = 00f0d121, UnitExp = 6
0.050138 [D5] Exponent = -1
0.050142 [D5] hid_lookup_path: 00840004 -> UPS
0.050145 [D5] hid_lookup_path: 0084001a -> Input
0.050148 [D5] hid_lookup_path: 00840030 -> Voltage
0.050153 [D1] Path: UPS.Input.Voltage, Type: Feature, ReportID: 0x0f, Offset: 0, Size: 16, Value: 221.4
Another problem I have is with all the timing values, which are decoded to -60 (negative?):
0.064127 [D4] Entering libusb_get_report
0.066122 [D3] Report[get]: (3 bytes) => 16 ff ff
0.066132 [D5] PhyMax = 1966020, PhyMin = -60, LogMax = 32767, LogMin = -1
0.066136 [D5] Unit = 00001001, UnitExp = 0
0.066140 [D5] Exponent = 0
0.066144 [D5] hid_lookup_path: 00840004 -> UPS
0.066147 [D5] hid_lookup_path: 0084001c -> Output
0.066150 [D5] hid_lookup_path: 00840056 -> DelayBeforeStartup
0.066155 [D1] Path: UPS.Output.DelayBeforeStartup, Type: Feature, ReportID: 0x16, Offset: 0, Size: 16, Value: -60
The raw value is 0xffff, which I'm ready to bet means "infinite" or "disabled". Is there a reason why it is decoded to -60?
There are other vendor-specific parameters which are ignored: do we have any information about them, or are they unknown parameters? For example:
0.086148 [D5] hid_lookup_path: 00840004 -> UPS
0.086152 [D5] hid_lookup_path: ff01001d -> not found in lookup table
0.086155 [D5] hid_lookup_path: ff01001a -> not found in lookup table
0.086159 [D5] hid_lookup_path: ff010002 -> not found in lookup table
0.086163 [D1] Path: UPS.ff01001d.ff01001a.ff010002, Type: Feature, ReportID: 0x27, Offset: 0, Size: 8, Value: 1
Additionally, I'm getting some errors whose source I don't understand. Can someone shed some light on what they mean?
Overflows:
0.086166 [D4] Entering libusb_get_report
0.087226 [D2] nut_libusb_get_report: Overflow
0.087235 [D1] Can't retrieve Report 28: Resource temporarily unavailable
0.087239 [D5] hid_lookup_path: 00840004 -> UPS
0.087242 [D5] hid_lookup_path: ff01001d -> not found in lookup table
0.087246 [D5] hid_lookup_path: ff01001b -> not found in lookup table
0.087249 [D5] hid_lookup_path: ff010040 -> not found in lookup table
0.087252 [D1] Path: UPS.ff01001d.ff01001b.ff010040, Type: Feature, ReportID: 0x28, Offset: 0, Size: 8
File report buffer error:
0.105818 [D5] send_to_all: SETINFO driver.state "init.updateinfo"
0.105821 [D1] upsdrv_updateinfo...
0.112363 [D2] file_report_buffer: expected 2 bytes, but got 512 instead
0.112375 [D3] Report[err]: (512 bytes) => 0b 13 71 fa a0 5c 00 00 00 00 00 00 00 00 00 00 ...
Thanks for your time reading this long post.
output.voltage and input.voltage differ at least in LogMax = 255 vs LogMax = 65535, because someone does not know their way around the USB spec. Many vendors don't, even big names. So with by-the-book decoding, what remains of the 16-bit value is 0xA6 = 166, which divided by ten gives the reported 16.6.
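To make the arithmetic concrete, here is a minimal Python sketch of the two decodings. It is not NUT's code: the helper name `decode_voltage` is hypothetical, and it assumes the decoder effectively drops any bits that do not fit under the declared LogMax, which reproduces both values seen in the logs.

```python
def decode_voltage(report: bytes, log_max: int, exponent: int = -1) -> float:
    """Decode the 16-bit little-endian field that follows the report ID byte."""
    raw = int.from_bytes(report[1:3], "little")   # a6 08 -> 0x08a6 = 2214
    # Assumption for illustration: bits above the width implied by LogMax are lost.
    mask = (1 << log_max.bit_length()) - 1        # 255 -> 0xFF, 65535 -> 0xFFFF
    logical = raw & mask
    return round(logical * 10 ** exponent, 1)     # Exponent = -1 -> divide by ten


# Report 0x12 (output.voltage) declares LogMax = 255.
print(decode_voltage(bytes([0x12, 0xA6, 0x08]), log_max=255))
# Report 0x0f (input.voltage) declares LogMax = 65535.
print(decode_voltage(bytes([0x0F, 0xA6, 0x08]), log_max=65535))
```

The same three payload bytes yield 16.6 or 221.4 depending only on the LogMax the descriptor declares, which is why a fix-up that overrides the declared range is enough to correct the reading.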
With recent NUT versions, usbhid-ups supports "fix-up" methods (quite populated for CPS and APC HID subdrivers) which tell it to turn a blind eye to some such mismatches where we claim to know better. Your ID above matches drivers/cps-hid.c:#define CPS_VENDORID 0x0764 so little surprise here.
As a first step, follow https://github.com/networkupstools/nut/wiki/Building-NUT-for-in%E2%80%90place-upgrades-or-non%E2%80%90disruptive-tests to build the current NUT master branch. Maybe the problem is already solved there. Even if not, you would need a working build setup anyway to tinker with fixing the subdriver.
I wanted to follow up on the output.voltage problem. I have backported release 2.8.3 from Debian testing to trixie, and this version solves the output.voltage issue:
libnutscan3/now 2.8.3-3~bpo13+1 amd64 [installed,local]
libupsclient7/now 2.8.3-3~bpo13+1 amd64 [installed,local]
nut-client/now 2.8.3-3~bpo13+1 amd64 [installed,local]
nut-server/now 2.8.3-3~bpo13+1 amd64 [installed,local]
output.voltage: 221.7
I also saw that the timing decoding still doesn't seem to be correct in 2.8.3:
0.066122 [D3] Report[get]: (3 bytes) => 16 ff ff
0.066132 [D5] PhyMax = 1966020, PhyMin = -60, LogMax = 32767, LogMin = -1
0.066136 [D5] Unit = 00001001, UnitExp = 0
Unit = 00001001 means seconds, so the report reads -1, but some conversion is applied and it becomes -60. UnitExp = 0 says there should be no scaling between Log and Phy (multiply by 10^0 = 1), so in this case Log and Phy should be the same and the value read back from the UPS should be -1, not -60. It looks like the code thinks the UPS reports these times in minutes and that they need to be converted to seconds in NUT, but -1 is a special value that should be exempt from this conversion.
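One possible source of the factor is the descriptor itself rather than any CPS-specific code. The sketch below applies the generic HID linear mapping from the logical range to the physical range using the numbers from the log. It is an illustration under that assumption, not a copy of NUT's conversion routine, and the function names are hypothetical.

```python
def to_signed16(raw: int) -> int:
    """Interpret a 16-bit field as two's complement (LogMin is negative here)."""
    return raw - 0x10000 if raw & 0x8000 else raw


def logical_to_physical(logical, log_min, log_max, phy_min, phy_max):
    """Generic HID linear scaling from the logical range to the physical range."""
    # HID convention: PhyMin = PhyMax = 0 means "physical equals logical".
    if phy_min == 0 and phy_max == 0:
        return float(logical)
    factor = (phy_max - phy_min) / (log_max - log_min)
    return (logical - log_min) * factor + phy_min


# Values taken from the DelayBeforeStartup log entry (report 16 ff ff).
LOG_MIN, LOG_MAX = -1, 32767
PHY_MIN, PHY_MAX = -60, 1966020

logical = to_signed16(int.from_bytes(bytes([0xFF, 0xFF]), "little"))
factor = (PHY_MAX - PHY_MIN) / (LOG_MAX - LOG_MIN)
physical = logical_to_physical(logical, LOG_MIN, LOG_MAX, PHY_MIN, PHY_MAX)

print(logical)   # raw 0xffff as a signed value
print(factor)    # physical units per logical unit
print(physical)  # value after scaling
```

With these ranges the scale factor works out to exactly 60, so a logical -1 lands on PhyMin = -60 without any extra multiplication. That would be consistent with the device counting minutes internally and declaring seconds as its physical unit; treating -1 as a "disabled" sentinel would then have to happen before this scaling step.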
Good idea. There was something about CPS counting time in minutes; maybe that was taken care of (x60) in those lower-level methods somewhat blindly? Worth checking.
What would that -1 mean though, "disabled"?
I would say -1 means disabled. I have checked the PowerMaster+ software, and the start/shutdown timers are disabled in my setup. I have tried hard to find where and why this x60 multiplication happens in the NUT code, but I couldn't find it:
ups.timer.shutdown: -60
ups.timer.start: -60
I didn't try setting a timer with PowerMaster+ to double-check (the UPS is powering my homelab), but the timer in the software is actually an hour:minute value. I don't know whether seconds are simply not considered or not; there are certainly not enough seconds in 32768 to cover a daily shutdown/start cycle (here I would also say that, in my view, the usbhid-ups protocol shows a clear limitation). There is also a day setting if the timer is not daily, but I think the day is handled by the software. If the UPS timer is in minutes, that would make a lot of sense to me.
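A quick back-of-the-envelope check of the range argument, as a small Python sketch. It only uses LogMax and PhyMax from the debug log; which unit the firmware really counts in is still an assumption.

```python
LOG_MAX = 32767      # largest logical timer value, from the debug log
PHY_MAX = 1966020    # largest physical value, from the debug log

hours_if_seconds = LOG_MAX / 3600        # range if one count were one second
days_if_minutes = LOG_MAX / (60 * 24)    # range if one count were one minute

print(f"counted in seconds: {hours_if_seconds:.2f} hours")
print(f"counted in minutes: {days_if_minutes:.2f} days")
print(PHY_MAX == LOG_MAX * 60)           # does PhyMax equal LogMax minutes in seconds?
```

Counted in seconds the field covers only about nine hours, too short for a daily cycle, while counted in minutes it covers roughly three weeks, and PhyMax matches LogMax minutes expressed in seconds.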