nutdrv_qx: support replies not terminated by the required CR
Apparently some devices don't close their replies to our commands/queries with the expected (and mandated by the specs) CR and, at the moment, this is not supported by nutdrv_qx: http://lists.alioth.debian.org/pipermail/nut-upsuser/2017-June/010714.html ('krauler' USB subdriver)
As far as I can remember, this eventuality can only happen with certain USB subdrivers that only read a predetermined amount of data, read data from a given index or wait till no more data is available (if I recall correctly, serial communication and other USB subdrivers need the terminating CR in order to work properly), so, probably, the easiest fix should be to handle this situation in those USB subdrivers.
Also, this should not be considered an improvement, but a regression that needs to be solved, since blazer_* drivers, being less strict on the terminating CR of Q1 replies (3747fd5), should (sort of) support (some of) this kind of devices.
MAINTAINER NOTE: The null byte vs. \r at end of reads sounds reminiscent of work recently done by @blaa for the nutdrv_qx subdriver for armac in #2005; maybe this part should/can be generalized somehow to help with all devices that do not use CR/LF in protocol? It seems that the 1595a06501daa93e06035a861e3db7ccab2871dd commit which allegedly solved this issue previously added the fix for certain *_command() methods, and so is lacking in some others that appeared afterwards.
@zykh is it okay if I move this to the 2.7.6 milestone?
ahem, yes, @clepple... it turns out I completely forgot this one...
no problem. (Hopefully the next release won't be a year from now :-) )
moving back to 2.7.5, since a) there is a workaround (https://github.com/zykh/nut/commit/1595a06501daa93e06035a861e3db7ccab2871dd) that apparently works as expected, and b) this is needed for #616
For the record,
I encountered this bug in nut 2.7.4 on Debian 11 with my UPS, model "UPSONIC IRT-3K 2U". My UPS is about 10 years old. The equivalent current model appears to be https://www.upsonic.com.au/upsonic-esart-rack-or-tower-series-ups/
https://bugs.debian.org/1019624
The fix in nut 2.8.0 (https://github.com/zykh/nut/commit/0e2372bc4a6b1046b56d28220d2d931f148b9712) looks much nicer than mine. I am confident it will fix my problem, but as my UPS is back in production, I won't be able to verify this experimentally for a long time. :-(
Here's my upsc output (inc. make/model):
cyber@light:~$ upsc upset
Init SSL without certificate database
battery.charge: 100
battery.voltage: 2.29
battery.voltage.high: 78.00
battery.voltage.low: 62.40
battery.voltage.nominal: 72.0
device.mfr: UPSONIC
device.model: IRT-3K 2U
device.type: ups
driver.name: nutdrv_qx_cyber
driver.parameter.pollfreq: 30
driver.parameter.pollinterval: 2
driver.parameter.port: auto
driver.parameter.synchronous: no
driver.version: 2.7.4
driver.version.data: Upsonic 0.06
driver.version.internal: 0.28
input.current.nominal: 14.0
input.frequency: 49.9
input.frequency.nominal: 50
input.voltage: 229.0
input.voltage.fault: 120.0
input.voltage.nominal: 240
output.voltage: 240.0
ups.beeper.status: enabled
ups.delay.shutdown: 30
ups.delay.start: 180
ups.firmware: MP001155
ups.load: 49
ups.productid: 0000
ups.status: OL
ups.temperature: 20.0
ups.type: online
ups.vendorid: ffff
cyber@light:~$ dpkg-query -W 'nut*'
nut-cgi
nut-client 2.7.4-13
nut-ipmi
nut-monitor
nut-server 2.7.4-13
nut-snmp
nut-xml
As noted in the "Maintainer note" above, the solution for NUT v2.8.0 release was pin-point to certain protocol dialects. Apparently the problem is more wide-spread, so finding some way to apply the solution commonly (including to future driver contributions) would be welcome. At the very least, the several copies of same fix proposed in the PR which solved it previously seems like an invitation for refactoring into a helper method to invoke from everywhere, perhaps?..
BTW, in Armac, the final solution was different. I was forging the \r for a second when finding \0 in the datastream prematurely for Vultech. But in the end, after coercing \0 to '0' the UPS eventually sends \r at the right position and everything seems to work "fine". Just single bit is sent as \0.
This \0 is weird obviously. When finishing on it though, there was not enough status bits to correctly parse response, although those lacking weren't particularly important and could be synthesized as zeroes.
Note for future, regarding Armac subdriver approach, see:
- #2005
- #2154
I am not convinced that devices covered by armac_command() are the only ones in need of such approach. There were a number of reports (couldn't find them quickly at the moment, but I think they regarded "short read" errors?) which I think could have the same root cause with NUL bytes instead of ASCII zeroes in a bitmap.
If so - then possibly this solution should be generalized to all QX reads?..