nut-scanner should also check LD_LIBRARY_PATH for runtime dependencies
It looks like nut-scanner checks a few hard-coded directories (/lib64, /usr/lib64, etc.) at runtime for the presence of libusb-0.1.so and similar libraries, and then gives up if those libraries are missing. This is a problem on NixOS, where there are no global library paths; packages either patch the loader to look in the correct paths for load-time dependencies, or override environment variables like LD_LIBRARY_PATH for run-time dependencies.
Playing around with things a little bit, if I just bypass the directory check by changing:
nutscan_avail_usb = nutscan_load_usb_library(get_libname("libusb-0.1.so"));
to
nutscan_avail_usb = nutscan_load_usb_library("libusb-0.1.so");
in nutscan-init.c, then the application will attempt to lt_dlopen("libusb-0.1.so") and libtool will automatically find the correct library given a correct LD_LIBRARY_PATH context. I'm actually not sure why the get_libname() function is used; the code is already robust enough to fail gracefully if lt_dlopen() fails.
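To make the idea concrete, here is a minimal sketch of a "resolved path first, bare name second" loader. It is not NUT code: open_with_fallback() is a hypothetical helper, and it uses POSIX dlopen() rather than libtool's lt_dlopen() only to keep the example self-contained.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper (not part of NUT): try the path picked by a
 * directory scan first, if there is one, then fall back to the bare
 * library name so the system loader can apply its own rules
 * (LD_LIBRARY_PATH, RUNPATH, loader cache). */
static void *open_with_fallback(const char *resolved_path, const char *soname)
{
    void *handle = NULL;

    assert(soname != NULL);

    if (resolved_path != NULL)
        handle = dlopen(resolved_path, RTLD_NOW | RTLD_LOCAL);

    if (handle == NULL)
        handle = dlopen(soname, RTLD_NOW | RTLD_LOCAL);

    /* NULL means the optional library is unavailable; the caller can
     * simply disable that scan type instead of aborting. */
    return handle;
}
```

A caller would pass whatever the directory scan returned (possibly NULL) plus the plain name, e.g. open_with_fallback(NULL, "libusb-0.1.so"). On older glibc versions this may need linking with -ldl.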
I initially raised issue #91730 with Nixpkgs, but they suggested I also raise an issue upstream since it does look like nut-scanner is re-implementing a dynamic loader in some sense. Please let me know if there's something I misunderstood. Thanks.
Interesting point, and a good reminder that NUT has to be a very portable solution, so breakage like this is certainly not good.
That said, I believe LD_LIBRARY_PATH and friends were always documented as a hack which fools the system linker and should not be used in production, either - even if it can be seen everywhere ;)
Also, if I read the code quotes in your issue linked above, the problem is with building some antique version of NUT (might be the latest official release, but... quite an aged state of codebase; we're working on it).
Looking at current https://github.com/networkupstools/nut/blame/772c98dec88a3037f19a564b7b7efb96b43e1a72/common/common.c#L693 sources I see that it moved from nut-scanner sources to common sources 4 years ago. Earlier it was where you pointed, as seen at https://github.com/networkupstools/nut/blame/53fa5188077697356f6b81a67e1bdb15ff1efb2e/tools/nut-scanner/nutscan-init.c
I believe the original premise for this routine and array was that, per issue #233, there is a problem loading libraries by their *.so (unversioned) filename, since such links are not part of binary-only packages. Usually code is linked against libs during build; however, for historical practical reasons, manual dlopen() etc. was needed to ensure that we can build and distribute nut-scanner once and run it regardless of someone not having USB or SNMP libs on their system - nut-scanner detects these at run-time. Maybe poorly (versions in filenames that were up-to-date a few years ago would concern me today; at least better defaults could be substituted during a packaging build, or guessed by patterns), but it happened out of necessity.
Remember that NUT supports lots of drivers (and libraries those drivers use) to connect to different sorts of power-related devices, and there is no reason to bloat end-users' systems (sometimes embedded) with connectors to hardware they would never see. While we can package these or those drivers so users of a distro can optionally install them or not, we can't really conveniently package a matrix of builds of nut-scanner that talks as many protocols as it can.
Even in the 5-year-old version at https://github.com/networkupstools/nut/commit/75c6e065574dfea99c2221664e1778252b69e902 I see that it introduced and used the LIBDIR determined at configure time to point at the location of system libraries, as the expected location of packaged dependencies, and only falls back to hardcoded strings (in the current version, also depending on the bitness of the build, to try to avoid 32-vs-64 binary mismatches) if there was nothing in the system library collection.
@lodi : Can you please check if that configure option resolves this for you, by the way? e.g.
./configure --libdir=/run/current-system/sw/lib ...
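The lookup order described above (configure-time LIBDIR first, then built-in fallbacks, matching versioned filenames by prefix) could be sketched roughly as follows. This is an illustration under assumptions, not the actual code in common/common.c: find_lib_in_dirs(), default_dirs, and the default value of LIBDIR here are all hypothetical.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Assumption: stands in for the LIBDIR value recorded by configure. */
#ifndef LIBDIR
#define LIBDIR "/usr/local/lib"
#endif

/* Illustrative search order: configure-time LIBDIR, then fallbacks. */
static const char *const default_dirs[] = {
    LIBDIR, "/usr/lib64", "/lib64", "/usr/lib", "/lib", NULL
};

/* Scan each directory in order for an entry whose name starts with
 * base_name (so "libusb-0.1.so" also matches "libusb-0.1.so.4").
 * On success, write "dir/entry" into out and return 0; return -1 if
 * nothing matched or the result did not fit into out. */
static int find_lib_in_dirs(const char *const dirs[], const char *base_name,
                            char *out, size_t out_len)
{
    size_t base_len;

    assert(dirs != NULL && base_name != NULL && out != NULL);
    base_len = strlen(base_name);

    for (size_t i = 0; dirs[i] != NULL; i++) {
        DIR *dp = opendir(dirs[i]);
        struct dirent *de;

        if (dp == NULL)
            continue; /* missing or unreadable directory: try the next */

        while ((de = readdir(dp)) != NULL) {
            if (strncmp(de->d_name, base_name, base_len) != 0)
                continue;
            int n = snprintf(out, out_len, "%s/%s", dirs[i], de->d_name);
            closedir(dp);
            return (n > 0 && (size_t)n < out_len) ? 0 : -1;
        }
        closedir(dp);
    }
    return -1;
}
```

On a layout like NixOS, none of these directories contain the library, so a scan of this kind returns -1 no matter what the loader itself would be able to resolve.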
I have tried building nut with --libdir=/run/current-system/sw/lib but it fails at build time:
{
  nixpkgs.overlays = [
    (final: prev: {
      nut = prev.nut.overrideAttrs (_: {
        configureFlags = [
          "--with-all"
          "--with-ssl"
          "--without-snmp" # Until we have it ...
          "--without-powerman" # Until we have it ...
          "--without-cgi"
          "--without-hal"
          "--with-systemdsystemunitdir=$(out)/etc/systemd/system"
          "--with-udev-dir=$(out)/etc/udev"
          "--libdir=/run/current-system/sw/lib"
        ];
      });
    })
  ];
}
Making install in clients
make[1]: Entering directory '/build/nut-2.7.4/clients'
make[2]: Entering directory '/build/nut-2.7.4/clients'
/nix/store/wzz4ivm826z2m5m6bhpnim19q6163px0-coreutils-9.0/bin/mkdir -p '/run/current-system/sw/lib'
/nix/store/wzz4ivm826z2m5m6bhpnim19q6163px0-coreutils-9.0/bin/mkdir: cannot create directory '/run': Permission denied
Am I correct in understanding that nut maintainers refuse to support LD_LIBRARY_PATH?
Not so much "refusing"; at least in my case, I just did not get to this issue among the many others solved in recent years. That said, coming from a dominantly Solarish background, I was taught that while LD_LIBRARY_PATH is often a useful hack, it is generally frowned upon as a way to shoot oneself in the foot, and every few years discussions appear about eradicating that mechanism in favor of built-in RPATH etc.
Among NUT spin-offs, there was a bundled product delivering a whole binary tree including libc.so - and while that worked for some target OS releases, in others it exploded (e.g. cp in the installer script refused to run, because system cp wanted a much newer GLIBC_VER than got linked in).
If you however have a whole OS built around this concept and can juggle it safely, I don't see a big problem with that. Likewise in Solarish systems, there are tools to set system resolver paths as part of OS config rather than env vars; I suppose that does a more sanity-checked job too somehow. So at the very least, trying the system-resolved path at the end of stack (or prioritized with a new configure option and ifdef's) would be least-surprise for existing users and packages on other systems and would help you.
IIRC there may be another Achilles' heel here: some code expects certain ABIs (thinking of libneon-2.7) that may be enforced in filenames, and some decade from now those files won't be resolvable and/or the ABIs will drift and segfaults will arise with new library byte layouts. Generic run-time loading of external libraries (vs. pre-linking) exists to avoid requiring an install-world for smaller deployments that only care about e.g. libusb and not other probes, and IMHO this also needs a robust solution invented. Maybe more module layers, so that they fail in the worst case, or are pre-linked and disseminated among NUT packages while keeping the NUT runtime-linking (or pipe-talking) ABI/API stable... considering shared vars like debug level or thread count for throttling.
Regarding the build failure posted above, it seems like you ran make install as a non-root user who could not create /run for the tree under it (and/or the rootfs is read-only)? You can pass make DESTDIR=... install to prepare an alternate proto directory in a place this account may write to.
@jakubgs /run/current-system isn't available in the Nix sandbox (which is why the makefile fails to create directories there).
In general, I'd discourage poking around in that place at runtime. In nixpkgs, we usually fix absolute nix-store paths in RUNPATH, and libraries are loaded with just the library name, without a path, in the dlopen() call.
I'd propose trying to address nixpkgs packaging by patching this downstream in nixpkgs for now.
Well, PRing a number of NUT fallback dlopen calls without a path, instead of outright failing, is also quite an option, if that helps make your packaging simpler and others' attempts more robust :)
@flokli yes, you're right, I was more illustrating the point that it doesn't work like that.
And @jimklimov is also probably right, that the correct solution for nixpkgs is to patch the search_paths list in common/common.c at build time to make these libraries available.
Thinking of reasonable fallbacks, before path-less dlopen, maybe even before what is done now - one more kludge can be to consider the (possibly numerous and duplicate, among both themselves and already checked system paths) LIBDIRs of dependencies involved (where were libusb, libsnmp, libneon... at time of build?).
That way, if your FS layout for builds reflects the run-time layout, however "unconventional" compared to other OSes, the locations that are "correct" for each build system would be checked.
Note: with #1548 I tried to extend the automatic discovery of optional libraries by looking into LD_LIBRARY_PATH{,_32,_64} first. It seems to not break CI integration tests for default case (when the envvar is not set and the usual system paths suffice), so I intend to merge it to master soon.
For nut-scanner in particular, it also falls back to trying lt_dlopen() with the library basename, without the NUT preferred path resolver, for systems that might have other ways to succeed natively.
I suppose the solution with the newly added get_libname_in_pathset() can be extended to add a configure option to set distro-preferred library search paths (NULL by default), allowing packagers to pin down their preference. But keep in mind this concerns ltdl loading of shared objects and would not help resolve "persistently" linked dynamic shared objects if those are in some custom location (that's what LD_LIBRARY_PATH may be useful for).
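For readers unfamiliar with the idea, searching a colon-separated path set such as LD_LIBRARY_PATH might look roughly like the sketch below. This is not the implementation from #1548, and the signature of the real get_libname_in_pathset() may well differ; find_file_in_pathset() is a hypothetical name, and for brevity it checks for an exact filename rather than matching versioned names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: walk a colon-separated list of directories
 * (e.g. the value of LD_LIBRARY_PATH) and return the first readable
 * "dir/file_name" in out. Returns 0 on success, -1 otherwise.
 * Simplification: empty entries are skipped here, whereas the system
 * loader treats them as the current directory. */
static int find_file_in_pathset(const char *pathset, const char *file_name,
                                char *out, size_t out_len)
{
    const char *p = pathset;

    assert(file_name != NULL && out != NULL);

    if (pathset == NULL)
        return -1; /* variable not set: nothing to search */

    while (*p != '\0') {
        size_t dir_len = strcspn(p, ":");

        if (dir_len > 0) {
            int n = snprintf(out, out_len, "%.*s/%s",
                             (int)dir_len, p, file_name);
            if (n > 0 && (size_t)n < out_len && access(out, R_OK) == 0)
                return 0;
        }

        p += dir_len;
        if (*p == ':')
            p++; /* step over the separator */
    }
    return -1;
}
```

A caller could try find_file_in_pathset(getenv("LD_LIBRARY_PATH"), "libusb-0.1.so.4", buf, sizeof buf) first and only then move on to the built-in directory list, so the default case with the variable unset behaves as before.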
@jakubgs @lodi @flokli : if you can check whether that change helps fix your system's builds and run-time in a comfortable fashion, that would be great :)
Gentle ping - did you get around to testing it yet? :)
I'm not using nut-scanner, but I asked in https://github.com/NixOS/nixpkgs/issues/91730#issuecomment-1217545432 (downstream issue) for feedback.
I have no NixOS device attached to my UPS and it's in a remote location. I can possibly try this this weekend or the next one.
Thanks a lot, people! :)
I tried to build the https://github.com/networkupstools/nut/commit/90093d2817fa753df4f7520808c22d533b5a3029 version via Nixpkgs but it failed:
clients/Makefile.am: installing './depcomp'
configure.ac:3322: error: required file 'scripts/augeas/nutupsconf.aug.in' not found
configure.ac:3322: error: required file 'scripts/devd/nut-usb.conf.in' not found
configure.ac:3226: error: required file 'scripts/systemd/nut-common.tmpfiles.in' not found
configure.ac:3322: error: required file 'scripts/udev/nut-usbups.rules.in' not found
parallel-tests: installing './test-driver'
autoreconf: error: automake failed with exit status: 1
Thanks! Did you run ./autogen.sh first? There is an odd problem with autotools: templates which a configure script would process must exist before it is executed - the script itself may not generate them.
Tarballs (via make dist) pre-generate and archive these files to be usable out of the box, but development code has its own workflow and mostly avoids tracking build artifacts in Git (or does so for a layer of sanity-checking that certain changes were intentional).
I see, thanks for explaining, I'll look into it sometime this week.
@jimklimov, I can confirm that fbf3924f6 is working for me now, thanks. I'll work with the NixOS side to get this version packaged.
Super! Thanks for the news!