machinekit-hal RFC: getting rid of multiple rtapi_app binaries, enable alternatives when loading components

Issue by mhaberler Sat Feb 20 18:18:21 2016 Originally opened as https://github.com/machinekit/machinekit/issues/874

[a bit longish but fundamental change - I'd appreciate a look over the shoulder while at it]

this lays out a proposed change to solve two issues at once:

(1) getting rid of the per-flavor rtapi_app_<flavor> binaries (helps simplify the build and distribution, including the cmake build)
(2) enabling loading of components from different directories specified by a path preference (needed for the multicore branch)

(1) is an annoyance due to the way shared libraries are resolved by dlopen() in setuid binaries (rtapi_app needs to be setuid so it can talk to hardware); see man dlopen for the gory details.

The current solution @zultron came up with is:

The component shared libs (.so files) live under rtlib/<flavor>/<compname>.so. To force loading from a particular rtlib/<flavor>/, the per-flavor rtapi_app binary gets an rpath set during the build. This means that once you start rtapi_app_<flavor>, this binary will load shared libs with dlopen() only from its rpath (rtlib/<flavor>/).

While this works great, the downside is that we need multiple rtapi_app binaries, one per flavor, and they really only differ in the rpath attribute. Also, the scripts/realtime code needs to figure out the flavor and start the right binary. Not much of an issue though.

(2) The harder issue is the introduction of a versioned HAL API in the multicore branch. To make a component multicore-safe, it must employ the new HAL pin API and accessor functions; manipulating raw memory locations through pointers, as done in the old API, is impossible to make thread-safe in a sane manner. However, at the pin/signal level the APIs are compatible, so a v1 pin and a v2 pin can be linked to the same signal, and things will work as before.

@ArcEye has adapted comp to icomp to employ the v2 HAL API, and rewritten most of the components to use it, and I have converted some of the C comps to v2 (stepgen, pwmgen, encoder, sim_encoder).

This means we now have two sets of components: the existing set of legacy components, and all the rewritten v2 components. This raises the question of how these are distinguished and loaded. As per the rpath scheme above, all components need to reside in the same trusted directory. What @ArcEye and I have done so far is to give these comps different basenames; for example, there exist an rtlib/posix/xor.so and an rtlib/posix/xorv2.so component.

This works, but means that ALL halcmd/Python HAL configs which should be converted to v2 need to be rewritten as the comp name changes. It also makes regression testing v2 against v1 comps harder, as this is a somewhat ad hoc name change.

A better solution would be to provide a preference path of directories to load components from, like for example rtlib/posix/v2:rtlib/posix, and have the comp loading code in rtapi_app do the right thing depending on this path preference. This avoids wholesale rewrite of HAL configs as the preference path could be passed by some other means (halcmd primitive, API parameter, environment variable etc).
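A minimal sketch of such a preference-path lookup, in Python for brevity (the real loader in rtapi_app is C). It assumes a colon-separated path as in the example above; the function name `resolve_component` and its signature are hypothetical and not part of any existing API:

```python
import os
import tempfile


def resolve_component(name, preference_path):
    """Return the first existing <dir>/<name>.so along a colon-separated path.

    Hypothetical helper: directories are tried in order, so listing
    rtlib/posix/v2 before rtlib/posix prefers the v2 build of a component.
    """
    for directory in preference_path.split(":"):
        candidate = os.path.join(directory, name + ".so")
        if os.path.isfile(candidate):
            return candidate
    return None


# Demo against a throwaway directory tree standing in for rtlib/posix.
with tempfile.TemporaryDirectory() as base:
    v1_dir = os.path.join(base, "rtlib", "posix")
    v2_dir = os.path.join(v1_dir, "v2")
    os.makedirs(v2_dir)
    for path in (os.path.join(v1_dir, "xor.so"),
                 os.path.join(v2_dir, "xor.so"),
                 os.path.join(v1_dir, "and2.so")):
        open(path, "w").close()

    pref = v2_dir + ":" + v1_dir
    found_xor = resolve_component("xor", pref)      # present in both: v2 wins
    found_and2 = resolve_component("and2", pref)    # only v1 exists: fallback
    found_none = resolve_component("nosuch", pref)  # not present anywhere
    xor_is_v2 = found_xor == os.path.join(v2_dir, "xor.so")
    and2_is_v1 = found_and2 == os.path.join(v1_dir, "and2.so")
```

Reversing the order of the two directories in `pref` would give "v1 over v2" without touching any HAL config, which is the point of keeping the comp name unchanged.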

However, this is at odds with the security intentions of the rpath feature: directories of an rpath on a setuid binary are trusted. And the rpath resolution ONLY works if there is no slash in the first argument to dlopen().

To cite the manpage:

If filename contains a slash ("/"), then it is interpreted as a (relative or absolute) pathname. Otherwise, the dynamic linker searches for the library as follows (see ld.so(8) for further details): ... details of loading, including rpath feature..

So as soon as a slash is found in the dlopen() argument, the rpath feature is essentially disabled. This is a problem if we want to distinguish shared libraries, aka versioned components, by pathnames like v2/comp.so etc.

what does work - and that could be the start towards a solution to both issues - is:

- provide a way for rtapi_app to learn the path to rtlib/<flavor> (e.g. from config values like EMC2_RTLIB_BASE_DIR, which points to rtlib/); since rtapi_app knows the current flavor, it can construct the path to rtlib/<flavor> from EMC2_RTLIB_BASE_DIR
- before loading a shared library, do a chdir(rtlib/<flavor>)
- ALWAYS use a relative path in the dlopen argument, for instance dlopen(./xor.so) or dlopen(./v2/xor.so)
- after loading, pop back to the original working directory (so that, among other things, an accidental core file is dropped in the right directory)

the gist of the idea is - the preference path is just a subdir under a trusted directory, and hence trusted as well.
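The chdir / relative dlopen / chdir-back sequence can be sketched as follows. This is an illustration in Python, not the rtapi_app code: `working_directory` and `load_component` are hypothetical names, `ctypes.CDLL` stands in for dlopen(), and the rejection of absolute paths and ".." is an added assumption that the proposal itself does not specify.

```python
import contextlib
import ctypes
import os
import tempfile


@contextlib.contextmanager
def working_directory(path):
    """chdir into path, and always restore the previous directory."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


def load_component(trusted_dir, relative_name, loader=ctypes.CDLL):
    """Load ./<relative_name> from inside trusted_dir.

    `loader` defaults to ctypes.CDLL (a dlopen() wrapper); it is a parameter
    only so the control flow can be exercised without a real .so file.
    """
    # Assumption: refuse names that could leave the trusted directory.
    if os.path.isabs(relative_name) or ".." in relative_name.split("/"):
        raise ValueError("component path must stay below the trusted directory")
    with working_directory(trusted_dir):
        # The leading "./" puts a slash in the name, so the dynamic linker
        # treats it as a pathname instead of searching rpath and friends.
        return loader("./" + relative_name)


# Demo with a stand-in loader that records what a real dlopen() would see.
calls = []


def fake_loader(path):
    calls.append((os.path.realpath(os.getcwd()), path))
    return path


start_dir = os.getcwd()
with tempfile.TemporaryDirectory() as flavor_dir:
    load_component(flavor_dir, "v2/xor.so", loader=fake_loader)
    loaded_from_trusted = calls[0][0] == os.path.realpath(flavor_dir)
restored = os.getcwd() == start_dir
```

The `finally` clause restores the working directory even when loading fails, which matters for the core-file concern mentioned above.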

Now if we also set the rpath to EMC2_RTLIB_BASE_DIR rather than to rtlib/<flavor>, then we also do not need separate binaries because there will be only one rtapi_app with an rpath which applies to all flavors.

From what I've tried this works. I would appreciate feedback on the idea before I do this in full.

[edit: given the way dlopen() works, it seems this scheme bypasses the rpath functionality altogether, since rpath does not apply when dlopen() loads from a relative path. This suggests the rpath to EMC2_RTLIB_BASE_DIR can be dropped, as long as EMC2_RTLIB_BASE_DIR is available in some way to chdir to.]

Aug 03 '18 15:08 ArcEye

Comment by ArcEye Sun Feb 21 11:21:06 2016

I think you are quite right, but I would actually go further.

The introduction of the multiple kernel builds, build flavour etc. was an innovation to LinuxCNC (one it ultimately was not ready for :disappointed:) as part of the support for other rt kernels and platforms. It then defined the difference between LinuxCNC and Machinekit in terms of platform support.

Now that we are splitting the project into discrete, albeit interconnected blocks, a lot of other things have changed and it might be time to consider what we actually need.

rtai has gone, and next to no one actually uses posix as a platform. There is simply no need: a simulation will run on an rt kernel too, there is no latency issue with a simulation, and rt-preempt will run on just about everything.

This leaves rt-preempt and xenomai, in the future possibly just rt-preempt.

I would greatly favour a build system where either the running kernel is detected and built for, or it is specified at configuration. No posix builds because they are irrelevant and if you want it to run on a kernel, you build whilst running that kernel. The build is thereafter marked as that flavour and out of tree builds etc can be managed simply by reference to that tag.

That immediately simplifies a lot of the rpath stuff because there are no flavours, just what you built for.
If a default choice of component, with a legacy fallback, can be managed as you describe, that would be ideal.

The alternative of having different configurations etc. is a non-starter for users. It needs to be backward compatible and invisible, in the same way that we made loadrt function in apparently the same way with instantiated components as it had previously with rt ones.

I look forward to getting the multicore work into HAL, and this split is probably the best time to do it. We have a clean sheet to some extent and don't need to be shackled by previous norms

Comment by mhaberler Sun Feb 21 21:03:44 2016

I see your convenience angle

however I view the target environment we're optimizing for a bit differently

First, I think we need to optimize for the common case, which are package installs. We - developers - might run from a source build, but that is a minority scenario.

the promise of the unified build was (and is): install package, boot any realtime (or not) kernel, application will run; I would not give that up easily as it is a major distinguishing factor

I don't agree posix is useless. It might be for machining but not for other automation tasks - and the machining-only segment of OSS is very small - just look at the number of active developers. So in terms of positioning machinekit, we should aim for wide applicability, rather than narrowing down to cnc machining only.

but maybe it's just a misunderstanding. Is this just about selecting a default flavor automatically? then maybe deriving a flavor list for the build is a matter of building and running the flavor binary which selects the default flavor, and then use that to build a flavor?

Comment by ArcEye Mon Feb 22 08:50:24 2016

I was playing devil's advocate to an extent, but there is a serious side to it.

I don't think that if I build on a rt-preempt kernel, that I should get a posix build too. It takes up disk space, wastes time and is of no use to me.

Either the target should be selected, in which case multiple targets are possible, or the build should automatically detect the running kernel and build just for that. Probably the former.

Whatever method is chosen to indicate the correct path to the [flavour]/modules probably needs to be exported as an env variable too. That would simplify post-build stuff such as out-of-tree builds of components, since it would then be easy to select the correct set of CFLAGS.

Comment by zultron Mon Feb 22 14:09:39 2016

IMO modules should not be separated by flavor. In the past, I confirmed with cmp that e.g. rt-preempt/abs.so and xenomai/abs.so are identical binaries, despite being built separately. If that's true, it would be simple to rename rtapi.so module to rtapi_xenomai.so and rtapi_rt_preempt.so and load them from the same directory after detecting what realtime capabilities the environment provides.

Also, the RT_PREEMPT rtapi flavor should, IMO, be merged with the POSIX flavor. Already they're built from the same sources but with different gcc -DTHREAD_FLAVOR_ID=[01] args that control #ifdef sections. Instead, the RT_PREEMPT capabilities detected at run-time should be used to replace #ifdef with plain old if. For safety, rtapi_app should refuse to run non-RT threads without e.g. an explicit command-line switch permitting them.
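A rough sketch of what run-time detection plus a plain `if` could look like, in Python for illustration only. The detection heuristics are assumptions (that a PREEMPT_RT kernel exposes /sys/kernel/realtime containing "1", or mentions "PREEMPT RT" / "PREEMPT_RT" in its uname version string), and both function names are hypothetical:

```python
import platform


def kernel_is_rt_preempt(version_string=None, sysfs_flag="/sys/kernel/realtime"):
    """Best-effort run-time check for a PREEMPT_RT kernel.

    Assumption: PREEMPT_RT kernels expose /sys/kernel/realtime containing "1"
    and/or mention "PREEMPT RT" / "PREEMPT_RT" in the uname version string.
    """
    try:
        with open(sysfs_flag) as flag:
            if flag.read().strip() == "1":
                return True
    except OSError:
        pass  # flag file absent or unreadable: fall back to the version string
    if version_string is None:
        version_string = platform.version()
    return "PREEMPT RT" in version_string or "PREEMPT_RT" in version_string


def choose_thread_policy(is_rt, allow_nonrt=False):
    """Plain run-time `if` in place of a build-time #ifdef on the flavor."""
    if is_rt:
        return "rt"
    if allow_nonrt:
        return "nonrt"
    raise RuntimeError("no realtime capabilities detected; "
                       "pass an explicit switch to permit non-RT threads")
```

With this shape, one binary serves both cases, and the "refuse unless explicitly permitted" safety rule becomes the `allow_nonrt` argument.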

As for the v2 and v1 comps question, I'm afraid I don't understand exactly how they are different. Why is the v1 comp kept around after it is ported to v2? Is it a user config issue, or something else? I have ideas to solve some possible scenarios, but I shouldn't speculate.

Comment by ArcEye Mon Feb 22 15:07:40 2016

Hi @zultron

"As for the v2 and v1 comps question, I'm afraid I don't understand exactly how they are different"

The components are very different internally. The v2 components use pin_ptr's instead of pins, and values are set and read using accessor functions instead of dereferenced pointers based upon an offset from a fixed memory address (the existing single-core x86 model). All of this is so that 64-bit atomic functions can be used, making the data safe across different threads.

This instantly creates a backward compatibility problem with some systems and 3rd party programs which have no knowledge of the new structures, or may be i386, so I expect that is the driver for having both available in the short term.

@mhaberler will explain better than me.

I like your thoughts on flavours. That could radically simplify things.

It was always my understanding that there was not a huge deal of basic difference between posix and rt-preempt, save the different kernel scheduling capabilities, high resolution timer etc. Perhaps I was right for once ! :wink:

Comment by zultron Mon Feb 22 21:20:05 2016

@ArcEye, that explanation helps a lot. You mention two problems, if I'm correct:

- there was an API change, and keeping v1 comps around avoids breaking legacy software; and
- there is possibly an issue with i386 (or possibly 32-bit architectures, incl. armv7?)

It sounds like both these problems can be addressed at build time. The second one is easy: if v2 comps don't work on i386, don't build v2 comps.

For folks that need legacy v1 comps, I would do it this way:

- Put all v2 comps into rtlib, e.g. rtlib/xor.so
- Also put all v1 comps with no v2 version into rtlib, e.g. rtlib/motion.so (IIRC motion is a v1 comp)
- Put all v1 comps with v2 versions into rtlib_legacy, e.g. rtlib_legacy/xor.so.

Now you have the rtlib directory with the newest versions of each comp, and the rtlib_legacy directory with compatibility versions. Most users will need only the modules in rtlib, and the unified rtapi_app will always be linked with the rtlib rpath. By default, the legacy comps won't be built or installed, and legacy comps will be moved to a separate package.

Now there are two options:

- Link rtapi_app with two rpath directories: rtlib_legacy first, and rtlib second. By default, don't build or install the comps in rtlib_legacy (for dev environments). Then, rtapi_app will first look in the legacy directory, but never load legacy comps because it's empty. If the build system is told to build and install legacy comps, or the legacy comp package is installed, the legacy directory will then be populated and rtapi_app will pick up those modules.
- Link rtapi_app with one rpath only, the rtlib directory. Even if the legacy directory contains modules, they will be ignored by default. For legacy software, a special rtlib/legacy_comp_loader.so module that does the dlopen() will be built and linked with both rpath directories. After loading this module from a .hal file, the default rtapi_app module loading functions will be overridden with those from this module, and future comp loading will search both rpath directories.

Comment by mhaberler Tue Feb 23 06:38:00 2016

@ArcEye - what exactly is it that you cannot have by running ./configure --with-rt-preempt? IMO that takes care of your concern entirely.

Comment by ArcEye Tue Feb 23 09:19:39 2016

I haven't explained myself very well.

This does not really concern the main issue here, but rather one that impacts on it and leads to the existence of flavours where, IMHO, there should not be any.

Looks like between you and @zultron, you probably have the solution to the rpath issue.

My contention is this.

99% of all posix builds are not needed or wanted, they are just a consequence of building on a machine which had a base OS installed before the MK stuff, using default ./configure settings.

You could remove the stock Debian kernel say, but it would probably be unwise for partition recovery purposes.

./configure with no switches should build only for the running kernel, be it rt or posix

--with-[kernel-flavour] works as per existing, including --with-posix

An additional --with-all-kernels switch could be used to autodetect installed kernels and build for all of them, as per the current default.
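The proposed defaults can be summarised in a small decision function. This is only a sketch of the selection logic, not of autoconf code: `select_flavors` is a hypothetical name, flavor detection itself is left out, and giving --with-all-kernels precedence over explicit --with-<flavour> switches is an assumption.

```python
def select_flavors(requested, running, installed, all_kernels=False):
    """Decide which flavors to build.

    requested:   flavors named explicitly via --with-<flavour> switches
    running:     flavor matching the currently running kernel
    installed:   flavors matching all kernels found on the system
    all_kernels: True if the proposed --with-all-kernels switch was given
    """
    if all_kernels:
        # Proposed opt-in switch: build for every installed kernel.
        return sorted(set(installed))
    if requested:
        # Explicit switches work as they do today, including --with-posix.
        return sorted(set(requested))
    # No switches: build only for the running kernel, be it rt or posix.
    return [running]


default_choice = select_flavors([], "rt-preempt", ["posix", "rt-preempt"])
explicit_choice = select_flavors(["posix"], "rt-preempt", ["posix", "rt-preempt"])
everything = select_flavors([], "rt-preempt", ["posix", "rt-preempt"],
                            all_kernels=True)
```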

I will look at a patch of autoconfigure.in later, I may use it myself even if it is not adopted. :wink:

Comment by mhaberler Tue Feb 23 09:29:11 2016

@zultron yes, binaries under flavor1...flavorN are incidentally identical, but that misses the point: this is about separation of concerns and migration.

the first guy to insert #ifdef flavor == foo into a component or driver is going to blow up this scheme, because it simply does not work any more and brings a new class of bugs with it - for what, a total gain of 5MB per flavor? My question would be: what are we optimizing here? Build time? Disk space? Or minimizing the impact of changes by localizing them?

Then you propose to optimize the per-flavor directories away, just to turn around and reintroduce them under a different name as a special-case mechanism like a 'legacy' directory which is exactly that - a selection scheme which could have been had with the path mechanism I propose to start with? I do not understand the rationale.

We are mixing issues here. This is about migration, not some perceived build-time optimization which has zero upside for users, besides injecting dependencies which we have already successfully avoided. Can we focus on the issue at hand? Frankly, per-developer build convenience or 10MB less per build is not a good enough reason to shoot oneself in the foot, and it takes a back seat to user convenience and minimized migration effort.

"there is possibly an issue with i386 (or possibly 32-bit architectures, incl. armv7?)": I do not see which that could be - please detail

the whole point is about migration and not rocking the boat with a big bang - which the current method would force:

- the new API introduces objects which are differently created and tagged internally but are interoperable at the pin/signal level
- all the existing v1 comps MUST stay around - it would be foolish to force use of v2 comps with a big bang onto everybody with NO way to step back, not even piecemeal, to isolate any errors
- the name-based separation we currently employ is not a good idea because it forces a rewrite of hal configs just to switch API versions of a component, as it forces a comp name change when only a different version is required
- a smart path selection mechanism just happens to fix two problems at once - three identical binaries which only differ in the rpath, and configurable preference loading. No need for per-flavor rpath if the whole scheme relies on loading relative to the current directory.
- being able to load a component as v1 and v2 incarnations in parallel will be key for regression testing. We cannot currently do that without a gross haque - fiddling the component name by hiding the version in the comp name. So any fixed version preference scheme is counterproductive.

the way I envisage things unfolding:

- the default preference would be v2 components over v1, overridable globally
- if there is any perceived issue you would switch to 'v1 over v2' and see if the issue persists
- if it goes away you narrow down by switching individual component loads to v2 until the issue reappears - culprit identified.
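The narrowing-down procedure above can be sketched as a simple loop. The names here are hypothetical, `issue_present` stands in for actually running the configuration with a given per-component version selection, and the sketch assumes a single culprit and that every listed component exists in both versions:

```python
def find_culprit(components, issue_present):
    """Switch components from v1 to v2 one at a time until the issue returns.

    components:    component names used by the configuration
    issue_present: callable taking {name: "v1" | "v2"} and returning True
                   if the problem shows up with that selection
    """
    versions = {name: "v1" for name in components}
    if issue_present(versions):
        return None  # issue persists with 'v1 over v2': not a v2 regression
    for name in components:
        versions[name] = "v2"
        if issue_present(dict(versions)):
            return name  # issue reappeared: culprit identified
    return None


# Demo: pretend the v2 build of stepgen is the broken one.
def broken_when_stepgen_is_v2(selection):
    return selection["stepgen"] == "v2"


culprit = find_culprit(["xor", "stepgen", "encoder"], broken_when_stepgen_is_v2)
```

This only works if the version of each component load can be chosen independently of its name, which is what the preference mechanism is meant to provide.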

The only issue I currently see in implementing this (which I partially already have) is disentangling HAL component name, base name of the component shared object, and disambiguating a component which is present both in v1 and v2 (really any number of styles) incarnation. But that is a legacy issue which should have been fixed independently of the question at hand. It might require tagging comp shared objects with the HAL comp name, through the ELF-section based rtapi_tag mechanism we already have.

That is all.

Aug 03 '18 15:08 ArcEye

Although I believe all out-of-the-box comps should build into a single directory, there may be a use case for a module directory search path when building comps out-of-tree from other projects. Two real-world examples I'm familiar with are:

- LinuxCNC EMC (built against MK HAL): Makefile builds the motmod.so and *kins.so comps into ../rtlib/ for RIP builds
- hal_ros_control: catkin builds the hal_hw_interface.so comp into the devel/lib/ directory (for devel builds running from the source tree) and /opt/ros/$ROS_DISTRO/lib/ directory (for system installs)

In the most general use case, a non-root user builds one or more out-of-tree comps. When building against a system-installed HAL (typically from packages), $(RTLIBDIR) in Makefile.modinc will be set to /usr/lib/linuxcnc/modules/. It's not possible to build a comp directly into $(RTLIBDIR) as a non-root user, and in a development environment it's not desirable anyway to install files outside the build directory. Fortunately, MK-HAL can loadrt a component with an absolute path (the approach we use in hal_ros_control). However, this is clumsy (esp. with many out-of-tree comps), and the LinuxCNC EMC example shows this will break configurations where out-of-tree comps are specified with EMCMOT = motmod and KINEMATICS = trivkins in the .ini file and pulled into the .hal file with e.g. loadrt [EMCMOT]EMCMOT.

In use cases where HAL configuration is in python (like hal_ros_control) or when the HAL configuration is loaded from external scripts, this is less clumsy to work around, since the install directory can be detected programmatically to munge absolute comp paths so that the out-of-tree module directory can change without affecting HAL configuration files. But this does require special programming for each loadrt or newinst statement, e.g. loadrt ${OOT_MODULE_DIR}/custom_comp. When users are expected to write their own HAL configurations (as we see in LinuxCNC), this externally visible ugliness could increase support burden for questions like, "Which comps do I need to add ${OOT_MODULE_DIR}/ and which do I not?"

System installs of out-of-tree modules aren't necessarily straightforward either. In most cases, the system install could drop new comps into /usr/lib/linuxcnc/modules/. The only wrinkle I've found here is machinekit-hal doesn't provide something like a machinekit-hal.pc file to help HAL comp build and install in build systems that can't include /usr/share/linuxcnc/Makefile.modinc (but that's for another issue).

I can mainly think of obvious ways to solve the problem, like traversing $HAL_MODULE_PATH and similar. Ideas welcome.

IMO, most of the questions in this issue will be resolved by PR #227. Still outstanding is the question about how to handle v1 and v2 comps. Discussion in #161 makes me think a decision isn't far away, and it's still somewhat possible the decision could involve a second HAL comp directory. I'd like to add this one last question here, since it is related to the possibility of multiple HAL comp directories and could influence the v1 and v2 decision.

Jul 15 '19 19:07 zultron