geoip-shell OpenRC persistence

As discussed on https://github.com/friendly-bits/geoip-shell/issues/14 this patch series detects when busybox init is actually OpenRC init, such as on Alpine Linux. inittab is used to detect when OpenRC is actually the init system, not just when it's installed.

Install starts the OpenRC local.d service as a replacement for busybox crond not supporting @reboot and creates a startup persistence script. Uninstall removes the persistence script but does not stop local.d in case the user is using it for something else.

Tested okay on Alpine Linux 3.20.

Documentation adds the OpenRC change, and provides manual steps for other init systems where persistence isn't automated yet.

Aug 16 '24 11:08 superjamie

Thank you for submitting the PR. Before reviewing the proposed changes:

Currently geoip-shell implements persistence via an init script on OpenWrt as an exclusion to the general rule of implementation via cron. Adding support for another init system, and potentially further support for a third one (systemd), would add too many exclusions and would make the code too messy. For this reason I will be implementing a generic way of handling init scripts installation, checks and uninstallation. I can do this either after accepting this PR (then some of the contributed code will have to be changed by me), or before that (then the PR will need to be adapted). What is your preference?
Ideally, the call to geoip-shell-run.sh should be initiated not just on boot but on iptables/nftables restart/reload events as well. I'm not sure if this can be implemented via OpenRC. I started reading up on it but no conclusion yet. Maybe you happen to have an idea? The reason is, if another service decides to restart the firewall backend, geoip-shell rules may get removed, so implementing an automatic recovery for this scenario would be ideal.
Ideally, the call to geoip-shell-run.sh should be initiated when the network is online. geoip-shell has configurable backup feature. When enabled, calling geoip-shell-run.sh restore will restore rules and ipsets from the backup. When disabled, or when restore from backup fails for some reason, a fallback is implemented which will re-fetch the ip lists and then apply them. Obviously this fallback will fail if there is no network. So best if there is a way to detect the 'network online' event and call geoip-shell-run.sh after some delay (say 20 seconds), or if that's difficult then set an arbitrary delay from boot (say 30 seconds).

Aug 16 '24 12:08 friendly-bits

I am fine if you'd like to pull this in now and rewrite init detection at your leisure. That seems easier for us both.
To my knowledge, OpenRC doesn't offer this feature. The only init system I know which can do this is systemd with its service unit overrides in /etc and the ExecStartPost= parameter. Needing to manually restart other things after restarting the firewall is a pretty classic problem, libvirtd and Docker still suffer from this. With OpenRC this should just be a matter of service local restart which will run the restore script again.
I did consider this point and I think it's handled. OpenRC has different ordered runlevels boot and default. OpenRC service scripts can specify their own soft/hard dependency order too. The local.d service is added to default which happens after the networking service has been started, so afaik when the geoip-shell restore script runs via local.d, the system should have all network interfaces up.

Aug 16 '24 14:08 superjamie

OpenRC runlevels appear to be configurable, so a user could define their own runlevels and use them however they like, and/or set a different runlevel as the default, rather than the one named 'default'.

I may be overly pedantic but I think it's a good idea to not assume that the runlevel named 'default' is in fact the default on a given system. Looks like the relevant config file is /etc/rc.conf. This file in my live Alpine system has this line: #rc_default_runlevel="default" So I believe that this is where the default runlevel may be changed.

Something like this should detect if the system is using a different runlevel as the default, rather than the 'default' runlevel:

detect_openrc_def_runlevel() {
	# check for 'rc_default_runlevel' without preceding '#' in /etc/rc.conf
	rc_def_runlevel="$(sed -n '/^[ \t]*rc_default_runlevel[ \t]*=/{s/#.*//;s/^.*=//;s/[ \t]*$//;s/\"//g;p;}' /etc/rc.conf | tail -n1)"
	: "${rc_def_runlevel:=default}"
	set +f # enable globs
	# check if /etc/runlevels has a subdirectory with a matching name
	cd /etc/runlevels && echo */ | grep -E "[ \t]+$rc_def_runlevel/" && { set -f; return 0; }
	set -f # disable globs
	rc_def_runlevel=
	return 1
} 1>/dev/null 2>/dev/null

(only tested on Alpine for now, test on Gentoo pending) (Edit: improved sed expression, added the tail command)

Aug 17 '24 04:08 friendly-bits

Hi @superjamie , are you still interested in this PR?

Aug 25 '24 22:08 friendly-bits

Hey, yes I am.

I must admit that sed expression is a bit beyond me and I'm not comfortable committing something I don't understand. That would not be very responsible of me.

I also would do it differently, what do you think about this?

detect_openrc_def_runlevel() {
    rc_def_runlevel=$(sed -n -e '/^\s*rc_default_runlevel/{s/rc_default_runlevel\s*=\s*//;s/\"//g;p;}' /etc/rc.conf)
    [ ! -z "$rc_def_runlevel" ] && [ -d "/etc/runlevels/${rc_def_runlevel}" ] || rc_def_runlevel="default"
}

If we're this far into the logic then we must have openrc so I don't see the point of failing detection. This function either must find a configured runlevel with directory or defaults to openrc's default.

Alternatively if you prefer your method, I am perfectly happy if you wish to merge this then make that change, or modify this PR to include it.

Aug 26 '24 10:08 superjamie

Actually I'm not sure any of that's necessary at all.

You can just rc-update add local and openrc applies the correct default runlevel.

That seems a better solution?

Aug 26 '24 10:08 superjamie

You can just rc-update add local and openrc applies the correct default runlevel.

This definitely sounds like a good solution, however changing the default runlevel in /etc/rc.conf to sysinit, then running rc-update add local produces the output * rc-update: local already installed in runlevel 'default'; skipping.

So either something doesn't immediately pick up the default runlevel change in /etc/rc.conf, or the command rc-update add local is hardcoded to use default when no argument is specified. Perhaps changing the value in /etc/rc.conf and then rebooting would clarify this but restarting the VM in which my instance is running won't do it because it's a "live CD", i.e. not installed. So the change would not get saved. Perhaps you could test it by cloning the default runlevel to a different name, making it the default in /etc/rc.conf, rebooting and checking the output of rc-update add local?

Aug 26 '24 14:08 friendly-bits

[ ! -z "$rc_def_runlevel" ] && [ -d "/etc/runlevels/${rc_def_runlevel}" ] || rc_def_runlevel="default"

This is much cleaner than my solution, for sure. It only has one caveat: it assumes that the runlevel default is present in the system. Which with OpenRC is not necessarily the case. I'd rather error out if the detected default runlevel is not present (including if it's called default) than return an incorrect value.

This level of pedanticism might be slightly irritating, but I'll just mention that the reason you knew that geoip-shell doesn't support cron-based persistence with Busybox cron is because I specifically checked where potential compatibility problems could be, then coded a check and an error message for this specific case. If I didn't, the installer would report success and you would not have had any way to know that persistence is not working, rather than empirically. So your case essentially justifies the pedantic approach :)

Aug 26 '24 14:08 friendly-bits

I must admit that sed expression is a bit beyond me and I'm not comfortable committing something I don't understand. That would not be very responsible of me.

This is understandable. I can explain what each part of it does if that helps and you'll see that it's much less scary than it looks.

Starting with the complete command: sed -n '/^[ \t]*rc_default_runlevel[ \t]*=/{s/#.*//;s/^.*=//;s/[ \t]*$//;s/\"//g;p;}'

The -n switch in the beginning tells sed to not print lines, unless specifically asked to (via the command p which means print).

Everything between the two / is a BRE regex which I assume you know how to read (if not, I can explain that). Essentially filtering for lines which have the rc_default_runlevel key in them without any preceding non-whitespace, followed by =.

Immediately after the closing / of the regex, there is { which says what to do when a matching line is encountered. The commands before the closing } are executed.

s is a substitute command, in the format s/[regex]/[replacement_string]/. Optional g after the last / means replace all instances rather than just the first one. ; separates commands from each other

There are 4 s commands which do the following:

Remove # and anything following (in case of in-line comments)
Remove anything before and including = - this removes the key.
Remove any trailing whitespaces and tabs
Remove any double-quotes

If the entry was reasonably formatted, this will result in a value for the key we're interested in, which we print with p. The tail command is there to make sure that if there were several matching entries, only the last one is registered.

Aug 26 '24 15:08 friendly-bits

This level of pedanticism might be slightly irritating

Actually, not at all. You have a clear vision for your project and communicate well. I now understand your intention a lot more. It's really appreciated :)

Perhaps you could test it by ... rebooting and checking the output of rc-update add local?

Turns out this doesn't work as I expected. rc-update add local puts the local servce in the current runlevel, not the default runlevel.

Reading the openrc source, that is how it works by design. So that's a bust.

I should have clarified, the BRE was the difficult part. I have always written anything non-trivial in ERE. But that's no problem. Read on...

it assumes that the runlevel default is present in the system.

I understand your wish to fail detection. My idea here is that the system MUST have a default runlevel which we can detect. It wouldn't boot if it didn't have one! It should be possible to write a function which never fails to find the default runlevel.

Our options are:

An uncommented rc_default_runlevel in rc.conf, which we can extract from that file as needed. That's easy.

The default compiled-in runlevel which OpenRC call "default". However, someone could theoertically recompile their OpenRC to have a different default. The binary does not offer a way to print that compiled-in default runlevel.

However, the system boots with inittab, where the default runlevel will be like this (example of a default runlevel changed to "carrot"):

::sysinit:/sbin/openrc sysinit
::sysinit:/sbin/openrc boot
::wait:/sbin/openrc carrot

So we can detect the default runlevel this way.

afaics that gives us the following which always finds the system's actual default OpenRC runlevel without fail:

rc_def_runlevel="$(sed -n -e '/wait:.*openrc/{s/^.*openrc\s\+//;p}' /etc/inittab)"

That would mean a change in the existing PR to:

    printf '\n%s' "Attempting to enable local.d service..."
    rc_def_runlevel="$(sed -n -e '/wait:.*openrc/{s/^.*openrc\s\+//;p}' /etc/inittab)"
    /sbin/rc-update add local "$rc_def_runlevel"

How do you feel about this way?

Aug 27 '24 10:08 superjamie

Hi @friendly-bits just wanted to follow up in case you missed the last message, no rush :)

Sep 10 '24 00:09 superjamie

Thank you. I'm kinda busy with other stuff rn so couldn't attend to this yet. May need to wait till the end of this week.

Sep 10 '24 06:09 friendly-bits

The /etc/inittab approach has a problem: it relies on that file being used (by another init system, such as Busybox or sysvinit), which may not be the case for all systems with OpenRC. In fact, OpenRC can be run from its own init process: https://wiki.gentoo.org/wiki/OpenRC/openrc-init Which doesn't use /etc/inittab.

Detecting whether inittab is used or not adds more complexity which I'd like to avoid.

So perhaps the simplest way forward would be to rely on the user installing geoip-shell in the same runlevel which they normally run the machine in, and adding this to the init script:

depend() {
    need net
}

https://wiki.gentoo.org/wiki/Handbook:X86/Working/Initscripts#Writing_initscripts

So we could perhaps detect the current runlevel and use it for the init script, write it to the config, and print the name of the runlevel to the console during installation. If the user wants to change the runlevel after installation, they could edit the relevant option in the config file. Optionally we could implement an interactive dialog which would enable the user to change the runlevel without manually editing the config. Probably not worth the effort, since in the vast majority of cases I expect that the user wouldn't need to change the runlevel geoip-shell runs in.

Sep 12 '24 01:09 friendly-bits

OpenRC can be run from its own init process

Nice find. I'll modify my first commit to detect openrc-init from /proc/cmdline too.

So perhaps the simplest way forward would be to rely on the user installing geoip-shell in the same runlevel which they normally run the machine in

I agree that assumption is both the simplest and most reasonable way.

and adding [depend() { need net }] to the init script

That won't work as-is, because the restore functionality is run as a #!/bin/sh script, not a #!/sbin/openrc-run script.

Unless you're suggesting to change from an OpenRC local.d script to making a full OpenRC initscript to perform the restore? I don't think that's necessary. You already have a great script which handles the restore command, it would needlessly complex to write a full initscript just to call the existing restore command.

net is available by the time local.d scripts run, because local has depend() { after * }.

Sep 12 '24 03:09 superjamie

A useful feature of the need directive is that when specified services are restarted, that will also restart geoip-shell. I was actually thinking about adding firewall to the dependencies, which would trigger geoip-shell restart on firewall restart, thus restoring the functionality automatically when the firewall is restarted. Not sure which keyword to use for the need directive (I assume the keyword should be found in the provides directive of firewall-providing init scripts). This will probably require a full-blown init script, but is it so hard to put together?

Sep 12 '24 03:09 friendly-bits

This has become longer than I have the spare time for. A one-line local script is now detecting an init system in a distro I don't use and writing a full initscript for an init system I am not really that familiar with.

If any of this has been useful to you, please feel free to use it. I elevate all I've done here to the public domain under Creative Commons Zero and waive all rights.

Thank you for your script and all the best.

Sep 13 '24 09:09 superjamie

I understand. I will try to implement this feature by myself. Your ideas are appreciated and will definitely be useful.

Sep 13 '24 12:09 friendly-bits