Allow rootless execution of repro
These changes allow to run repro without root privileges. Due to some issues, the resulting builds are not reproducible. This mode can be enabled with the -r flag
repro -r ...
Requirements
- User Namespaces have to be allowed (
sysctl kernel.unprivileged_userns_clone=1) - The user needs to have enough subuid/gid assigned in
/etc/subuidand/etc/subgid. I used the following for both files:progandy:100000:65536 -
become-roothas to be installed. nsjail did deny/proc/self/setgroupswhich caused issues with sudo/su. Maybe it is a kernel-bug thatbecome-rootworks without that setting. I did not understand that section in the user_namespaces manpages completely. It may be possible to get sudo (not su) working with setgroups disabled with thepreserve_groupsoption. -
nsjailis required instead of systemd-nspawn -
fuse-overlayfsis required since overlayfs is not allowed for unprivileged namespaces in the upstream kernel. -
unsharefrom util-linux. I use it to create a mount namespace for the container and as pid1 in the nsjail commands. The mount namespace might not be necessary and for pid1 it may be better to use dumb-init. I don't think it would be a good idea to directly run e.g. pacman as pid1, and I don't know about using bash.
Known Issues
- In the build container, a downgrade of archlinux-keyring seems to fail signing keys, each operation times out, but the pacman command completes successfully.
Installing the root container works fine.( 2/126) downgrading archlinux-keyring [###############################################] 100% ==> Appending keys from archlinux.gpg... ==> Locally signing trusted keys in keyring... -> Locally signing key D8AFDDA07A5B6EDFA7D8CCDAD6D055F927843F1C... ==> ERROR: D8AFDDA07A5B6EDFA7D8CCDAD6D055F927843F1C could not be locally signed.(48/54) upgrading archlinux-keyring [##############] 100% ==> Appending keys from archlinux.gpg... ==> Locally signing trusted keys in keyring... -> Locally signing key D8AFDDA07A5B6EDFA7D8CCDAD6D055F927843F1C... -> Locally signing key DDB867B92AA789C165EEFA799B729B06A680C281... -
fuse-overlayfs silently fails changing the modification time of symlinks.
(https://github.com/containers/fuse-overlayfs/issues/204)
│ │ -lrwxrwxrwx 0 root (0) root (0) 0 2020-04-07 11:35:48.000000 usr/bin/rnano -> nano │ │ +lrwxrwxrwx 0 root (0) root (0) 0 2020-04-24 19:42:54.000000 usr/bin/rnano -> nano
Additional notes
I changed systemd-machine-id-setup to run inside the container, so rootless repro might run without systemd installed.
Nice! Any chance you can submit cefd9a3 separately? It's a lot easier to land and would resolve #69.
Nice! Any chance you can submit cefd9a3 separately? It's a lot easier to land and would resolve #69.
Done: #73
become-root should get one release before we start relying on it. Perhaps you could ask the author.
Maybe I can get away without become-root. I don't want to add too many dependencies. That would mean repro would have to parse /etc/subuid or rely on the user to specify the allowed values.
The symlink issue will be fixed with containers/fuse-overlayfs#205.
The missing .MTREE might be a race condition in overlayfs or bash globs? It seems to work pretty well if I add sync .MTREE ; sleep 0.1 before the call to list_package_files in makepkg.
pacman-key --populate failed since the gpg-agent sockets were not properly cleaned up. and fuse-overlayfs could neither overwrite nor delete those inaccessible sockets. ~~I cannot reproduce it at the moment.~~ Now it is happening again.
I'm a bit skeptical of adding become_root as It's more dependencies which needs to be packaged. It's a bit unclear what functionality we'd need to replace though. Parsing /etc/subuid shouldn't be too hard in of itself.
nsjail always denies setgroups for the container. That means that su won't work and sudo only if it is configured with preserve_gropus. become-root did not have that problem, but I wonder if that is considered a bug du to this and this.
The fuse-overlayfs issue is fixed. Nice. So I'll try take a dig at how nsjail works and how we can try land this :)
Here are some more tests I did. I'm not sure if that is the best way, though. Maybe there should only be one nsjail/nspawn command that in turn runs a custom script instead of calling it for each command separately.
As a sidenote, there is also https://github.com/containers/bubblewrap which tries some of the same things like nsjail.
Also, If you are able to rebase ontop of master it would be easier for me to test things.
It should be rebased already.
~~As far as I know, bubblewrap requires the SUID bit or has that changed?~~ I was trying to avoid requiring administrative as much as possible, but at minimum they are necessary to add the subuid/subgid entries with usermod.
Seems like bubblewarp can operate without SUID: https://github.com/containers/bubblewrap/blob/ff533b84d056f2c22633a84b34323dd085bd977a/bwrap.xml#L57
It's not rebased :)
Ah, rebased onto the wrong master branch. Now it should be ready.
Thanks for the work! I'll mark this as next on my todo and consider patching our fuse-overlayfs package
For testing purposes I have published fuse-overlayfs-1.0.0-2 which adds the symlink patch
Zomg! It works!
I managed to rebuild archlinux-keyring!
There are a few issues which I need to look at. /build in the container is created with uid 166535 instead of the correct one and missing /etc/resolv.conf so nameservers are not available. I have a few changes locally at the moment.
λ devtools-repro rootless» git diff
diff --git a/repro.in b/repro.in
index 759c4a9..d06b437 100755
--- a/repro.in
+++ b/repro.in
@@ -84,7 +84,7 @@ function require_userns_tools() {
function mountoverlay() {
if ((rootless_userns)); then
- ~/Projekte/fuse-overlayfs/fuse-overlayfs "$@"
+ fuse-overlayfs "$@"
else
mount -t overlayfs overlayfs "$@"
fi
@@ -285,6 +285,11 @@ function init_chroot(){
mkdir -p "$BUILDDIRECTORY/root"
tar xvf "$IMGDIRECTORY/$bootstrap_img" -C "$BUILDDIRECTORY/root" --strip-components=1 > /dev/null
+ # Custom rootless_userns configs here
+ if ((rootless_userns)); then
+ cat /etc/resolv.conf > "$BUILDDIRECTORY"/root/etc/resolv.conf
+ fi
+
printf 'Server = %s\n' "$HOSTMIRROR" > "$BUILDDIRECTORY"/root/etc/pacman.d/mirrorlist
printf '%s.UTF-8 UTF-8\n' en_US de_DE > "$BUILDDIRECTORY"/root/etc/locale.gen
printf 'LANG=en_US.UTF-8\n' > "$BUILDDIRECTORY"/root/etc/locale.conf
function build_package(){
## [snip]
mkdir -p "./build"
for pkgfile in "$BUILDDIRECTORY/$build"/pkgdest/*; do
mv "$pkgfile" "./build/"
done
chown -R "$src_owner" "./build"
}
This logic doesn't work for rootless container. The persmission we end up with are from the container it seems
-rw-r--r-- 1 166535 166535 912011 Apr 30 22:45 archlinux-keyring-20200422-1-any.pkg.tar.zst
This is fixed by setting src_owner to "root" after doing the initial nsjail
This logic doesn't work for rootless container. The persmission we end up with are from the container it seems
-rw-r--r-- 1 166535 166535 912011 Apr 30 22:45 archlinux-keyring-20200422-1-any.pkg.tar.zst
While inside the user-namespace, it should be chowned to root, which will be mapped to the id of the user that created the namespace. I guess --keep-env in the first nsjail command causes problems. Either remove that and only pass necessary environment variables along or explicitly set $USER to root or something like that.
Current todo for this draft
- [ ] Rebase against latest master still missing (also reindent from tab to 4 spaces)
- [ ] Need to figure out what to do with
/etc/resolv.conf
Any other things from your side which needs testing and/or some improvements? If the restructuring takes too much time or is complicated I don't think it's needed for this run.