
Add --disable-userns switch

Open alexlarsson opened this issue 4 years ago • 5 comments

Some use cases of bubblewrap want to ensure that the subprocess can't further rearrange the filesystem namespace or make other, more complex namespace modifications. This can be limited with --disable-userns, which makes the kernel unable to create any new user namespaces for the process hierarchy.

This is done by making a cover of the original root, but running the process with the original root as its root anyway. This "non-standard" root means the kernel will not allow creating new user namespaces.

This is more typically done using chroot("/theroot"), which would also mean that the root of the namespace ("/") differs from the process's current root ("/theroot"). However, we want to avoid this because symlinks in /proc/$pid/fd would then have a "/theroot" prefix when seen from outside the namespace, which is something that e.g. flatpak doesn't want.
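For anyone who wants to check the /proc/$pid/fd symlinks mentioned above, here is a minimal Python sketch that lists where each fd of a process points. The helper name `fd_targets` and its `proc_root` parameter are made up for illustration; nothing here is part of bubblewrap or flatpak.

```python
import os


def fd_targets(pid, proc_root="/proc"):
    """Map each open fd number of `pid` to the path its /proc symlink points at.

    `proc_root` is a hypothetical parameter so the helper can also be pointed
    at a different procfs mount (or a fake tree when experimenting).
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The fd was closed between listdir() and readlink(); skip it.
            continue
    return targets
```

Calling `fd_targets(pid)` from outside the sandbox for a process inside it should show whether the targets carry an unwanted prefix such as "/theroot".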

Note that there is a slight cost to this, as the covering bind mount duplicates all the regular mounts in the namespace. However, they all refer to the same mounts, so no actual files are duplicated.

alexlarsson avatar Oct 08 '21 12:10 alexlarsson

This was initially discussed in https://github.com/flatpak/flatpak/security/advisories/GHSA-67h7-w3jq-vh4q#advisory-comment-68447

alexlarsson avatar Oct 08 '21 12:10 alexlarsson

Sorry, I don't have the necessary knowledge of kernel subtleties to review this.

smcv avatar Oct 08 '21 20:10 smcv

Wouldn't it be more straightforward to use the max_user_namespaces sysctl?

If the user namespace is a child of the initial user namespace, you could of course bump the max_user_namespaces value from 0 back up if you are privileged to write to that sysctl.

But you can create an intermediary user namespace, set the limit to 1 there, and then create the user namespace for the actual program to run in. Inside this inner user namespace it is still possible to set the sysctl to some large number, but the kernel still enforces any stricter maximum set in a parent namespace.

[lukas@PC      ~]$ sysctl user.max_user_namespaces
user.max_user_namespaces = 128026
[lukas@PC      ~]$ unshare -Ur bash
[root@PC      ~]# sysctl user.max_user_namespaces
user.max_user_namespaces = 2147483647
[root@PC      ~]# sysctl -w user.max_user_namespaces=1
user.max_user_namespaces = 1
[root@PC      ~]# unshare -Ur bash
[root@PC      ~]# sysctl user.max_user_namespaces
user.max_user_namespaces = 2147483647
[root@PC      ~]# unshare -Ur bash
unshare: unshare failed: No space left on device
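The failure in the transcript above can be reproduced with a toy model in Python. This is only a sketch of the behaviour as observed, not the kernel's actual code: the `UserNamespace` class and its fields are hypothetical, and per-user accounting is ignored. The assumption is that creating a namespace is charged against every level from the creating namespace up to the initial one, and fails if any level is already at its limit.

```python
import errno

# Default limit reported inside a freshly created user namespace in the transcript.
INT_MAX = 2**31 - 1


class UserNamespace:
    """Toy stand-in for a user namespace with its own max_user_namespaces."""

    def __init__(self, parent=None, max_user_namespaces=INT_MAX):
        self.parent = parent
        self.max_user_namespaces = max_user_namespaces
        self.count = 0  # namespaces currently charged to this level

    def _self_and_ancestors(self):
        ns = self
        while ns is not None:
            yield ns
            ns = ns.parent

    def unshare(self):
        """Create a child namespace, honouring the limit at every level above."""
        chain = list(self._self_and_ancestors())
        for ns in chain:
            if ns.count >= ns.max_user_namespaces:
                raise OSError(errno.ENOSPC, "No space left on device")
        for ns in chain:
            ns.count += 1
        return UserNamespace(parent=self)


# Replay of the transcript: limit of 1 set in the intermediary namespace.
init_ns = UserNamespace(max_user_namespaces=128026)
intermediary = init_ns.unshare()
intermediary.max_user_namespaces = 1
sandbox = intermediary.unshare()      # allowed: uses up the intermediary's single slot
sandbox.max_user_namespaces = 10**6   # raising the inner limit does not help
try:
    sandbox.unshare()
except OSError as exc:
    print("unshare failed:", exc.strerror)
```

In this model the inner namespace can set its own limit as high as it likes, but the intermediary's limit of 1 is already used up, so the last call fails the same way the third `unshare -Ur bash` does.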

lukts30 avatar Feb 18 '22 15:02 lukts30

> Wouldn't it be more straightforward to use the max_user_namespaces sysctl?

This looks like a simpler way to achieve the same thing. I might try implementing it if you don't get there first.

smcv avatar Mar 22 '22 15:03 smcv

> > Wouldn't it be more straightforward to use the max_user_namespaces sysctl?
>
> This looks like a simpler way to achieve the same thing.

#488 reimplements this feature with that approach.

smcv avatar Mar 22 '22 17:03 smcv

Closing this in favour of https://github.com/containers/bubblewrap/pull/488

alexlarsson avatar Sep 06 '22 07:09 alexlarsson