
Cannot read mounts in rootless Podman

Open apyrgio opened this issue 3 years ago • 17 comments

Description

I'm trying to run gVisor within rootless Podman. The aim is to use gVisor for its syscall interception capabilities. However, I get the following error:

$ runsc -debug -network none -rootless do echo ok
Error reading mounts file: error unmarshaling mounts: unexpected end of JSON input
JSON bytes:

creating container: cannot create sandbox: cannot read client sync file: waiting for sandbox to start: EOF

In a regular Docker (not rootless) container, this command works:

$ runsc -debug -network none -rootless do echo ok
ok

Steps to reproduce

  1. Install Podman on your system.
  2. Start a shell in an Alpine container with podman run -it --rm alpine:edge ash
  3. Install runsc based on the installation instructions
  4. Run runsc -debug -network none -rootless do echo ok

runsc version

runsc version release-20221107.0
spec: 1.0.2-dev

docker version (if using docker)

$ podman --version
podman version 4.3.1

uname

Linux 6.0.8-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 10 Nov 2022 21:14:24 +0000 x86_64 GNU/Linux

kubectl (if using Kubernetes)

No response

repo state (if built from source)

No response

runsc debug logs (if available)

Error reading mounts file: error unmarshaling mounts: unexpected end of JSON input
JSON bytes:

apyrgio avatar Nov 16 '22 10:11 apyrgio
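Editorial sketch, not part of the original report: the two environments compared above most likely differ in their user namespace. Rootless Podman runs the container inside a nested user namespace, while a regular rootful Docker container (without userns remapping) stays in the initial one. Assuming that is the relevant difference, the helper below tells the two apart by reading a uid_map file. The function name in_initial_userns is hypothetical.

```shell
# Report whether a uid_map file describes the initial user namespace.
# $1: path to a uid_map file (defaults to /proc/self/uid_map).
# Returns 0 for the initial namespace, non-zero otherwise.
in_initial_userns() {
    map_file="${1:-/proc/self/uid_map}"
    # The initial user namespace maps the full ID range onto itself,
    # which shows up as the single line "0 0 4294967295".
    awk 'NR == 1 && $1 == 0 && $2 == 0 && $3 == 4294967295 { found = 1 }
         END { exit !found }' "$map_file"
}
```

Inside the container, `in_initial_userns && echo initial || echo nested` prints which case applies. A "nested" result only indicates that privileged operations such as mounting procfs may be restricted; it does not by itself prove the mount will fail.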

I used the -debug-log switch, and I'm now seeing more logs. The following logs from the gofer process provide more info:

I1116 12:05:48.419039       1 main.go:216] ***************************
I1116 12:05:48.419097       1 main.go:217] Args: [runsc-gofer --root=/tmp/runsc-do237025772 --debug=true --debug-log=log/ --overlay=true --network=none --strace=true --rootless=true --debug-log-fd=3 gofer --bundle /tmp/runsc-do237025772 --spec-fd=4 --mounts-fd=5 --io-fds=6]
I1116 12:05:48.419124       1 main.go:218] Version release-20221107.0
I1116 12:05:48.419139       1 main.go:219] GOOS: linux
I1116 12:05:48.419153       1 main.go:220] GOARCH: amd64
I1116 12:05:48.419168       1 main.go:221] PID: 1
I1116 12:05:48.419183       1 main.go:222] UID: 0, GID: 0
I1116 12:05:48.419198       1 main.go:223] Configuration:
I1116 12:05:48.419214       1 main.go:224]              RootDir: /tmp/runsc-do237025772
I1116 12:05:48.419231       1 main.go:225]              Platform: ptrace
I1116 12:05:48.419246       1 main.go:226]              FileAccess: exclusive, overlay: true
I1116 12:05:48.419262       1 main.go:227]              Network: none, logging: false
I1116 12:05:48.419279       1 main.go:228]              Strace: true, max size: 1024, syscalls: 
I1116 12:05:48.419293       1 main.go:229]              LISAFS: true
I1116 12:05:48.419308       1 main.go:230]              Debug: true
I1116 12:05:48.419322       1 main.go:231]              Systemd: false
I1116 12:05:48.419336       1 main.go:232] ***************************
I1116 12:05:48.419352       1 main.go:254] Failed to set RLIMIT_MEMLOCK: operation not permitted
W1116 12:05:48.420557       1 specutils.go:113] noNewPrivileges ignored. PR_SET_NO_NEW_PRIVS is assumed to always be set.
W1116 12:05:48.420759       1 util.go:64] FATAL ERROR: error mounting proc: operation not permitted
error mounting proc: operation not permitted

apyrgio avatar Nov 16 '22 12:11 apyrgio
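Editorial sketch, not from the thread: the real cause only showed up in the gofer's log file, not in the top-level runsc output. Assuming the logs were written to a directory via -debug-log (log/ in the arguments above), a small helper like this pulls out the fatal lines from all per-process log files. The function name runsc_fatal_errors and its default directory are hypothetical.

```shell
# Print every fatal error recorded in a runsc debug-log directory.
# $1: directory that was passed to -debug-log (assumed default: log).
runsc_fatal_errors() {
    log_dir="${1:-log}"
    # -r: search all files below the directory.
    # -h: omit file names so only the log lines themselves are printed.
    grep -rh 'FATAL ERROR' "$log_dir"
}
```

For the logs quoted above, `runsc_fatal_errors log` would surface the "error mounting proc: operation not permitted" line without reading through the startup banner.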

I see that the relevant check in Gofer happens here:

https://github.com/google/gvisor/blob/d8aa09e04c4e38155dfcee6ed0495c2b29f604fc/runsc/cmd/gofer.go#L418

Since we are already running in rootless Podman, we may opt to reuse the existing /proc and / paths. My understanding is that this can happen via the -TESTONLY-unsafe-nonroot flag, and it works indeed:

$ runsc -network none -rootless -TESTONLY-unsafe-nonroot do echo ok
ok

Still, this is a pretty scary-looking argument. Is it safe to use in a sandbox-within-a-sandbox use case? If not, is there a better way to achieve the original goal?

The aim is to use gVisor for its syscall interception capabilities

apyrgio avatar Nov 16 '22 12:11 apyrgio

A friendly reminder that this issue had no activity for 120 days.

github-actions[bot] avatar Sep 13 '23 00:09 github-actions[bot]

@apyrgio have you found another solution that doesn't rely on -TESTONLY-unsafe-nonroot by now?

felschr avatar Oct 01 '23 16:10 felschr

Unfortunately not. I haven't managed to work on this any further, actually.

apyrgio avatar Oct 16 '23 14:10 apyrgio

I think this issue was fixed by https://github.com/google/gvisor/commit/c6a1db5baec7616983b14ac06e84bee45330a9d3. Can you please confirm?

ayushr2 avatar Oct 16 '23 14:10 ayushr2

Doesn't look like it, I'm afraid. The original invocation (runsc -debug -network none -rootless do echo ok) still fails with:

I1110 09:32:21.865885       1 main.go:189] ***************************
I1110 09:32:21.865967       1 main.go:190] Args: [runsc-gofer --root=/tmp/runsc-do4154063816 --debug-log=logs/ --overlay2=all:memory --network=none --rootless=true --debug-log-fd=3 gofer --bundle /tmp/runsc-do4154063816 --gofer-mount-confs=1 --spec-fd=4 --mounts-fd=5 --io-fds=6]
I1110 09:32:21.865995       1 main.go:191] Version release-20231106.0
I1110 09:32:21.866008       1 main.go:192] GOOS: linux
I1110 09:32:21.866020       1 main.go:193] GOARCH: amd64
I1110 09:32:21.866032       1 main.go:194] PID: 1
I1110 09:32:21.866045       1 main.go:195] UID: 0, GID: 0
I1110 09:32:21.866058       1 main.go:196] Configuration:
I1110 09:32:21.866070       1 main.go:197]              RootDir: /tmp/runsc-do4154063816
I1110 09:32:21.866082       1 main.go:198]              Platform: systrap
I1110 09:32:21.866095       1 main.go:199]              FileAccess: exclusive
I1110 09:32:21.866110       1 main.go:200]              Directfs: true
I1110 09:32:21.866123       1 main.go:201]              Overlay: all:memory
I1110 09:32:21.866137       1 main.go:202]              Network: none, logging: false
I1110 09:32:21.866152       1 main.go:203]              Strace: false, max size: 1024, syscalls: 
I1110 09:32:21.866165       1 main.go:204]              IOURING: false
I1110 09:32:21.866178       1 main.go:205]              Debug: false
I1110 09:32:21.866190       1 main.go:206]              Systemd: false
I1110 09:32:21.866202       1 main.go:207] ***************************
W1110 09:32:21.867751       1 specutils.go:124] noNewPrivileges ignored. PR_SET_NO_NEW_PRIVS is assumed to always be set.
W1110 09:32:21.868016       1 util.go:64] FATAL ERROR: error mounting proc: operation not permitted
error mounting proc: operation not permitted

(note that the line Failed to set RLIMIT_MEMLOCK: operation not permitted is no longer present, in contrast with the previous logs)

Also, the altered invocation (runsc -network none -rootless -TESTONLY-unsafe-nonroot do echo ok) still works.

apyrgio avatar Nov 10 '23 09:11 apyrgio

I think this issue was fixed by c6a1db5. Can you please confirm?

@ayushr2 The code changed by this commit runs after the point where the bug occurs, so it can't fix this issue.

terenceli avatar Dec 07 '23 04:12 terenceli

@apyrgio is SELinux enabled? If so, could you try to temporarily disable it and reproduce the issue?

$ sestatus
$ sudo setenforce Permissive

avagin avatar Dec 07 '23 20:12 avagin
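Editorial sketch, not from the thread: on hosts where sestatus is not installed, the current SELinux mode can usually be read from selinuxfs instead. This assumes selinuxfs is mounted at /sys/fs/selinux, where the enforce file holds 1 for enforcing and 0 for permissive. The function name selinux_mode and the "not-visible" label are hypothetical.

```shell
# Print the SELinux mode based on the selinuxfs "enforce" file.
# $1: path to the enforce file (defaults to /sys/fs/selinux/enforce).
selinux_mode() {
    enforce_file="${1:-/sys/fs/selinux/enforce}"
    # A missing or unreadable file means SELinux is disabled or that
    # selinuxfs is not exposed here (common inside containers).
    if [ ! -r "$enforce_file" ]; then
        echo "not-visible"
        return 0
    fi
    case "$(cat "$enforce_file")" in
        1) echo "enforcing" ;;
        0) echo "permissive" ;;
        *) echo "unknown" ;;
    esac
}
```

Run it on the host rather than inside the container, since the container may not see selinuxfs at all.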

@apyrgio @terenceli could you try out https://github.com/google/gvisor/pull/9798?

avagin avatar Dec 07 '23 23:12 avagin

@apyrgio @terenceli could you try out #9798?

@avagin in my case it doesn't work.

I did some analysis; the issue happens here: https://github.com/google/gvisor/blob/master/runsc/cmd/gofer.go#L394C21-L394C21

Your fix code is after it.

This issue seems to be a general issue 'mount procfs from unprivileged container'. It is discussed in runc issue https://github.com/opencontainers/runc/issues/1658

I used Alban's workaround from https://github.com/opencontainers/runc/issues/1658#issuecomment-375750981 and it works.

The following is my test on Ubuntu. The /mnt/proc volume comes from Alban's workaround.

# docker run -it --security-opt apparmor:unconfined --security-opt seccomp=unconfined  -v /home:/home -v /mnt/proc:/newproc -v /tmp:/tmp ubuntu
root@561ca9b97a06:/# useradd test
root@561ca9b97a06:/# su test
$ cd /tmp
$ ./runsc -rootless do sh
*** Warning: sandbox network isn't supported with --rootless, switching to host ***
#

So my question is: could we find a more elegant fix or workaround so that gVisor can be used in an unprivileged Docker/Podman environment? I rely on this to build a process-level sandbox.

terenceli avatar Dec 08 '23 01:12 terenceli
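Editorial sketch, not from the thread: as far as the linked runc discussion goes, the kernel refuses a fresh procfs mount from an unprivileged user namespace when the existing /proc is partly covered by other mounts, and container runtimes typically mask paths such as /proc/kcore that way. Under that assumption, listing the mounts that sit below /proc shows whether the container is affected. The function name proc_submounts is hypothetical.

```shell
# List mount points located below /proc, based on a mountinfo file.
# $1: path to a mountinfo file (defaults to /proc/self/mountinfo).
proc_submounts() {
    mountinfo="${1:-/proc/self/mountinfo}"
    # Field 5 of each mountinfo line is the mount point.
    # Keep only entries strictly below /proc, not /proc itself.
    awk '$5 ~ /^\/proc\// { print $5 }' "$mountinfo"
}
```

An empty result suggests /proc is fully visible. Any output is a hint, not proof, that mounting a new procfs from inside the container will be rejected.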

Your fix code is after it.

You are right. I created a fedora vm to reproduce the issue, but runsc failed differently there.

This issue seems to be a general issue 'mount procfs from unprivileged container'. It is discussed in runc issue opencontainers/runc#1658

Thank you for researching the problem. I think https://github.com/google/gvisor/commit/063ee51c57f6cd5c64aa0d115396941dce455b8b should fix the issue. Could you try it out?

avagin avatar Dec 11 '23 01:12 avagin

@avagin Yes, after applying this patch, it works! Thanks

terenceli avatar Dec 11 '23 02:12 terenceli

Oh, that's great! Thanks a lot for the progress on this front.

@terenceli, just to be sure, did this fix work within a rootless Podman container, as described at the top of the issue? If not, I can give it a go as well.

apyrgio avatar Dec 11 '23 17:12 apyrgio

@apyrgio Sorry, I haven't tried it yet. I have only tried it on Ubuntu with Docker. You can test it in Podman.

terenceli avatar Dec 12 '23 00:12 terenceli

@apyrgio I have tried this patch using Podman. With SELinux set to Permissive, it works; with SELinux enabled, it doesn't.

@avagin Could you submit a PR upstream?

[test@debug010000002015 ~]$ podman --version
podman version 4.8.1
[test@debug010000002015 ~]$ podman run -it --rm alpine ash
/ # cd /tmp
/tmp # ls
runsc
/tmp # ./runsc  do ls
Error setting up network: failed to run "/sbin/ip link add ve-runsc-387325 type veth peer name vp-runsc-387325": exit status 2
/tmp # ./runsc  --rootless do ls
*** Warning: sandbox network isn't supported with --rootless, switching to host ***
runsc               runsc-do3020874715

terenceli avatar Dec 20 '23 08:12 terenceli

Thanks a lot @terenceli. Unfortunately I didn't have time to test it on my own, but the results look promising! For our use case (Dangerzone) we are spinning up containers that don't have a network device, so it could very well be that nested gVisor will work, even with SELinux enabled. That's a great development!

apyrgio avatar Dec 21 '23 19:12 apyrgio