Nondeterministic EACCES writing to file inside the mirror

Open G3zz opened this issue 3 years ago • 18 comments

I am using bindfs to mount some python code from a subdirectory (parent/directoryToMount) in user A's home folder to an analogous location in user B's home directory (parent/mountedDirectory). userA has the allow_other option enabled and the permissions of the parent folders are:

ls -l /home/userA/parent
drwx------ 99 userA userA 4096 Sep 21 15:12 parent
ls -l /home/userB/parent
drwxrwx--- 2 userB userA 4096 Sept 21 15:22 parent

To mount the subdirectory, the below is executed as userA:

mkdir -p /home/userB/parent/mountedDirectory
bindfs --perms=770 --create-with-perms=770 --mirror=userB /home/userA/parent/directoryToMount /home/userB/parent/mountedDirectory

Everything seems to work at first: the Python code runs and reads, writes and appends files in the mirrored parent/mountedDirectory. After a seemingly non-deterministic amount of time, however, the Python program can no longer write files and receives Permission denied.

Here is an excerpt from running with strace, which shows that at some point in the loop, the Python program is no longer able to write to a file in the mountedDirectory called data.csv:

895   openat(AT_FDCWD, "data.csv", O_WRONLY|O_CREAT|O_APPEND|O_LARGEFILE|O_CLOEXEC, 0666) = 16
...
895   openat(AT_FDCWD, "data.csv", O_WRONLY|O_CREAT|O_APPEND|O_LARGEFILE|O_CLOEXEC, 0666) = -1 EACCES (Permission denied)
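
When the strace log is long, filtering it for failed calls can help locate the first failure. The following is a minimal sketch, assuming the log has the `PID syscall(args) = result` shape shown in the excerpt above; the helper name `failed_calls` is made up for illustration.

```python
import re

# Matches lines such as:
#   895   openat(AT_FDCWD, "data.csv", ...) = -1 EACCES (Permission denied)
# Assumes each line starts with a PID, as in the excerpt above.
FAILED_CALL = re.compile(
    r'^(?P<pid>\d+)\s+(?P<syscall>\w+)\((?P<args>.*)\)\s+=\s+-1\s+(?P<errno>E[A-Z0-9]+)'
)


def failed_calls(lines, errno_name="EACCES"):
    """Yield (pid, syscall, first quoted argument or None) for calls failing with errno_name."""
    for line in lines:
        match = FAILED_CALL.match(line.strip())
        if match is None or match.group("errno") != errno_name:
            continue
        quoted = re.search(r'"([^"]*)"', match.group("args"))
        yield int(match.group("pid")), match.group("syscall"), quoted.group(1) if quoted else None


sample = [
    '895   openat(AT_FDCWD, "data.csv", O_WRONLY|O_CREAT|O_APPEND|O_LARGEFILE|O_CLOEXEC, 0666) = 16',
    '895   openat(AT_FDCWD, "data.csv", O_WRONLY|O_CREAT|O_APPEND|O_LARGEFILE|O_CLOEXEC, 0666) = -1 EACCES (Permission denied)',
]
print(list(failed_calls(sample)))  # [(895, 'openat', 'data.csv')]
```

Passing an open log file instead of `sample` works the same way, since the function only iterates over lines.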

Unfortunately I am not able to share the full Python code but I'd like to check if there's anything obvious that is incorrect and any ideas on how to get more information on the problem.

Thanks

G3zz avatar Sep 21 '22 15:09 G3zz

That's weird.

Some things to check and try:

  • bindfs never returns EACCES for open by itself, but it can forward any error from the underlying FS. See man 2 open for possible reasons. Is it e.g. possible that your Python program changes its working directory at some point? (Does strace show a chdir?)
  • Run bindfs with -d to get debug output. If you can't see open or create calls after the failures start, then the calls are rejected somewhere at the FUSE / kernel level.
  • If you're on FUSE 2, try compiling bindfs with FUSE 3 (if available for your distro).
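
To gather the kind of context the first bullet asks about, the failing write could be wrapped so that it reports the working directory, the effective IDs and the ownership of the file and its parent at the moment of failure. This is only a sketch: the original program is not shown in the thread, so `append_line` and `describe` are hypothetical helpers. Note that Python's `PermissionError` covers both EACCES and EPERM.

```python
import errno
import os


def describe(path):
    """Return a one-line summary of owner, group and mode for path, or the stat error."""
    try:
        st = os.stat(path)
    except OSError as exc:
        return f"{path}: stat failed: {errno.errorcode.get(exc.errno, exc.errno)}"
    return f"{path}: uid={st.st_uid} gid={st.st_gid} mode={oct(st.st_mode & 0o7777)}"


def append_line(path, text):
    """Append text to path; on a permission error, print context and re-raise."""
    try:
        with open(path, "a") as fh:
            fh.write(text)
    except PermissionError:
        print("cwd:", os.getcwd())
        print("euid/egid:", os.geteuid(), os.getegid())
        print(describe(path))
        print(describe(os.path.dirname(os.path.abspath(path))))
        raise
```

If the reported uid of the file differs from the effective uid of the process at the time of the failure, that would point at the mirrored ownership rather than at the program itself.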

mpartel avatar Sep 21 '22 16:09 mpartel

Thanks for your response, I'll try some of these out and respond back in due course.

G3zz avatar Sep 23 '22 10:09 G3zz

Just to say I have been looking into this. Nothing in the strace and bindfs -d logs looks unusual to me, and there is no chdir. The error causes the Python program to shut down, so it's not possible to get much information from bindfs -d, although right up until the EACCES error it looks to be working fine.

The program (which I can't share unfortunately) includes logic to monitor the disk space of the directory (using os.listdir("relative/path")), which results in quite frequent calls to stat64. I can see in the strace logs that the file that eventually causes the EACCES error was returned by stat64 successfully many times before, leading me to wonder if there is some kind of rate-limiting or something going on?
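
For readers wondering why such monitoring produces so many stat calls, here is a hypothetical sketch of a directory-size check built on os.listdir. The real program is not available, so `directory_size` is an assumption about its general shape, not its actual code.

```python
import os
import stat


def directory_size(path):
    """Sum the sizes of the regular files directly inside path (no recursion)."""
    total = 0
    for name in os.listdir(path):
        full = os.path.join(path, name)
        st = os.stat(full)  # one stat call per entry, on every monitoring pass
        if stat.S_ISREG(st.st_mode):
            total += st.st_size
    return total
```

Each pass stats every entry once, so a frequently polled directory with many files yields a steady stream of stat calls against the mount.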

I realise this isn't much to go on, but I copied the python code into the mirror directory and ran it hundreds of times without issue. I still have to try with FUSE 3, but otherwise I don't think I will be able to continue using bindfs unfortunately.

G3zz avatar Nov 24 '22 16:11 G3zz

Thanks for reporting back. It's a weird one. No new ideas unfortunately, besides checking dmesg if you haven't yet.

mpartel avatar Nov 27 '22 08:11 mpartel

Dealing with what I believe to be the same issue when syncing large directories with unison on top of bindfs.

Came up with a simple bash reproduction.

Be advised this is very hard to reproduce on my fast NVMe drive. Much easier on a slower SATA SSD.

It's also much easier to reproduce with large real-world directories, though I'm using lots of empty files here to keep the reproduction simple and contained.

Setup:

mkdir source target
bindfs --mirror=user1,user2 source target
mkdir target/sub
cd target/sub && for f in {1..150000}; do touch $f; done

In one terminal in top-level 'target', change to a mirrored user who is not the bindfs process owner and stat really hard:

while true; do find . -print0 | xargs -P 10 -0 stat; done > /dev/null

With the mad-statter™ running: in another terminal in 'target', run the following repeatedly until you see a permission error (I did this as the bindfs process owner):

strace -o /tmp/strace1 rm -f foo; strace -o /tmp/strace2 touch foo

I've seen EACCES produced from openat, unlink, and stat thus far. If it gets bad enough, simply running 'ls' in the target directory will produce errors.
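
The "run repeatedly until you see a permission error" step can also be automated. The sketch below is a stand-in for the rm/touch pair, not part of the original reproduction: `hammer` is a hypothetical helper that removes and recreates a file in a loop and counts failures by errno name. It would be run in 'target' while the stat loop runs as the other user, and it has not been verified to trigger the problem.

```python
import collections
import errno
import os


def hammer(directory, rounds=1000, name="foo"):
    """Repeatedly remove and recreate directory/name, counting failures by errno name."""
    failures = collections.Counter()
    path = os.path.join(directory, name)
    for _ in range(rounds):
        try:
            try:
                os.unlink(path)
            except FileNotFoundError:
                pass  # same outcome as rm -f when the file is absent
            with open(path, "a"):
                pass  # creates the file if missing (touch would also update timestamps)
        except OSError as exc:
            failures[errno.errorcode.get(exc.errno, str(exc.errno))] += 1
    return failures


if __name__ == "__main__":
    print(hammer("."))
```

An empty counter means no call failed; an entry such as EACCES with a nonzero count would correspond to the errors described above.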

wentam avatar Feb 11 '23 11:02 wentam

Thanks for the repro! Hasn't worked for me yet, but I'll try it on a HDD later. Which distro and FUSE version is this?

mpartel avatar Feb 11 '23 12:02 mpartel

NixOS fuse 2.9.9 bindfs 1.17.1

Be sure that you've created the files in a subdirectory of target and that you're running while true; do find . -print0 | xargs -P 10 -0 stat; done > /dev/null in the top level of the target (not in the subdirectory), as a user other than the one running the bindfs process (but one that is in the mirror list).

No idea why, but those conditions seem to need to be met for the error to be produced.

wentam avatar Feb 11 '23 12:02 wentam

Thanks! If possible, please try compiling bindfs against FUSE 3.x.

Actually I suspect the disk's speed doesn't matter for the repro, since it should all be cached by the kernel. Likely I'll need to try this on a slower computer or VM.

mpartel avatar Feb 11 '23 14:02 mpartel

Tried with fuse 3.11 (This is as simple as bindfs.overrideAttrs (old: { buildInputs = [ fuse3 ]; }) with NixOS). Identical behavior, still producing errors.

The device I've primarily been testing this on is indeed on the slower end, but nothing crazy slow (Intel i5-7300U, DDR4).

wentam avatar Feb 11 '23 14:02 wentam

Doesn't repro on a very slow AMD GX-412TC / Debian 11 either.

I've seen EACCES produced from openat, unlink, and stat thus far.

This still feels like something outside bindfs is doing some questionable caching, but I don't know how to get at it. Try -o entry_timeout=0 maybe? (man mount.fuse3 has options that ~every FUSE FS supports.)

mpartel avatar Feb 12 '23 08:02 mpartel

This still feels like something outside bindfs is doing some questionable caching, but I don't know how to get at it. Try -o entry_timeout=0 maybe? (man mount.fuse3 has options that ~every FUSE FS supports.)

No change with -o entry_timeout=0.

With a bit of tweaking, I've managed to create a script that quickly and reliably produces this problem on every machine I've tried, including fast and slow ones: https://gist.github.com/wentam/3a175c71a2f535e1606bb40e0f1aef58

The script will handle everything, including starting bindfs. Using a script like this also helps eliminate any slight differences in setup that could matter.

After the setup stage completes, the error appears in less than 1s on every machine I've tried. I also tried an Arch Linux machine and it reproduced there too.

$ cd /tmp/work/
/tmp/work/ $ sh bindfs-120-repro.sh testdir [some_other_user]

wentam avatar Feb 12 '23 08:02 wentam

Thank you for the very helpful script! Not sure what I did wrong previously, but the error now happens on my slow machine (Debian 11, kernel 5.10.0-21, fuse 2.9.9). But it does not happen on my fast machine (Pop_OS 22.04, kernel 6.0.12, fuse 3.10.5). Both have the latest bindfs from git.

The kernel still seems to get a steady stream of FUSE-related changes, so I'm tempted to suspect a bug in older kernels. What kernel version are you running?

mpartel avatar Feb 12 '23 10:02 mpartel

What kernel version are you running?

Kernel is 6.1.9 (and have also repro'd on devices with many different kernel versions)

wentam avatar Feb 12 '23 10:02 wentam

Ok, I've now seen it not repro in a Debian 11 VM (or Ubuntu 22.04 or 22.10) on the fast machine where it doesn't repro natively either. So the repro seems to be machine-dependent rather than software version dependent.

I'll see if I can find a more minimal test case...

mpartel avatar Feb 12 '23 15:02 mpartel

I think I now have pretty compelling evidence that this is a FUSE bug.

I modified FUSE's passthrough example to do user mirroring by adding the line

stbuf->st_uid = fuse_get_context()->uid;

to xmp_getattr, and I modified your script to run passthrough instead of bindfs.

If I run the modified passthrough with -odefault_permissions (which bindfs must add automatically for correctness), it fails your test. Without that flag, or without that modification, it doesn't fail.
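
One possible explanation, stated here as an assumption rather than something the thread confirms at this point: with default_permissions the kernel checks permissions against attributes it has cached for the file, and with mirroring the owner in those attributes depends on which user last asked. The toy model below is not kernel or bindfs code; the uids, the 0700 mode and the function names are illustrative only, and groups and root are ignored.

```python
def owner_class_allows(mode, file_uid, caller_uid, want):
    """Simplified Unix check: owner bits if the caller owns the file, else 'other' bits."""
    bits = ((mode >> 6) & 0o7) if caller_uid == file_uid else (mode & 0o7)
    return (bits & want) == want


def mirrored_uid(caller_uid):
    """Mirroring getattr: the reported owner is whoever asked."""
    return caller_uid


WRITE = 0o2
mode = 0o700
user1, user2 = 1001, 1002  # illustrative uids

# Consistent case: the attribute used for the check was produced for the same caller.
ok = owner_class_allows(mode, mirrored_uid(user1), user1, WRITE)

# Racy case: a shared attribute cache was last refreshed by user2's stat,
# and user1's open is then checked against that cached owner.
cached_uid = mirrored_uid(user2)
denied = not owner_class_allows(mode, cached_uid, user1, WRITE)

print(ok, denied)  # True True
```

In this model the denial only occurs when another user's lookup lands between the refresh and the check, which would fit the timing-dependent, machine-dependent behavior reported above.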

I'll package this up as a bug report to FUSE, but not sure yet if I can get that done today.

mpartel avatar Feb 12 '23 18:02 mpartel

FUSE devs don't want to treat this non-atomicity as a bug and told me to implement access instead. I'll try to find time to do that soon.

mpartel avatar Feb 17 '23 06:02 mpartel

I've been following along with interest. I tried running with FUSE 3 on the same machine (a Raspberry Pi 4) and continued to replicate the issue. For my use case I have resorted to other techniques for sharing files between users, but I'm very happy to test this issue when a solution is found.

Thanks, Geraint

G3zz avatar Mar 15 '23 14:03 G3zz

Thanks, I'm hopeful about finding the time in a few weeks. This probably just needs a few more hours of good concentration, but those have been in short supply :/

mpartel avatar Mar 15 '23 14:03 mpartel