Teach cp to reflink on Windows

Open barcharcraz opened this issue 2 years ago • 8 comments

Add a Windows version of copy_on_write for the cp command.

This is implemented using DeviceIoControl with FSCTL_DUPLICATE_EXTENTS_TO_FILE, and thus only works on ReFS (and possibly WinBtrfs, I'm not sure). While it should be possible to handle sparse modes on Windows as well, this PR doesn't do that. Note that we do set the destination file as sparse while copying it. This is because the target of FSCTL_DUPLICATE_EXTENTS_TO_FILE must be at least as large as the source, and we don't want to write zeros out to the disk before copying over the extents.
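To make the shape of those DeviceIoControl calls more concrete, here is a minimal, platform-independent sketch of how the clone ranges could be planned. It is not the code from this PR. `CloneChunk`, `plan_clone_chunks`, and the `cluster_size` and `max_chunk` parameters are hypothetical names chosen for illustration. The sketch assumes each call needs cluster-aligned offsets and lengths, and that a single call should stay under a size cap (the ReFS block cloning documentation describes a limit below 4 GB per call, as far as I understand it). Rounding the final chunk up to a cluster boundary is also an assumption that should be checked against the documentation.

```rust
/// Control code value from the Windows SDK headers (winioctl.h).
const FSCTL_DUPLICATE_EXTENTS_TO_FILE: u32 = 0x0009_8344;

/// One planned ioctl call. The same offset is used in source and target.
/// (Hypothetical helper type, not part of the Windows API.)
#[derive(Debug, Clone, Copy, PartialEq)]
struct CloneChunk {
    offset: u64,
    byte_count: u64,
}

/// Split a file of `file_len` bytes into cluster-aligned chunks of at most
/// `max_chunk` bytes each.
///
/// Assumptions: `cluster_size` is the volume's cluster size (real code would
/// query it), and the last chunk is rounded up to a cluster boundary.
fn plan_clone_chunks(file_len: u64, cluster_size: u64, max_chunk: u64) -> Vec<CloneChunk> {
    assert!(cluster_size > 0 && max_chunk >= cluster_size);

    // Keep every chunk a whole number of clusters.
    let max_chunk = max_chunk - max_chunk % cluster_size;

    // Round the total length up to the next cluster boundary.
    let total = (file_len + cluster_size - 1) / cluster_size * cluster_size;

    let mut chunks = Vec::new();
    let mut offset = 0;
    while offset < total {
        let byte_count = max_chunk.min(total - offset);
        chunks.push(CloneChunk { offset, byte_count });
        offset += byte_count;
    }
    chunks
}
```

Each planned chunk would then be turned into one DUPLICATE_EXTENTS_DATA input buffer (source handle, source offset, target offset, byte count) and passed to DeviceIoControl along with the control code above. The destination has to be extended to the source length first, which is where marking it sparse avoids writing zeros.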

I tried to do similar things to what Linux does here, but things are a bit more complicated on Windows, so I ended up handling the fallback modes inline instead of via an enum parameter to duplicate_extents.

I don't actually check for FILE_SUPPORTS_BLOCK_REFCOUNTING. If the filesystem doesn't support it, the call to DeviceIoControl will just fail and you'll get the fallback behavior (an error with --reflink=always and a "real" copy with --reflink=auto).
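The fallback behavior described here can be sketched roughly as follows. This is an illustration only, not the PR's code. `ReflinkMode`, `CopyOutcome`, and `copy_with_reflink_mode` are hypothetical names, and the two closures stand in for the real clone attempt and the real byte-for-byte copy.

```rust
use std::io;

/// The three values accepted by --reflink (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ReflinkMode {
    Always,
    Auto,
    Never,
}

/// What ended up happening to the file.
#[derive(Debug, PartialEq)]
enum CopyOutcome {
    Reflinked,
    FullCopy,
}

/// `try_clone` stands in for the DeviceIoControl-based clone attempt and
/// `full_copy` for an ordinary copy. Both are placeholders.
fn copy_with_reflink_mode<C, F>(
    mode: ReflinkMode,
    try_clone: C,
    full_copy: F,
) -> io::Result<CopyOutcome>
where
    C: FnOnce() -> io::Result<()>,
    F: FnOnce() -> io::Result<()>,
{
    match mode {
        // Never: do not even attempt to clone.
        ReflinkMode::Never => full_copy().map(|()| CopyOutcome::FullCopy),
        // Always: a failed clone is reported as an error, with no fallback.
        ReflinkMode::Always => try_clone().map(|()| CopyOutcome::Reflinked),
        // Auto: a failed clone silently falls back to a real copy.
        ReflinkMode::Auto => match try_clone() {
            Ok(()) => Ok(CopyOutcome::Reflinked),
            Err(_) => full_copy().map(|()| CopyOutcome::FullCopy),
        },
    }
}
```

Because the clone attempt itself reports whether the filesystem supports block cloning, no up-front capability check is needed in this structure. The trade-off is that an unsupported filesystem is discovered only after the failed call.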

barcharcraz avatar Jun 22 '23 02:06 barcharcraz

nice, did you see performance improvements?

sylvestre avatar Jun 22 '23 06:06 sylvestre

nice, did you see performance improvements?

Yes, assuming reflinking actually happened. I tried copying some video files between folders on a ReFS-formatted drive and things did speed up. Without reflinking, a 5 GB file took a bit longer than a minute, which is what I would expect for this drive: it's spinning disks and I had several background jobs running, so I'd expect transfer speeds of around 30-50 MB/s. When reflinking was enabled things were mostly instant, although in some cases there was 10-30 sec of waiting around for the disk; again, this is about what you'd expect.

Note that I didn't change the default for the --reflink option on Windows, so it won't happen at all by default. I figured that since ReFS is so rare and its performance characteristics are somewhat unknown, the default should stay "never", at least for a while. (I think GNU cp on Linux only started defaulting to auto after the --reflink option had existed for a while.)

barcharcraz avatar Jun 22 '23 18:06 barcharcraz

GNU testsuite comparison:

Skip an intermittent issue tests/misc/timeout

github-actions[bot] avatar Jun 23 '23 06:06 github-actions[bot]

it needs some rustfmt ;)

sylvestre avatar Jun 25 '23 19:06 sylvestre

it needs some rustfmt ;)

Just done, sorry that took so darn long

barcharcraz avatar Jul 13 '23 03:07 barcharcraz

it needs some rustfmt ;)

Just done, sorry that took so darn long

No worries, 3 weeks is very reasonable :)

sylvestre avatar Jul 13 '23 06:07 sylvestre

The coverage is complaining a bunch about windows.rs. Do you know why we aren't covering more lines of code?

sylvestre avatar Jul 13 '23 07:07 sylvestre

The coverage is complaining a bunch about windows.rs. Do you know why we aren't covering more lines of code?

It looks like that typo fix missed a usage and the whole windows build is now failing.

Coverage-wise, I suspect that it's because I did not change the default reflinking mode on Windows. I figured most people are probably using NTFS, which doesn't support reflinks anyway, and I haven't done a ton of testing for the failure path on different filesystems or different versions of Windows. In any event, this means it's likely the tests are not triggering reflinking at all on Windows.

Also, even if we added a test that does use --reflink=auto on Windows, I'm not sure how to get it to run in an environment that actually uses a filesystem with reflink support. ReFS filesystems can't be created on most versions of Windows, and booting off one is rather experimental. ZFS for Windows and WinBtrfs are both somewhat unstable (and WinBtrfs is pretty slow), and we wouldn't want to bluescreen the CI machines.

I think if we really wanted to add good test coverage, we'd have to prepare a ReFS-formatted VHD and somehow bake that into the image, mounting it as a "loopback" disk before running the tests. This could be useful in general, but it is a fairly big CI change, since it needs administrator permissions. We'd also need to make sure the content of the VHD resulting from the tests is something we can examine.

barcharcraz avatar Jul 14 '23 20:07 barcharcraz