zstd compression hanging on Windows with large files
The caching action always hangs when using actions/cache@master on Windows in my test runs using a random 150 MB file.
To dig in more, I created a separate repo that runs the compression commands. On a 1 MB file, it works fine:
2020-05-08T19:01:50.9215932Z bsdtar 3.3.2 - libarchive 3.3.2 zlib/1.2.5.f-ipp
2020-05-08T19:01:51.7976258Z Create random file
2020-05-08T19:01:51.8560894Z 1+0 records in
2020-05-08T19:01:51.8563149Z 1+0 records out
2020-05-08T19:01:51.8568671Z 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.011073 s, 94.7 MB/s
2020-05-08T19:01:51.8578782Z Running zstd standalone
2020-05-08T19:01:52.3317293Z
2020-05-08T19:01:52.3325676Z
2020-05-08T19:01:52.3331145Z Directory: D:\a\test-zstd\test-zstd
2020-05-08T19:01:52.3376822Z
2020-05-08T19:01:52.3392726Z Mode LastWriteTime Length Name
2020-05-08T19:01:52.3395053Z ---- ------------- ------ ----
2020-05-08T19:01:52.3412014Z -a--- 5/8/2020 7:01 PM 1048613 file.zstd
2020-05-08T19:01:52.3412805Z Running tar standalone
2020-05-08T19:01:52.3608654Z -a--- 5/8/2020 7:01 PM 1050112 file.tar
2020-05-08T19:01:52.3622919Z Running tar + zstd in two steps
2020-05-08T19:01:52.4062645Z -a--- 5/8/2020 7:01 PM 1050624 file_combined.tzst
2020-05-08T19:01:52.4064236Z Running tar + zstd using --use-compress-program
2020-05-08T19:01:52.4668289Z -a--- 5/8/2020 7:01 PM 1049135 file.tzst
2020-05-08T19:01:52.4670493Z Done
But on a 150 MB file, it hangs until I cancel the action:
2020-05-08T19:03:52.6081814Z bsdtar 3.3.2 - libarchive 3.3.2 zlib/1.2.5.f-ipp
2020-05-08T19:03:53.4979676Z Create random file
2020-05-08T19:03:53.6897822Z 150+0 records in
2020-05-08T19:03:53.6909594Z 150+0 records out
2020-05-08T19:03:53.6912241Z 157286400 bytes (157 MB, 150 MiB) copied, 0.147127 s, 1.1 GB/s
2020-05-08T19:03:53.6925733Z Running zstd standalone
2020-05-08T19:03:54.3540113Z
2020-05-08T19:03:54.3584708Z
2020-05-08T19:03:54.3616876Z Directory: D:\a\test-zstd\test-zstd
2020-05-08T19:03:54.3634999Z
2020-05-08T19:03:54.3648596Z Mode LastWriteTime Length Name
2020-05-08T19:03:54.3650325Z ---- ------------- ------ ----
2020-05-08T19:03:54.3650520Z -a--- 5/8/2020 7:03 PM 157290014 file.zstd
2020-05-08T19:03:54.3650643Z Running tar standalone
2020-05-08T19:03:54.5655674Z -a--- 5/8/2020 7:03 PM 157287936 file.tar
2020-05-08T19:03:54.5658027Z Running tar + zstd in two steps
2020-05-08T19:03:55.0461307Z -a--- 5/8/2020 7:03 PM 157292032 file_combined.tzst
2020-05-08T19:03:55.0498037Z Running tar + zstd using --use-compress-program
2020-05-08T19:06:33.2001585Z ##[error]The operation was canceled.
Test repo - https://github.com/dhadka/test-zstd
Example run that's hanging - https://github.com/dhadka/test-zstd/actions/runs/99414776
CC @imbsky
I'll also note that both bsdtar (3.3.2) and zstd (1.4.0) are over a year old.
As you say, the version of zstd we are using is old, and it differs from platform to platform. So I recently thought it might be a good idea to install the same version on every platform using the tool-cache library. (This isn't to say that there's a problem with the version of zstd we're currently using on Windows.)
I'm wondering if this is an old upstream issue. Maybe you guys know that. @Cyan4973 @felixhandte @terrelln
v1.4.0 is not that old, and I don't remember any issue like that. Random thoughts :
- What's the compression level used ?
- What's the time-out value ?
- Any storage capacity limit ?
@Cyan4973 I suspect it's an issue with bsdtar. I tested with a different compression tool (https://blog.kowalczyk.info/software/pigz-for-windows.html) and it also hangs on large files.
GNU tar also works fine with zstd on Windows.
Turned off zstd for Windows runners that use bsdtar (windows-latest).
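For illustration, the "only skip zstd where bsdtar is in use on Windows" rule could be sketched roughly as below. This is a minimal sketch, not the action's actual code: `detectTarFlavor` and `canPipeThroughZstd` are hypothetical names, and the caller is assumed to pass in the text printed by `tar --version` and a platform string such as Node's `"win32"`.

```typescript
// Hypothetical helpers, not taken from actions/cache.
type TarFlavor = "gnu" | "bsd" | "unknown";

// Classify a tar binary from the first line of its `tar --version` output.
// bsdtar prints e.g. "bsdtar 3.3.2 - libarchive 3.3.2 ...",
// GNU tar prints e.g. "tar (GNU tar) 1.30".
function detectTarFlavor(versionOutput: string): TarFlavor {
  const firstLine = (versionOutput.split(/\r?\n/)[0] || "").toLowerCase();
  if (firstLine.includes("gnu tar")) return "gnu";
  if (firstLine.includes("bsdtar")) return "bsd";
  return "unknown";
}

// Assumption based on this thread: piping through zstd via
// --use-compress-program is only avoided for bsdtar on Windows.
function canPipeThroughZstd(platform: string, flavor: TarFlavor): boolean {
  return !(platform === "win32" && flavor === "bsd");
}
```

With the version string from the logs above, `detectTarFlavor` classifies the runner's tar as `"bsd"`, so `canPipeThroughZstd("win32", "bsd")` is `false`.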
That’s fine.
I did a quick test with my workflow, adding a Cygwin-based (so a bit hacky) GNU tar to PATH so that zstd is used on Windows.
- gzip saveCache: 84 seconds
- zstd saveCache: 27 seconds
So I think it's worth trying to fix this :)
The issue is reproducible with the latest release of libarchive/bsdtar v3.4.3 and zstd v1.4.5 on my machine. Here are some more verbose logs:
.\libarchive\bin\bsdtar.exe -vvv --use-compress-program "zstd/zstd.exe -vvvvvvv" -cf file.out file.in 2>&1 >> a.txt
.\libarchive\bin\bsdtar.exe : a -rw-rw-rw- 1 0 0 270347288 Feb 13 2018 file.in*** zstd command line interface
64-bits v1.4.5, by Yann Collet ***
At line:1 char:1
+ .\libarchive\bin\bsdtar.exe -vvv --use-compress-program "zstd/zstd.ex ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (a -rw-rw-rw- 1...Yann Collet ***:String) [], RemoteException
+ FullyQualifiedErrorId : NativeCommandError
FIO_createCResources
set nb workers = 1
FIO_compressFilename_srcFile: /*stdin*\
Using stdin for input
FIO_compressFilename_dstFile: opening dst: /*stdout*\
Using stdout for output
Sparse File Support is automatically disabled on stdout ; try --sparse
/*stdin*\: 4294967295 bytes
compression using zstd format
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
(L3) Buffered : 0 MB - Consumed : 0 MB - Compressed : 0 MB => 0.00%
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 131072 bytes from source
ZSTD_compress_generic(end:0) => input pos(131072)<=(131072)size ; output generated 0 bytes
fread 13
I've filed a bug with libarchive: https://github.com/libarchive/libarchive/issues/1419
Thanks @lazka 🙂
The current work-around for this now causes disturbing cache-misses. :-(
I built a project on windows-latest and put the result in the cache (IMHO artifacts are not suitable for this, since these results should not be visible after the workflow finishes; they are only for passing files between jobs).
Then I wanted to test on windows-2016, but got a cache miss there: a different compression algorithm was used, so the cache file was not found.
Maybe a better work-around would be to call tar without compression and pipe the result into a standalone zstd call.
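A rough sketch of that two-step idea, under stated assumptions: `planTwoStepArchive` and `ArchivePlan` are hypothetical names, the file names are placeholders, and the function only builds the two command lines (as argv arrays) without spawning anything. The flags used are standard ones: `tar -cf <file> -C <dir>`, and `zstd -T0 -f -o <out> <in>`.

```typescript
// Hypothetical planner: returns the commands to run, in order.
interface ArchivePlan {
  steps: string[][]; // each entry is one argv array
  output: string;    // final compressed archive
}

function planTwoStepArchive(
  sourceDir: string,
  paths: string[],
  baseName: string
): ArchivePlan {
  const tarFile = `${baseName}.tar`;
  const output = `${baseName}.tzst`;
  return {
    steps: [
      // Step 1: plain, uncompressed tar archive (no --use-compress-program).
      ["tar", "-cf", tarFile, "-C", sourceDir, ...paths],
      // Step 2: standalone zstd reads the tar file and writes the archive.
      // -T0 uses all cores, -f overwrites an existing output file.
      ["zstd", "-T0", "-f", "-o", output, tarFile],
    ],
    output,
  };
}
```

This mirrors the "tar + zstd in two steps" case from the test logs at the top of the thread, which completed on the 150 MB file. The trade-off is that an intermediate tar file the size of the uncompressed data has to fit on disk.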
I will take a look at this tonight if I have enough spare time.
Any updates on this? :-)
Could you check this? @mmatuska
@smorimoto Since upstream has been dragging on this for many months now, maybe we should fork? Do you have a fork with working zstd compression on Windows?
This is a critical issue.
The current behavior is just a cache miss on windows-latest or self-hosted Windows unless zstd.exe is added to the machine.
It's also a cache miss on any self-hosted runner that doesn't have zstd, which needs to go into the README. I set up self-hosted machines and had immediate cache failures and it took a lot of debugging to figure out why.
If this has to rely completely on zstd, then it needs to properly fail if zstd isn't installed.
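One way to picture why a missing zstd shows up as a silent cache miss rather than an error: assume the compression method is folded into the key used to look up a saved cache. That is an assumption for illustration only; `cacheVersion` and `isCacheHit` are hypothetical, and a real implementation would likely hash the string rather than compare it directly.

```typescript
// Illustrative only: not the action's real key derivation.
type CompressionMethod = "gzip" | "zstd";

// Combine the cached paths and the compression method into one lookup key.
function cacheVersion(paths: string[], method: CompressionMethod): string {
  return [...paths, method].join("|");
}

// A lookup only hits when the restoring runner derives the same key.
function isCacheHit(
  savedVersion: string,
  paths: string[],
  method: CompressionMethod
): boolean {
  return savedVersion === cacheVersion(paths, method);
}
```

Under that assumption, a cache saved with gzip on one runner is simply not found by a runner that selects zstd for the same paths, and vice versa, with no failure reported.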
Directly using zstd could be an option too. https://github.com/facebook/zstd
Yes, but we already have a spec for prioritizing GNU tar, and we could implement it as early as this month. (In fact, it's almost done, but I'm just a volunteer here, so I can't promise anything.)
The problem here is complicated by the existence of users' custom runners. If zstd is not installed there, we must handle it gracefully, because we are not allowed to fail a job on them. So it's a bit simpler to manipulate the arguments than to call zstd directly.
Is there an API to detect custom runners? If so, we can move on with the solution for the majority of the users. Even if there is no such thing, it is very easy to test if a program is installed on a system.
import which from "which"

// With nothrow, which() resolves to null instead of throwing when zstd is not on PATH.
const zstdPath = await which('zstd', {nothrow: true})
if (zstdPath !== null) {
  // use zstd
} else {
  // fallback to other methods
}
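Building on the check above, the fallback could also report why it was taken, so that a runner without zstd does not fail the job but still leaves a visible trace. A minimal sketch, assuming gzip is the fallback; `chooseCompression` and `CompressionChoice` are hypothetical names, and the warning text is only an example.

```typescript
// Hypothetical fallback selection with a visible warning.
interface CompressionChoice {
  method: "zstd" | "gzip";
  warning?: string; // set only when a fallback was taken
}

// zstdPath is the result of a lookup such as which('zstd', {nothrow: true}):
// the resolved path, or null when zstd is not on PATH.
function chooseCompression(zstdPath: string | null): CompressionChoice {
  if (zstdPath !== null) {
    return { method: "zstd" };
  }
  return {
    method: "gzip",
    warning: "zstd was not found on PATH; falling back to gzip.",
  };
}
```

The caller can then log `warning` when it is present, which addresses the debugging pain described earlier in the thread without failing jobs on custom runners.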
This issue is stale because it has been open for 200 days with no activity. Leave a comment to avoid closing this issue in 5 days.
ok
This issue is stale because it has been open for 200 days with no activity. Leave a comment to avoid closing this issue in 5 days.
We are still getting up to speed with the PRs and issues. I cannot give a timeline for when the PRs will be handled, but rest assured, we are looking into triaging them in the coming months.
https://github.com/actions/cache/issues/745#issuecomment-1046598812
Sorry, but time doesn't fix things.
Hi @aminya @lazka @smorimoto we are tracking this issue in #984. Please check the current proposal there and provide comments/feedback if any. Closing this as duplicate.
👋🏼 @aminya @lazka @smorimoto We have released a new beta release which should fix this issue. Try tag: actions/[email protected]. Head over to discussion for feedback: https://github.com/actions/cache/discussions/1019