cache Add update-env-variable to force/disable cache update.

This fixes https://github.com/actions/cache/issues/342

It's a more flexible version of #353 because it allows the CI to force or disable a cache update programmatically, instead of just setting it to a constant.

Jan 04 '21 05:01 eyal0

In short: Set an environment variable in your CI to force or disable the cache update.

Here's a longer description:

Problem

Say you want to cache a program called parallel so that you don't have to build it all the time. But you want to force a rebuild if the version that is online is different from the version that is in the cache. What you have to do is check what the latest version of the program is online, then put that number in the cache key, and then use the cache.

But, what if you have a lot of different programs with version numbers? You'd have to add them all to your cache key. So maybe your cache key will be: ubuntu-g++8-glib2-chrome46-nodejs15. And the order in which you put them in the cache key matters! For example, what if the latest version of chrome is now 47? You'll look for ubuntu-g++8-glib2-chrome47-nodejs15 and you won't get a cache hit. You could use the restore-key feature and match a prefix, like ubuntu-g++8-glib2-. But maybe that'll match some recent build that was on an older version of nodejs. Now you'll have to rebuild that, too.
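To make the mismatch concrete, here is a minimal sketch that models the lookup as plain string comparison. It assumes the primary key must match exactly and that a restore-key matches as a prefix, as described above. The variable names are illustrative only and are not part of the action.

```shell
# Key stored by an earlier run, and the key computed today after chrome moved to 47.
saved_key="ubuntu-g++8-glib2-chrome46-nodejs15"
wanted_key="ubuntu-g++8-glib2-chrome47-nodejs15"
# Prefix used as a restore-key (assumed to match by prefix).
restore_prefix="ubuntu-g++8-glib2-"

# Exact comparison models the primary key lookup.
if [ "$saved_key" = "$wanted_key" ]; then exact=hit; else exact=miss; fi

# Prefix comparison models the restore-key lookup.
case "$saved_key" in
  "$restore_prefix"*) prefix=hit ;;
  *)                  prefix=miss ;;
esac

echo "exact=$exact prefix=$prefix"
```

The prefix match succeeds, but nothing in it says which nodejs version the restored cache was built against.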

The problem is that you're doing a cache per version of each dependency. You could have multiple caches, like one for the g++ in windows32 and another for the g++ in windows64 etc. And then the same for the version of chrome, nodejs, etc. You would have N*M caches, for every combination of platform and software.

What a hassle! What you really want is a cache per CI job. For example, one cache for "windows32", one for "windows64", one for "linux" and one for "macos".

Solution

What this PR provides is simple: Programmatically decide, after fetching the cache, if it should be overridden or not.

How it works:

Designate an environment variable name, such as UPDATE_CACHE.
Fetch your cache like usual.
During your CI steps, if you find that the cache needs to be overridden, export UPDATE_CACHE=true.
That's it!

If you don't set the env variable at all, the default behavior is maintained. And if you set it to false, then the cache update is disabled, even if it should have occurred.
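The three cases can be summarized as a small decision function. This is only a sketch of the rule as described here, not the code in the PR. The function name `should_save_cache` and the "hit"/"miss" argument are hypothetical, and treating any value other than true or false like an unset variable is an assumption.

```shell
# Hypothetical helper: decide whether the post step should save the cache.
# $1 = value of the watched variable ("true", "false", or empty when unset)
# $2 = "hit" if the primary key matched exactly, anything else counts as a miss
should_save_cache() {
  case "$1" in
    true)  echo "save" ;;   # forced, even after a cache hit
    false) echo "skip" ;;   # disabled, even after a cache miss
    *)                      # unset (assumed: any other value too): default behavior
      if [ "$2" = "hit" ]; then echo "skip"; else echo "save"; fi ;;
  esac
}
```

For example, `should_save_cache true hit` prints `save`, which corresponds to forcing an update despite the earlier cache hit.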

Here's a sample CI yaml file with comments:

    - name: Cache .local
      uses: eyal0/cache@update_env_variable
      with:
        path: ~/.local
        key: my_cache_key
        # Set the environment variable to watch to be named UPDATE_CACHE
        update-env-variable: "UPDATE_CACHE"
    - name: Install parallel
      run: |
        # Get the installed version number.
        MY_VERSION=$(parallel --version | head -1 | cut -d' ' -f3)
        # Download the webpage and get the latest version number.
        LATEST_VERSION=$(curl -fsSL https://mirrors.tripadvisor.com/gnu/parallel/ | hxnormalize | pup 'a[href]' 'text{}' | grep -v latest | cut -d- -f2 | cut -d. -f1 | sort -r | head -1)
        if [[ "$MY_VERSION" != "$LATEST_VERSION" ]]; then
          # There was a cache hit but the version in the cache is old.  Get the new one.
          wget -q -T5 -t1 -O "parallel-latest.tar.bz2" "http://mirrors.tripadvisor.com/gnu/parallel/parallel-latest.tar.bz2"
          mkdir parallel && pushd parallel && tar xjf "../parallel-latest.tar.bz2" && pushd parallel-* && ./configure --prefix=${LOCAL_INSTALL_PATH} && make && make install && popd && popd
          # *****FORCE A CACHE UPDATE, DESPITE THE CACHE HIT FROM BEFORE.*****
          echo "UPDATE_CACHE=true" >> $GITHUB_ENV
        fi

That's it. Even though there was a cache hit, the cache has stale data and needs updating. By setting the environment variable, the final step that used to say:

Cache hit occurred on the primary key my_cache_key, not saving cache.

will now instead say:

Cache saving was forced by setting UPDATE_CACHE to true.
/bin/tar --posix --use-compress-program zstd -T0 -cf cache.tzst -P -C /home/runner/work/user/repo --files-from manifest.txt
Cache saved successfully

And if you don't use the environment variable, the default behavior is maintained.

Jan 04 '21 07:01 eyal0

Maintainers, it's been three months without any feedback! This is a much-requested feature. Please?

Mar 10 '21 15:03 ollydev

Unfortunately this does not work for me (anymore?). It fails with the following error: Unable to reserve cache with key build-ubuntu-ccache, another job may be creating this cache. I think the problem is already documented here: https://github.com/actions/toolkit/issues/505

Mar 15 '21 09:03 paresy

@paresy Okay, I have a solution for it now. It's a bit of a hack. The idea is to take advantage of the "key" and "restore-keys". What I do is append a random number to the key and use a restore-key without the random number. The key will never match but the restore-key will match the most recent good key. Then I set the update variable to false and only set it to true if the cache needs rewriting. This will cause the cache to be updated only if the one that was fetched, which is the latest one, is not new enough.

You can see it demonstrated below. I create a random number, append it to the key that I'm looking for but also have a restore key without that random number.

    - name: Get a random number
      run: echo "RANDOM_SUFFIX=${RANDOM}${RANDOM}" >> $GITHUB_ENV
    - name: Cache local install path
      uses: eyal0/cache@main
      with:
        path: ${{ steps.sanitize-key.outputs.path }}
        key: ${{ steps.sanitize-key.outputs.key }}-${{ env.RANDOM_SUFFIX }}
        restore-keys: |
          ${{ steps.sanitize-key.outputs.key }}-
        update-env-variable: "UPDATE_CACHE"
    - name: Default don't update cache
      run: echo "UPDATE_CACHE=false" >> $GITHUB_ENV
    - name: Install valgrind
      run: |
        if ! valgrind --version || ! grep -qx "$(git ls-remote https://github.com/eyal0/valgrind.git master)" ~/.local/bin/valgrind.version; then
          git ls-remote https://github.com/eyal0/valgrind.git master > ~/.local/bin/valgrind.version
          echo "UPDATE_CACHE=true" >> $GITHUB_ENV
          # <DO YOUR BUILD OF VALGRIND HERE>
        fi

You could use the output of date instead of a random number if you want, it doesn't matter, so long as you never collide. I would recommend against the github run_id because those get re-used if you re-run the CI.
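As a sketch of the date-based alternative, the helper below combines a UTC timestamp with a few random bytes so that two runs starting in the same second are still unlikely to collide. The function name `cache_suffix` is hypothetical, and reading from `/dev/urandom` assumes a Linux or macOS runner.

```shell
# Hypothetical helper: print a suffix that should be unique per run.
cache_suffix() {
  # UTC timestamp, e.g. 20210325190300
  stamp=$(date -u +%Y%m%d%H%M%S)
  # Four random bytes rendered as one unsigned decimal number
  rand=$(od -An -N4 -tu4 /dev/urandom | tr -d ' \n')
  printf '%s-%s\n' "$stamp" "$rand"
}
```

In a workflow step it would be used in place of the random number, for example `echo "RANDOM_SUFFIX=$(cache_suffix)" >> $GITHUB_ENV`.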

You could do all this without my patch and it would work but then it would update every single time, which would unnecessarily thrash the cache and slow down your CI. It would increase the number of times that your CI needs to start over because of cache eviction. So this solution is still better than the default behavior.

Mar 25 '21 19:03 eyal0

@eyal0 Thanks! Unfortunately my use case is ccache for C++, which needs to be written every time. And I have several actions for different platforms. Due to the 5 GB limitation, the cache is getting "cleaned up" for the other platforms if I do not run the actions equally. Therefore I really need to update the exact cache name :-)

Mar 26 '21 08:03 paresy

@paresy That will be impossible, I think, and it's a limitation of the server (on Microsoft Azure) which GitHub uses to store the cached files. That's why I suggested that maybe the cache server has a command to delete entries. That would be great!

We don't have details on the functioning of the server. Maybe the source code for the server is actually on GitHub somewhere and we could examine it? Some of the keywords that the cache server uses are: getCacheEntry and reserveCache and commitCache. So it's possible that there is a GitHub repo out there that includes those words that we could look at.

Here's the result of a search:

https://github.com/search?q=commitCache+getCacheEntry&type=code

Maybe something in one of those? I see clients but we need the server.

Mar 26 '21 16:03 eyal0

This is exactly what I'm looking for. Would be great to have this merged. 🙏

Apr 19 '21 17:04 ruudk

> This is exactly what I'm looking for. Would be great to have this merged.

I see plenty of commits to this repo but no attention to this issue so you probably should not hold your breath! :-(

Apr 19 '21 19:04 eyal0

@dhadka I agree that, in light of being unable to overwrite existing cache entries, #489 is a better idea. But I have a concern with #489: It doesn't allow you to use two caches. CACHE_SKIP_SAVE is essentially a global variable. If you want to have multiple caches in a single job then you're in trouble. It would be possible to modify #489 to add a parameter for naming the environment variable, where the default is CACHE_SKIP_SAVE. So it would work just like the original for those that want the default environment variable name and for those that want a different name, for example in the case of two caches, they could have that, too. If I wrote that code, would you be interested in it?
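A rough sketch of that proposed parameter: the variable to consult is looked up by name, falling back to CACHE_SKIP_SAVE when no name is given. The function `skip_save_requested` is hypothetical, and treating only the value `true` as a request to skip is an assumption, since the values accepted by #489 are not spelled out here.

```shell
# Hypothetical helper: succeed (exit 0) if saving should be skipped.
# $1 = name of the environment variable to consult; defaults to CACHE_SKIP_SAVE
skip_save_requested() {
  name="${1:-CACHE_SKIP_SAVE}"
  # Refuse anything that is not a plain variable name before using eval.
  case "$name" in
    ''|[0-9]*|*[!A-Za-z0-9_]*) return 2 ;;
  esac
  # Indirect lookup: read the value of the variable whose name is in $name.
  eval "value=\${$name:-}"
  [ "$value" = "true" ]
}
```

With two caches in one job, each cache step would pass its own name, for example `skip_save_requested DEPS_CACHE_SKIP_SAVE` for one and the default for the other.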

Also, being unable to overwrite an existing cache means that neither this PR nor #489 address https://github.com/actions/cache/pull/498#issuecomment-808040535 . I think that the only solution to that would be to modify the server to allow overwriting caches, which I think would be a good feature anyway but I don't know if you are able to fix that.

Apr 20 '21 16:04 eyal0

@eyal0 Good point. We would need different env vars for each cache step.

My main goal is we have a few PRs all with similar requests:

Set env var to skip saving the cache - https://github.com/actions/cache/pull/489
Make cache read-only - https://github.com/actions/cache/pull/474
Configurable save on failure - https://github.com/actions/cache/issues/92
And this PR to add an env var to force an update or disable update

So it would be 👍 if we can find some common denominator to avoid adding a bunch of similar or overlapping env vars / inputs for each special case.

Apr 21 '21 18:04 dhadka

Hmm...so something else we can consider. The inputs are re-evaluated on both the restore and save steps. For example:

      - uses: actions/cache@v2
        with:
          path: foo
          key: ${{ runner.os }}-foo
          skip-save: ${{ env.SKIP_SAVE }}
      - run: echo "SKIP_SAVE=true" >> $GITHUB_ENV

When the post step runs, the env var SKIP_SAVE is set to true and the input will get that value as well. Someone who wants a read-only cache can then just do:

      - uses: actions/cache@v2
        with:
          path: foo
          key: ${{ runner.os }}-foo
          skip-save: true

Proof of concept - https://github.com/dhadka/test-env/runs/2403284954?check_suite_focus=true

Apr 21 '21 18:04 dhadka

Yeah, that's interesting. I definitely wouldn't have expected that late evaluation. That makes it pretty easy to use different environment variables for each thing.

I don't really follow how the proof of concept that you sent correlates to the problem at hand, though. I think that the most convincing would be to have an example that uses two caches and to show how each one can have a different result: One saving, one not.

#489 is nice in that the code is really, really short but I think that it doesn't provide much output so it could be potentially difficult for users to see whether or not the cache saving was skipped and for what reason.

Anyway, assuming that the concept that you are proving works as expected, how do we incorporate that into something like #489?

Apr 21 '21 23:04 eyal0

I don't see how it'll work like you've suggested. Those input variables aren't available to the post option.

https://github.com/eyal0/test-cache/runs/2405477347?check_suite_focus=true

Nowhere to be found are skip-save, path, key, restore-keys. Perhaps they are only available to the code that is run by the post but not to the post-if. I looked in the entire github object and the env object.

Apr 22 '21 00:04 eyal0

> @eyal0 Good point. We would need different env vars for each cache step.
>
> My main goal is we have a few PRs all with similar requests:
>
> Set env var to skip saving the cache - Allow to skip the save post-step #489
> Make cache read-only - Add read-only feature #474
> Configurable save on failure - Configurable save cache on failure #92
> And this PR to add an env var to force an update or disable update
>
> So it would be 👍 if we can find some common denominator to avoid adding a bunch of similar or overlapping env vars/inputs for each special case.

Hi @dhadka!

I read all the PRs you mentioned here and I see the following solution(s) to all of them.

The 1st one is temporary: add one more input named post-action (just as an example) with the following values:

on_miss - create cache only when we had any of 'key miss' during restore. this is the default;
on_explicit_miss - create cache when we had a miss by key but not by restore-keys
always - always create cache
never - never create cache

this should cover most of the cases people want to solve in their workflows
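The four values could be sketched as a decision function like the one below. It is only one reading of the list above: `on_explicit_miss` is interpreted literally as "the primary key missed but a restore-key matched", which is an assumption, and the function name and the `exact`/`partial`/`none` outcome labels are hypothetical.

```shell
# Hypothetical helper: succeed (exit 0) if the post step should create the cache.
# $1 = post-action value (empty means the default, on_miss)
# $2 = restore outcome: exact (primary key hit), partial (restore-key hit), none
should_create_cache() {
  case "$1:$2" in
    always:*)                 return 0 ;;
    never:*)                  return 1 ;;
    on_explicit_miss:partial) return 0 ;;  # assumed reading, see the lead-in
    on_explicit_miss:*)       return 1 ;;
    on_miss:exact|:exact)     return 1 ;;  # primary key hit: nothing to save
    on_miss:*|:*)             return 0 ;;  # any miss on the primary key
    *)                        return 2 ;;  # unknown post-action value
  esac
}
```

A caller could then write `if should_create_cache "$POST_ACTION" "$RESTORE_OUTCOME"; then ...; fi`, where both variables are placeholders for whatever the action records during restore.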

The 2nd one I would call more permanent: it brings more flexibility but, consequently, breaking changes. And more effort, of course.

If you read carefully, what the author of https://github.com/actions/cache/pull/474 ended up with is separate actions for restoring the cache and saving it. From what I recall, there is another issue with exactly the same suggestion.

This would allow:

making the save cache step depend on other things that happen in the flow after the cache restore
having a separate key strategy for the 'save cache' step
anything else?

Edit: add one more value to 1st solution

Aug 03 '22 09:08 ZuBB

Any updates on this? I think everyone expects some kind of cache policy that would make it possible to skip the post-step save. GitLab has this feature:

https://docs.gitlab.com/ee/ci/yaml/index.html#cachepolicy

Nov 03 '22 00:11 asvny