Add skip-save feature
The skip-save option allows users to force the cache to skip saving the cache. skip-save can be set to a constant or to an expression that will be evaluated at the end of the CI job, such as an environment variable.
This is for #498
@dhadka Please take a look at this. It implements the "skip-save" idea that you had.
Testing is here: https://github.com/eyal0/test-cache/runs/2406514627?check_suite_focus=true
If you look at the log of the CI, you'll see that SKIP_FOR_FOO was set to true so only the foo cache is skipped. The bar cache is not skipped. This is apparent later in the log where it says:
Cache saved with key: bar-560227097
and
Cache saving was disabled by setting skip-save.
This should fix up a lot of the issues that were filed. One thing that it won't fix is forcing a save even when CI fails. That was the request in #92. It had 72 users that want this feature so, you know, kind of popular! What to do about this? My #498 could partially solve this because it has the "always/never" mechanic where you can specify that you want to force a save, even if there is a failure! We would just remove the post-if and check for success inside the code.
What are your thoughts? Seems like it would be important to get this right the first time, rather than modify the API to caching many more times!
Maybe skip-save should have no default value and the code can detect this and perform the default behavior. And when skip-save is false then it will always save, like was wanted in #92? And when true, then it'll skip the save, like the other issues requested.
That makes skip-save a tri-state variable, though: true, false, and default. What's worse, it has a sort of negative logic where true means don't save and false means "do save. So true means no and false means yes. :disappointed: This is going to be confusing for people, I'm certain.
Anyway, we could leave it as-is and have the default for skip-save be failure(). Then saving would be skipped when there is a failure, just like in the post-if. And then remove the post-if. That would fix #92 because the requesters of that feature could just set the skip-save to false and then it would save even if the job fails, which is what they wanted. Default behavior is unchanged. But then skip-save false is more like a "force save", which isn't intuitive from the name in my opinion. Maybe better to call it, like, should-save or save-condition and let the default be success(). Users that never want to save put false, always want to save (except for primary key hit) put true, and the default is the current behavior. Of course, users can use an environment variable in there and control the true/false during the job if they want.
What do you think is the best solution?
@eyal0 Thanks for putting this together! I'm out this week but CC'ing @konradpabjan for review.
I vote for save-condition it covers both of the most requested features: skip save & save even if failed.
Providing it can be controlled during the job (like you said) this is great. 👍
This fixes #272
- What's up with
package-lock.jsonhaving so many changes?
It's some npm thing I guess? I don't know much about node.js package management. I think that when I typed npm install to get all the dependencies, it might have updated that. Maybe there is someone that knows something about npm that can help us out? I just barely even know javascript. :-)
Documentation is a good idea, for sure. I'll do that.
What do you think about https://github.com/actions/cache/pull/571#issuecomment-824521755 and #92? this is a place where it probably doesn't make sense to have an incremental solution. We don't want to release a new parameter, modifying the API, and then have to do it again! Better to get it right the first time. The concerns in #92 seem totally reasonable and this PR will definitely not address them.
I think that a better way to think about this PR is not just how to fix the problem but rather, to think about: How do we want the API to behave?
What's up with package-lock.json having so many changes?
I think this is because package.json specifies the versions with ^, such as ^1.0.5, which I believe means it will update to the latest version of that dependency (excluding a major version [breaking] change) whenever npm update or npm install is run.
What's up with package-lock.json having so many changes?
I think this is because
package.jsonspecifies the versions with^, such as^1.0.5, which I believe means it will update to the latest version of that dependency (excluding a major version [breaking] change) whenevernpm updateornpm installis run.
It looks like eyal0 is using NPM 7, which has upgraded the package-lock.json to "lockfileVersion: 2" format. The installed package versions are the same as the ones in the current lock file.
The PR doesn't modify the package.json, so it should be safe to discard the changes to package-lock.json.
I think that I've addressed all comments. In order to keep the commit history, I haven't rebased. If you want, rebase and squash.
(I still worry about #92 . That is a reasonable request and we haven't left room to fix it here.)
I think that a better way to think about this PR is not just how to fix the problem but rather, to think about: How do we want the API to behave?
Just thinking of options to unify this and #92...
If we offered separate save and restore steps, we could essentially move the post-if logic, which is defined by the Action, to the if: condition in a user's workflow. For example:
Always save cache:
- uses: actions/cache-restore@v2
path: foo/
key: ${{ runner.os }}-foo-${{ hashFiles('**/package-lock.json') }}
- ...
- uses: actions/cache-save@v2
path: foo/
key: ${{ runner.os }}-foo-${{ hashFiles('**/package-lock.json') }}
if: always()
Skip saving if env var set (or any other valid expression):
- uses: actions/cache-restore@v2
path: foo/
key: ${{ runner.os }}-foo-${{ hashFiles('**/package-lock.json') }}
- ...
- uses: actions/cache-save@v2
path: foo/
key: ${{ runner.os }}-foo-${{ hashFiles('**/package-lock.json') }}
if: env.SKIP_SAVE != 'true'
Other useful scenarios I can think of are:
- Saving the cache earlier in a workflow
- Skip saving on certain branches (to avoid going over the 5 GB quota)
@eyal0 thoughts on this approach?
(We'd of course continue to offer the current action in its current form for those that just need the default behavior. This approach has some downsides like needing to specify the path / key in both steps, maintaining multiple actions, more burden on users to figure out the correct setup, etc. There are also some implementation details we'd need to think about...the Action currently passes some info from the restore to the save step using state variables, and we'd need to see if and how that's affected when the Action is split into separate steps.)
The recent surge in misuse of actions by malicious third parties has highlighted the importance of a security first approach to actions, as has been described by the GitHub Security Lab in Keeping your GitHub Actions and workflows secure: Untrusted input:
When writing custom GitHub Actions and workflows, consider that your code will often run with repo write privileges on potentially untrusted input. Keep in mind that not all GitHub event context data can be trusted equally. By adopting the same defensive programming posture you would employ for any other privileged application code you can ensure that your GitHub workflows stay as secure as the actual projects they service.
The simple and convenient read + write nature of actions/cache is great for adding caching with ease, but it encourages people -- myself included -- to add cache writing without considering the potential for misuse. Although it's increasing the burden during implementation, I agree with @dhadka that the ideal design for this action is save as an opt-in decision (default to not save) and I'd also say that any examples should be updated to provide a clear indication that writing to the cache should only happen from trusted branches (e.g: pipelines on main can write cache, pipelines on Pull Requests should not).
@dhadka Small edit to your first example to add multiple restore keys:
- uses: actions/cache-restore@v2
path: foo/
restore-keys: |
${{ runner.os }}-foo-${{ hashFiles('**/package-lock.json') }}
${{ runner.os }}-foo-
- ...
- uses: actions/cache-save@v2
path: foo/
key: ${{ runner.os }}-foo-${{ hashFiles('**/package-lock.json') }}
if: always()
I think splitting the action up would make the key configuration clearer. The restore action would have a simple list of keys to try in order, while the save action only has a single key.
Perhaps actions/cache-restore would need to output the matched key so that actions/cache-save could be skipped when the key didn't change.
I dunno...
For me, I just need somewhere to store and load files, you know? If the action cache were just an FTP server where I could upload and download files and it had a some quota on how many files each user project could store, that'd be fine. All this discussion about how to skip saving or force saving or the inability to overweight files, it's all just users that want to get basic stuff that a file server could do but access is difficult because actions/cache is an API to a file server that is very opinionated about how things should be stored, with primary keys and restore keys and no overwriting and the like.
So if the interface is going to change, just make the action behave like an FTP server where every project gets a little storage. FTP is already well understood and tested. And if people have a strong opinion about how to implement caching on top of an FTP server, cool. Let them right their own actions on top of it.
Anyway, my two cents. I get that there might be all sorts of business reasons why GitHub didn't just say: "Hey, here's an FTP server that you can use for you project, up to 5GB. Go ahead and build your own stuff on top of it, like a cache or whatever." Maybe there's a good reason why it isn't like that.
Hey, we are looking for this PR to be merged
Any update on this pull request?
I am not familiar with typescript, but I would be happy to help.
Would love to see this feature enabled!
Any update on this pull request? I am not familiar with typescript, but I would be happy to help. Would love to see this feature enabled!
Nothing changed, I had to fork and made development for myself
Related to https://github.com/actions/cache/issues/795 @aparna-ravindra @Noste @yadigarbz @eyal0 @kotewar Is there anything I could do to help this getting merged? I'd happy to contribute in any way I can so I don't have to keep a fork for us 😅
I prefer the other way with the update-env-variable so I'll close this one and instead work on this: https://github.com/actions/cache/pull/498
I think that it's better.
Hey all, 👋🏽
We have created a discussion with a proposal that will solve this problem, do let us know your feedback, thanks 😊