Only garbage collect action cache entries
The cache should only garbage collect action cache entries based on last access time. CAS entries should have a reference count and only be collected if they are no longer referenced by any cache entry.
Can we assume that each CAS entry is referenced by only one AC entry, so that when we delete an AC entry, we can simply delete all the CAS entries it references?
No, that's generally not true. A blob is often referenced by thousands of action cache entries. I am currently implementing this feature and adding a reference count to each CAS entry, so we'll only evict a CAS entry when it's no longer referenced.
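To make the reference-counting idea concrete, here is a minimal in-memory sketch. It is an illustration under stated assumptions, not the implementation being worked on: `Cache`, `PutAC`, `GetAC` and `EvictOldest` are hypothetical names, digests are plain strings, and a logical counter stands in for last access time.

```go
package main

// acEntry is a hypothetical action cache record: the blobs it references
// and a logical timestamp for its last access.
type acEntry struct {
	blobs      []string
	lastAccess int64
}

// Cache is a toy in-memory model, not the server's actual data structure.
type Cache struct {
	ac   map[string]*acEntry
	refs map[string]int // CAS digest -> number of AC references to it
	tick int64          // logical clock used instead of wall time
}

func NewCache() *Cache {
	return &Cache{ac: map[string]*acEntry{}, refs: map[string]int{}}
}

// PutAC stores an AC entry and increments the count of every blob it references.
func (c *Cache) PutAC(key string, blobs []string) {
	c.removeAC(key) // on overwrite, release the old references first
	c.tick++
	c.ac[key] = &acEntry{blobs: blobs, lastAccess: c.tick}
	for _, b := range blobs {
		c.refs[b]++
	}
}

// GetAC returns the referenced blobs and marks the entry as recently used.
func (c *Cache) GetAC(key string) ([]string, bool) {
	e, ok := c.ac[key]
	if !ok {
		return nil, false
	}
	c.tick++
	e.lastAccess = c.tick
	return e.blobs, true
}

// removeAC deletes an AC entry and returns the blobs whose count dropped to zero.
func (c *Cache) removeAC(key string) []string {
	e, ok := c.ac[key]
	if !ok {
		return nil
	}
	delete(c.ac, key)
	var freed []string
	for _, b := range e.blobs {
		c.refs[b]--
		if c.refs[b] == 0 {
			delete(c.refs, b)
			freed = append(freed, b)
		}
	}
	return freed
}

// EvictOldest removes the least recently used AC entry. Only blobs that no
// other AC entry references are reported as collectable.
func (c *Cache) EvictOldest() (key string, freed []string) {
	found := false
	var oldest int64
	for k, e := range c.ac {
		if !found || e.lastAccess < oldest {
			key, oldest, found = k, e.lastAccess, true
		}
	}
	if !found {
		return "", nil
	}
	return key, c.removeAC(key)
}
```

In this sketch, a blob shared by two AC entries survives the eviction of the first one and is only reported as collectable once the second is evicted as well.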
Then, when a client PUTs AC and CAS entries to the server, is the following sequence always guaranteed, so that the server stays correct?
- All the CAS entries referenced by an AC entry are PUT to the server first
- Then the AC entry is PUT to the server
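The sequence being asked about can be sketched as follows. This is only an illustration of the ordering, not a statement of what Bazel actually guarantees: `Store`, `MemStore` and `UploadResult` are hypothetical names, and the in-memory store assumes a server that rejects an AC entry whose blobs are not yet present.

```go
package main

import (
	"fmt"
	"sort"
)

// Store is a hypothetical client-side view of the cache server.
type Store interface {
	PutCAS(digest string, data []byte) error
	PutAC(key string, digests []string) error
}

// MemStore is an in-memory stand-in. It rejects an AC entry whose blobs have
// not been uploaded yet and records the order of successful writes.
type MemStore struct {
	CAS map[string][]byte
	AC  map[string][]string
	Log []string
}

func NewMemStore() *MemStore {
	return &MemStore{CAS: map[string][]byte{}, AC: map[string][]string{}}
}

func (m *MemStore) PutCAS(digest string, data []byte) error {
	m.CAS[digest] = data
	m.Log = append(m.Log, "cas/"+digest)
	return nil
}

func (m *MemStore) PutAC(key string, digests []string) error {
	for _, d := range digests {
		if _, ok := m.CAS[d]; !ok {
			return fmt.Errorf("ac/%s references missing blob %s", key, d)
		}
	}
	m.AC[key] = digests
	m.Log = append(m.Log, "ac/"+key)
	return nil
}

// UploadResult writes every output blob first and the AC entry last, so the
// AC entry never becomes visible before the blobs it points to.
func UploadResult(s Store, key string, outputs map[string][]byte) error {
	digests := make([]string, 0, len(outputs))
	for d := range outputs {
		digests = append(digests, d)
	}
	sort.Strings(digests) // deterministic order, for readability only
	for _, d := range digests {
		if err := s.PutCAS(d, outputs[d]); err != nil {
			return fmt.Errorf("put cas/%s: %w", d, err)
		}
	}
	return s.PutAC(key, digests)
}
```

Note that this ordering only covers the upload itself. It says nothing about blobs that are evicted later, which is where reference counting or a client-side fallback comes in.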
Also, is it possible to let Bazel back off when a GET for an AC entry returns 200 but the corresponding CAS entry is not found on the server? I found that when this happens, the Bazel build may fail. Bazel's requirement that every CAS entry referenced by an AC entry always exists on the server has two drawbacks:
- We have to keep track of the reference count of each CAS object for garbage collection
- For horizontal scaling and load balancing, consistency among servers can be hard to maintain

The ideal Bazel behaviour would be the following: every CAS entry can be traced back to the AC entry whose action generated it. If a CAS entry referenced by an AC entry is missing, Bazel backtracks to that AC entry and re-runs the corresponding action to regenerate the missing CAS entry. This would immediately fix the two drawbacks mentioned above.
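A minimal sketch of that fallback, assuming hypothetical names throughout (`Remote`, `ActionResult`, `Resolve`, `FakeRemote`): an AC hit is only trusted if every blob it references can be found, and otherwise the action is re-run as if the lookup had missed. It does not reflect how Bazel or the server actually implement this.

```go
package main

// ActionResult is a hypothetical, simplified AC entry: just the digests of
// the output blobs it references.
type ActionResult struct {
	OutputDigests []string
}

// Remote is a hypothetical view of the cache server.
type Remote interface {
	GetAC(key string) (ActionResult, bool)
	HasCAS(digest string) bool
}

// Resolve returns the cached result only if every referenced blob is present.
// Otherwise it treats the lookup as a miss and calls execute to re-run the
// action. The second return value reports whether the cache was used.
func Resolve(r Remote, key string, execute func() ActionResult) (ActionResult, bool) {
	if res, ok := r.GetAC(key); ok {
		complete := true
		for _, d := range res.OutputDigests {
			if !r.HasCAS(d) {
				complete = false
				break
			}
		}
		if complete {
			return res, true
		}
	}
	return execute(), false
}

// FakeRemote is an in-memory stand-in used to exercise Resolve.
type FakeRemote struct {
	AC  map[string]ActionResult
	CAS map[string]bool
}

func (f FakeRemote) GetAC(key string) (ActionResult, bool) {
	res, ok := f.AC[key]
	return res, ok
}

func (f FakeRemote) HasCAS(digest string) bool {
	return f.CAS[digest]
}
```

With a fallback like this, a dangling AC entry costs one re-execution instead of failing the build, at the price of an extra existence check per referenced blob.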
> Also, is it possible to let Bazel back off when a GET for an AC entry returns 200 but the corresponding CAS entry is not found on the server?
This is implemented in my gRPC WIP in #97, and I filed #106 for a backwards-compatible solution for HTTP.
@bayareabear ^