
Improve cache cleanup mechanism

Open · jrosdahl opened this issue on May 06, 2019 · 5 comments

Background

The cache cleanup mechanism has worked essentially the same ever since ccache was initially created in 2002:

  • The total number and size of all files in one of the 16 subdirectories (AKA level 1) is kept in the stats file in said subdirectory.
  • On a cache miss, the new compilation result file is written (based on the first digit of the hash) to one of those 16 subdirectories and the stats file is updated accordingly.
  • Automatic cleanup is triggered if the size of the subdirectory becomes larger than max_size / 16.
  • ccache then lists all files in the subdirectory recursively, stats them to check their size and mtime, sorts the file list on mtime and deletes the oldest 20% of the files (by default; see config option limit_multiple).
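In rough code, the procedure in the last bullet looks something like this (a simplified illustration, not ccache's actual implementation; 0.8 stands in for limit_multiple):

```cpp
// Simplified sketch of the current cleanup (illustration only, not ccache's
// real code). "0.8" stands in for limit_multiple.
#include <algorithm>
#include <cstdint>
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

void clean_subdir(const fs::path& subdir, std::uintmax_t max_subdir_size)
{
    struct Entry { fs::path path; fs::file_time_type mtime; std::uintmax_t size; };
    std::vector<Entry> entries;
    std::uintmax_t total = 0;

    // List and stat every file in the level 1 subdirectory (problems C and D).
    for (const auto& de : fs::recursive_directory_iterator(subdir)) {
        if (!de.is_regular_file()) continue;
        entries.push_back({de.path(), de.last_write_time(), de.file_size()});
        total += entries.back().size;
    }

    // Sort on mtime, oldest first.
    std::sort(entries.begin(), entries.end(),
              [](const Entry& a, const Entry& b) { return a.mtime < b.mtime; });

    // Delete the oldest files (problem E) until the subdirectory is down to
    // limit_multiple of its size limit.
    const auto target = static_cast<std::uintmax_t>(0.8 * max_subdir_size);
    for (const auto& e : entries) {
        if (total <= target) break;
        fs::remove(e.path);
        total -= e.size;
    }
}
```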

Problems

Some problems with the current cleanup mechanism:

  • (A) If several concurrent ccache invocations result in a cache miss and write their result to the same subdirectory then all of them will start cleaning up the same subdirectory simultaneously.
  • (B) The ccache invocation that resulted in a cache miss will perform cleanup and then exit, which means that an arbitrary ccache process that happens to trigger cleanup will take a long time to finish.
  • (C) Listing all files in a subdirectory of a large cache can be quite slow.
  • (D) stat-ing all files in a subdirectory of a large cache can be quite slow.
  • (E) Deleting many files can be quite slow.
  • (F) Since a cleanup by default removes 20% of the files in a subdirectory, the actual cache size will (once the cache limit is reached) on average hover around 90% of the configured maximum size, which can be confusing.
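For intuition behind (F): cleanup trims a subdirectory to limit_multiple (0.8 by default) of its share of the limit, after which it fills back up to 100% before the next cleanup triggers, so the steady-state size follows a sawtooth that averages roughly

$$\frac{0.8 + 1.0}{2} \cdot \texttt{max\_size} = 0.9 \cdot \texttt{max\_size}$$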

Ideas

Here are some ideas on how to improve the cleanup mechanism while staying reasonably backward compatible with older ccache versions:

  1. Locking: When starting cleanup, acquire a lock on the level 1 subdirectory and release it when cleanup is finished. On a cache miss, check if cleanup is already in progress. If so, don't store the result in the cache and increment the statistics counter "cleanup in progress".
    • This addresses problem A.
    • Tricky part: When to break the lock due to a killed or hung cleanup process.
  2. Cleanup in the background: When cleanup is triggered, fork a process that will do cleanup in the background and exit early to the caller.
    • This addresses problem B, and to some extent C, D and E since it will be less of a problem that they take a long time.
  3. Randomized eviction: Start by retrieving a list of all files. Then choose, say, 10 random file entries and stat them to get size and mtime. Delete the oldest of the 10 files. If the cache size is still above the limit, stat 10 more files and delete the oldest of the 19 files, and so on (see the sketch below the list).
    • This addresses problem D since only 10% of the files will be stat-ed.
    • Major disadvantage: Automatic cleanup will no longer be able to correct incorrect "actual size" counters.
    • Minor disadvantage: Cleanup might delete entries that are a bit too new, but on average this won't be a problem over time.
  4. Check all level 1 stats files: On a cache miss, check all level 1 stats files to get an accurate view of whether cleanup should be initiated.
    • This addresses problem F.
    • Disadvantage: May trigger multiple cleanups if there are more cache misses for other subdirectories before the subdirectory is done cleaning. An alternative would be to clean the largest subdirectory instead of the current, but then it becomes tricky to return with "cleanup in progress" without introducing a global.
  5. Optionally clean on level 2: If the subdirectory is large, choose a random level 2 subdirectory to clean.
    • This addresses problems C, D and E.
    • Major disadvantage: Automatic cleanup will no longer be able to fix incorrect "actual number of files" and "actual size" counters.
  6. Increase limit_multiple: If other measures make cleanup faster or less of a problem we could increase limit_multiple from 0.8 to, say, 0.9 or 0.95.
    • This addresses problem F.
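To make idea 3 more concrete, here is a rough sketch of the sampling loop (illustrative only, not an implementation proposal; the sample size of 10 comes from the description above):

```cpp
// Illustration of idea 3 (randomized eviction); not an implementation proposal.
#include <algorithm>
#include <cstdint>
#include <filesystem>
#include <random>
#include <vector>

namespace fs = std::filesystem;

void randomized_eviction(std::vector<fs::path> all_files,  // listed up front
                         std::uintmax_t actual_size,       // from the stats file
                         std::uintmax_t size_limit)
{
    struct Entry { fs::path path; fs::file_time_type mtime; std::uintmax_t size; };
    std::vector<Entry> pool;

    std::mt19937 rng{std::random_device{}()};
    std::shuffle(all_files.begin(), all_files.end(), rng);  // random order
    std::size_t next = 0;

    while (actual_size > size_limit && next < all_files.size()) {
        // stat 10 more randomly chosen files into the candidate pool; only a
        // fraction of the cache ever gets stat-ed (problem D).
        for (int i = 0; i < 10 && next < all_files.size(); ++i, ++next) {
            const fs::path& p = all_files[next];
            pool.push_back({p, fs::last_write_time(p), fs::file_size(p)});
        }
        // Evict the oldest candidate seen so far (approximate LRU).
        auto oldest = std::min_element(
            pool.begin(), pool.end(),
            [](const Entry& a, const Entry& b) { return a.mtime < b.mtime; });
        actual_size -= oldest->size;
        fs::remove(oldest->path);
        pool.erase(oldest);
    }
}
```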

jrosdahl · May 06 '19 19:05

It should be noted that some of the alternative storage methods, such as memcached, have built-in cleanup. This means you can continue to add to the cache, and the "oldest" entries will be dropped automatically...

https://en.wikipedia.org/wiki/Cache_replacement_policies#Least_recently_used_(LRU)

There are some other interesting cache replacement policies as well, one of the more notable being the Adaptive Replacement Cache (ARC), or perhaps some variation on LRU like 2Q (a variant of LRU-K).

https://en.wikipedia.org/wiki/Adaptive_replacement_cache

afbjorklund · May 07 '19 17:05

I'm not really sure what to make of your comment... Is it to add some general "by the way, interesting to know" info, or do you have ideas that are applicable to improving the current file-based storage?

jrosdahl · May 07 '19 20:05

@jrosdahl: I'm waiting for some more information on which proposal you intended with this issue... The information was mostly "interesting to know", even if it does apply to the new storage backends.

afbjorklund · May 08 '19 17:05

First of all, I think the "Caveats" section in the documentation should mention the disk performance problem and long, unexpected pauses that the automatic clean-up introduces.

I got hit by this not that long ago, and it took me a while to realise that ccache's automatic cleaning was the cause. At first, I even thought that my disk was failing, or that my Linux system was thrashing.

This has been known for a long time, and this very same GitHub issue describes it: "an arbitrary ccache process that happens to trigger cleanup will take a long time to finish". The disk thrashing aspect does impact the overall system performance, because you do not expect a ccache invocation to hammer the disk for a long time.

There is no reason to let other users waste time on this matter like I did. If the "Caveats" section mentions it, you know what you are in for.

If so, don't store the result in the cache and increment the statistics counter "cleanup in progress".

In this scenario, maybe overshooting the limit is not that bad.

If overshooting the limit is to be avoided at all costs, I wouldn't choose to stop storing the result in the cache. That would be unexpected. I would rather wait on the lock.

Tricky part: When to break the lock due to a killed or hung cleanup process.

On Linux, flock automatically releases the lock when the file descriptor is closed. That should include the cases where the process is killed. If a ccache process hangs, then you have bigger problems anyway. 8-)

The presence of lock files that have been left behind could serve as an indication that something is wrong. Those lock files could contain the PID of the creating process (as text), in order to help troubleshooting.
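Here is a minimal sketch of what I mean (the helper name is made up; the point is that closing the descriptor, including the implicit close on kill, releases the lock):

```cpp
// Sketch: try to take the cleanup lock with flock(2) and record our PID in
// the lock file as text. Helper name and path handling are made up.
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>
#include <cstdio>

// Returns the lock fd on success, or -1 if another process holds the lock.
int try_acquire_cleanup_lock(const char* lock_path)
{
    int fd = open(lock_path, O_RDWR | O_CREAT | O_CLOEXEC, 0666);
    if (fd < 0) return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {  // non-blocking: already locked
        close(fd);
        return -1;  // treat as "cleanup in progress"
    }
    // The PID helps troubleshooting if the lock file is ever left behind.
    char buf[32];
    int len = std::snprintf(buf, sizeof(buf), "%ld\n", (long)getpid());
    (void)ftruncate(fd, 0);
    (void)write(fd, buf, (size_t)len);
    return fd;  // keep open while cleaning; close() releases the lock,
                // even if the process is killed
}
```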

Cleanup in the background: When cleanup is triggered, fork a process that will do cleanup in the background and exit early to the caller.

This is highly problematic. It could prevent the user from unmounting a partition, because some invisible processes are running in the background and are still using the mount. I actually hit that problem recently, so I wrote a script to wait until a process and all of its children have finished (that is not as straightforward as you may think):

https://github.com/rdiez/Tools/tree/master/RunAndWaitForAllChildren

Some commands, like systemd-run, may also wait for all children, so the user may be surprised that the compilation has apparently finished but the command is still waiting for it to complete. I am not sure, but systemd-run itself may even kill the whole cgroup after a timeout once the first process finishes.
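For reference, backgrounding would presumably be done with the classic double fork (a generic sketch, not anything ccache currently does; do_cleanup() is a placeholder), and that is exactly what makes the worker invisible to anything waiting on the caller:

```cpp
// Generic double-fork detach; do_cleanup() is a placeholder.
#include <sys/wait.h>
#include <unistd.h>

void do_cleanup();  // hypothetical: the actual cleanup work

void spawn_background_cleanup()
{
    pid_t pid = fork();
    if (pid != 0) {
        waitpid(pid, nullptr, 0);  // reap the short-lived intermediate child
        return;                    // caller "finishes" immediately
    }
    if (fork() != 0) _exit(0);  // intermediate child exits at once
    setsid();                   // grandchild detaches; it is reparented to
                                // init/systemd and keeps the mount busy
    do_cleanup();
    _exit(0);
}
```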

Randomized eviction

I do not think that's a good idea. And I would like to see mathematical proof that the statement "on average this won't be a problem over time" is true. 8-)

stat-ing all files in a subdirectory of a large cache can be quite slow.

Some years ago, there was a discussion about providing an alternative syscall named 'xstat':

https://lwn.net/Articles/394298/

Apparently, there were some flags to prevent the kernel from always returning all available information, which can boost performance. Instead of 'xstat', my kernel has a 'statx' syscall with such flags. I wonder if that is the way to go.
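Something along these lines, perhaps (just a sketch; statx() needs Linux >= 4.11 and glibc >= 2.28):

```cpp
// Fetch only size and mtime via statx(2); the mask tells the kernel which
// fields we actually need.
#include <fcntl.h>     // AT_FDCWD, AT_SYMLINK_NOFOLLOW
#include <sys/stat.h>  // statx(), struct statx

bool get_size_and_mtime(const char* path, long long& size, long long& mtime_sec)
{
    struct statx stx;
    if (statx(AT_FDCWD, path, AT_SYMLINK_NOFOLLOW,
              STATX_SIZE | STATX_MTIME, &stx) != 0) {
        return false;
    }
    size = (long long)stx.stx_size;
    mtime_sec = (long long)stx.stx_mtime.tv_sec;
    return true;
}
```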


An alternative missing from the list of options is to have some global daemon dedicated to cache clean-up, but that would be a radical departure. I wonder whether a global process could speed up general cache management. At the moment, each ccache instance must start from scratch by parsing the cache configuration file.

rdiez · Jun 15 '22 09:06

First of all, [...] should [...] This has been known for a long time [...] There is no reason to let other users waste time on this matter like I did.

To me, it sounds like you are telling us between the lines that we haven't done our job. I don't appreciate this attitude.

I would appreciate it if you stop writing these entitled "this should have [...]" and "there's no reason to [...]" type comments. They are tiresome and demotivating, and you're not helping. If you're interested in getting and contributing to better free software, I suggest asking yourself "how can I make people happy and motivated to work on software I'm interested in?".

Regarding how long it has been known, I'm fairly sure that it has been known since the beginning (in 2002, when Tridge implemented it) that the cleanup mechanism is naive and not optimal. But in practice, the simplistic approach seems to have worked well enough for many people. It has for me, using 500 GB caches on rotating disks on servers.

If so, don't store the result in the cache and increment the statistics counter "cleanup in progress".

In this scenario, maybe overshooting the limit is not that bad.

It's not that overshooting the limit is very bad; it's that the cleanup process will not be able to count the files correctly if another process sticks a new file into the cache during the process. That's something I think would be good to fix when reworking how cleanup is performed.

Tricky part: When to break the lock due to a killed or hung cleanup process.

On Linux, flock automatically releases the lock when the file descriptor is closed. That should include the cases where the process is killed. If a ccache process hangs, then you have bigger problems anyway. 8-)

That would work fine for local file systems, but ccache intentionally doesn't use BSD (flock) or POSIX (lockf/fcntl) locks, since having the cache on NFS is a supported use case and file locking is known not to work well on some NFS setups.

If a ccache process performing maintenance is stopped by the user, for instance if the user pauses the build with SIGTSTP (Ctrl+Z), then I think that other running ccache processes should not be blocked indefinitely. That's why I would prefer breaking the lock after a timeout if the lock holder isn't active.
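One possible scheme (just a sketch; the names and the 30-second timeout are made up): the lock holder refreshes the lock file's mtime while it works, and other processes may break the lock once the file looks stale:

```cpp
// Sketch of timeout-based lock breaking; names and the 30 s value are assumed.
#include <chrono>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;
using namespace std::chrono_literals;

constexpr auto k_lock_timeout = 30s;  // assumed value

// The lock holder calls this regularly from its cleanup loop so the lock
// never looks stale while it is still making progress.
void keep_lock_alive(const fs::path& lock_file)
{
    fs::last_write_time(lock_file, fs::file_time_type::clock::now());
}

// Other processes may break the lock once it has not been refreshed in time,
// e.g. because the holder was killed or stopped with SIGTSTP.
bool lock_is_stale(const fs::path& lock_file)
{
    std::error_code ec;
    const auto mtime = fs::last_write_time(lock_file, ec);
    if (ec) return false;  // lock file gone: nothing to break
    return fs::file_time_type::clock::now() - mtime > k_lock_timeout;
}
```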

Cleanup in the background: When cleanup is triggered, fork a process that will do cleanup in the background and exit early to the caller.

This is highly problematic.

If this is ever implemented it will be configurable. By the way, did you know that Git also does this occasionally after a git commit, even by default?

Randomized eviction

I do not think that's a good idea. And I would like to see mathematical proof that the statement "on average this won't be a problem over time" is true. 8-)

Because otherwise you won't approve it?

Approximating LRU by random sampling is not a novel or crazy idea. For instance, Redis's default "allkeys-lru" mode uses a similar method. It's "just a cache", and being very precise about what to evict is not that important. The current naive cleanup mechanism already isn't precise, since it will evict entries from the chosen subdirectory that are newer than older entries in other subdirectories.

jrosdahl · Jun 26 '22 09:06