cache: better locking
- Use an actual file lock instead of file presence
- Do not wait a full second on a lock conflict (retry interval reduced to 0.05s)
- The actual locking code is from https://github.com/benediktschmitt/py-filelock (public domain code); we can either add the package as a dependency or just extract the relevant code, whichever you prefer.
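
For illustration, the idea in the list above (an OS-level lock on the file rather than a check for its presence, retried every 0.05s) could look roughly like the sketch below. This is not the code from this PR or from py-filelock; the class name `SimpleFileLock`, its `poll_interval` and `timeout` parameters, and the use of `fcntl.flock` (POSIX only) are assumptions made for the example.

```python
import errno
import fcntl
import os
import time


class SimpleFileLock:
    """Hypothetical advisory lock on a lock file (POSIX only)."""

    def __init__(self, path, poll_interval=0.05, timeout=10.0):
        self.path = path
        self.poll_interval = poll_interval  # seconds between retries
        self.timeout = timeout              # give up after this many seconds
        self._fd = None

    def acquire(self):
        # The file may already exist; its presence alone means nothing.
        fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o644)
        deadline = time.monotonic() + self.timeout
        while True:
            try:
                # Non-blocking exclusive lock; raises if someone else holds it.
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError as exc:
                if exc.errno not in (errno.EAGAIN, errno.EACCES):
                    os.close(fd)
                    raise
                if time.monotonic() >= deadline:
                    os.close(fd)
                    raise TimeoutError(f"could not lock {self.path}")
                time.sleep(self.poll_interval)
            else:
                self._fd = fd
                return

    def release(self):
        if self._fd is not None:
            fcntl.flock(self._fd, fcntl.LOCK_UN)
            os.close(self._fd)
            self._fd = None

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.release()
```

Used as `with SimpleFileLock("cache.lock"): ...`, a stale lock file left behind by a crashed process no longer blocks anyone, since the kernel drops the lock when the holder exits.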
Background: this code is part of the reason for the CI slowdowns in mirgecom (especially with profiling).
Thanks! I think it's probably better to add this as a dependency instead. (https://github.com/conda-forge/filelock-feedstock also already exists.)
The main reason I didn't use the package is that https://github.com/benediktschmitt/py-filelock/blob/master/filelock.py logs a lot of debug messages.
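
If the debug output is the only concern, it can probably be silenced from our side. A minimal sketch, assuming the package emits its messages through the standard `logging` module under a logger named `"filelock"` (worth verifying against the linked source before relying on it):

```python
import logging

# Assumption: the package logs via logging.getLogger("filelock").
# Raising the level drops its DEBUG (and INFO) messages while leaving
# every other logger untouched.
logging.getLogger("filelock").setLevel(logging.WARNING)
```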
Out of curiosity: What prompted this change? Were there any specific failure modes?
https://github.com/illinois-ceesd/mirgecom/pull/418 fixed the immediate issue we had seen in mirgecom, so this PR isn't a priority at the moment. However, I still think a 1-second timeout is a bit much. Maybe we should reduce it to 0.05s (like in the filelock package)?
I'm open to reducing the timeout.