
Guidance on avoiding race conditions when multiple jobs try to access and (re)generate the same scratch space

Open fingolfin opened this issue 4 years ago • 2 comments

Motivated by https://github.com/oscar-system/Polymake.jl/issues/381 :

A user may run @everywhere using PACKAGENAME on a package that uses scratch spaces. If the scratch space is missing or outdated, each of the parallel jobs may try to regenerate it at the same time, and the concurrent writes can leave it in an inconsistent state.

I think it would be good to warn about this situation in the Scratch.jl documentation, and perhaps also provide some guidance on how to deal with that.

fingolfin, Jan 07 '22 11:01

https://github.com/vtjnash/Pidfile.jl maybe?

fredrikekre, Jan 07 '22 11:01

Perhaps? The documentation of Pidfile.jl is a bit sparse... I assume one ought to use one of its API functions to create a lock file, then create the scratch space, and finally release the lock? An example would be useful. I also wonder how well it works on, e.g., NFS volumes (labs still use those for home directories), and whether it works on Windows.

fingolfin, Jan 11 '22 01:01
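
For illustration, the lock-then-generate-then-release sequence asked about above might look roughly like the following. This is only a sketch under several assumptions, not a documented recipe from Scratch.jl or Pidfile.jl. It assumes Julia 1.9 or newer, where `mkpidlock` is available from the `FileWatching.Pidfile` standard library (on older versions, `using Pidfile` should provide the same function). The helper `ensure_generated`, the `generate` callback, the `.complete` marker file, the `generate.pid` lock name, and the 600-second `stale_age` are all hypothetical choices made for this example. It does not settle the questions about NFS or Windows.

```julia
# Assumption: Julia >= 1.9; on older versions replace with `using Pidfile`.
using FileWatching.Pidfile: mkpidlock

# Hypothetical helper: run `generate(dir)` at most once, even if several
# processes call this concurrently on the same directory.
function ensure_generated(generate, dir::AbstractString; marker = ".complete")
    mkpath(dir)
    done = joinpath(dir, marker)
    # Fast path: a previous run already finished generating the contents.
    isfile(done) && return dir
    # The lock file lives inside `dir`, so `generate` must not delete it.
    # `stale_age` lets a lock left behind by a crashed process be taken over.
    mkpidlock(joinpath(dir, "generate.pid"); stale_age = 600) do
        # Re-check under the lock: another process may have finished
        # while this one was waiting.
        if !isfile(done)
            generate(dir)
            touch(done)  # written last, so it only exists after success
        end
    end
    return dir
end
```

Inside a package, the directory would presumably come from Scratch.jl, e.g. `ensure_generated(build!, @get_scratch!("data"))`, where `build!` and the key `"data"` are placeholders. Writing the marker file last means a job that dies halfway through leaves no marker, so the next job to take the lock regenerates the contents instead of trusting a partial result.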