Use pidfd to track processes on Linux >= 5.4
Background:
- https://lwn.net/Articles/789023/
- https://lwn.net/Articles/794707/
The problem with the Pids in ProcessHandles is that on many systems they are limited to 32768 (2^15, the kernel's default pid_max) values. If you spawn many short-lived processes quickly, the pid space can wrap around and you will accidentally waitForProcess (wait()) on the wrong process, or terminateProcess the wrong one.
Linux 5.4 solves this with pidfds: file descriptors that refer to one specific process rather than to a reusable pid number. Being file descriptors, they are per-process and 32-bit. A pidfd keeps referring to its process even once that process is a zombie, so it will never accidentally point to a different process.
The process library could use them on newer Linux by simply tracking the pidfd in a Maybe field inside ProcessHandle.
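As a rough illustration of the optional-field idea, here is a minimal Python sketch rather than Haskell. The `Handle` class and the `open_handle` and `terminate` helpers are hypothetical names invented for this example, not part of the process library. `None` plays the role of `Nothing`, and the code assumes Python 3.9+ for `os.pidfd_open` and `signal.pidfd_send_signal`.

```python
import os
import signal
from dataclasses import dataclass
from typing import Optional


@dataclass
class Handle:
    """Hypothetical stand-in for ProcessHandle: a pid plus an optional pidfd."""
    pid: int
    pidfd: Optional[int]  # None where pidfds are unavailable


def open_handle(pid: int) -> Handle:
    """Attach a pidfd when the platform supports it, else keep only the pid."""
    try:
        return Handle(pid, os.pidfd_open(pid))
    except (AttributeError, OSError):
        # AttributeError: Python < 3.9 or not Linux.
        # OSError: e.g. ENOSYS on a kernel without pidfd_open().
        return Handle(pid, None)


def terminate(handle: Handle) -> None:
    """Send SIGTERM via the pidfd if present, so a recycled pid is never hit."""
    if handle.pidfd is not None:
        signal.pidfd_send_signal(handle.pidfd, signal.SIGTERM)
    else:
        os.kill(handle.pid, signal.SIGTERM)
```

On older kernels the handle degrades to the existing pid-based behaviour, which is what makes this an incremental migration path.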
- After being spawned, a pid can be converted to a pidfd using pidfd_open(). This is still slightly racy, and it is better to get the pidfd atomically, directly from clone(). But it's an easy migration path that's a strict improvement already.
- pidfds can be waited on with select(), epoll() and so on, which means we can use the GHC IO manager to wait for them more efficiently than with the usual waitForProcess.
This Rust library https://github.com/pop-os/pidfd shows how you can wait for a program to finish using waitid().
Linux 5.10 added nonblocking pidfds (the PIDFD_NONBLOCK flag of pidfd_open()), which makes them much easier to use without additional threads and probably allows direct integration into the IO manager:
https://news.ycombinator.com/item?id=25413266
glibc 2.36 now contains wrappers for the pidfd_* functions: https://www.phoronix.com/news/GNU-C-Library-Glibc-2.36