process Use pidfd to track processes on Linux >= 5.4

Background:

https://lwn.net/Articles/789023/
https://lwn.net/Articles/794707/

The problem with Pids in ProcessHandles is that on most systems they are limited to a small range by default (the Linux default for kernel.pid_max is 2^15 = 32768). If you spawn many short-lived processes quickly, pid numbers can wrap around and be reused, and you will accidentally waitForProcess (wait()) on the wrong process, or terminateProcess the wrong one.

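Linux 5.4 solves this with pidfds: file descriptors that refer to one specific process. Like any file descriptor, they live in the fd table of the process holding them, and fd numbers are 32-bit ints. A pidfd keeps referring to the same process even after that process has become a zombie, so it will never accidentally point to a different process.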

The process library could use them on newer Linux by simply tracking the pidfd in a Maybe field inside ProcessHandle.

After being spawned, a pid can be converted to a pidfd using pidfd_open(). This is still slightly racy; it is better to get the pidfd atomically, directly from clone() (via the CLONE_PIDFD flag). But pidfd_open() is an easy migration path that is already a strict improvement.

pidfds can be waited on with select(), poll(), epoll and so on, which means we can use the GHC IO manager to wait for them more efficiently than with the usual waitForProcess.
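
A minimal sketch of that migration path, written in Python only because its os module exposes pidfd_open() directly (Python >= 3.9, Linux >= 5.3); it is not the process library's API. TrackedProcess, track and has_exited are hypothetical names, and the Optional field plays the role of the Maybe field suggested above.

```python
import os
import select
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrackedProcess:
    """Hypothetical stand-in for ProcessHandle: a pid plus an optional pidfd."""
    pid: int
    pidfd: Optional[int]  # None where pidfd_open() is unavailable


def track(pid: int) -> TrackedProcess:
    """Convert a pid to a pidfd after spawning; fall back to the bare pid."""
    try:
        # Still slightly racy: if the child has already been reaped,
        # the pid may have been recycled before this call.
        return TrackedProcess(pid, os.pidfd_open(pid))
    except (AttributeError, OSError):
        # Older Python, a kernel without pidfd_open(), or a sandbox blocking it.
        return TrackedProcess(pid, None)


def has_exited(proc: TrackedProcess, timeout_s: float) -> Optional[bool]:
    """Poll the pidfd; it becomes readable once the process has terminated.

    Returns None when no pidfd is available, so the caller can fall back
    to the classic wait()-based path.
    """
    if proc.pidfd is None:
        return None
    poller = select.poll()
    poller.register(proc.pidfd, select.POLLIN)
    return bool(poller.poll(int(timeout_s * 1000)))
```

The same readiness notification is what an event loop or IO manager would register for, instead of blocking a thread in wait().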

This Rust library https://github.com/pop-os/pidfd shows how you can wait for a program to finish using waitid().
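
For comparison, the same idea as a rough Python sketch (not taken from that Rust library): waitid() with P_PIDFD reaps the process behind a pidfd. It assumes Python >= 3.9 and Linux >= 5.4; spawn_with_pidfd and wait_via_pidfd are hypothetical helper names, and returning a negative signal number is just a convention chosen here.

```python
import os
from typing import List, Tuple


def spawn_with_pidfd(argv: List[str]) -> Tuple[int, int]:
    """Spawn argv[0] (an absolute path) and open a pidfd for the child."""
    pid = os.posix_spawn(argv[0], argv, os.environ)
    return pid, os.pidfd_open(pid)


def wait_via_pidfd(pidfd: int) -> int:
    """Block until the process behind pidfd exits, reap it, return its status.

    Returns the exit code for a normal exit, or the negated signal number
    if the process was killed by a signal.
    """
    result = os.waitid(os.P_PIDFD, pidfd, os.WEXITED)
    if result.si_code == os.CLD_EXITED:
        return result.si_status
    return -result.si_status
```

Usage would look like `pid, fd = spawn_with_pidfd(["/bin/true"])` followed by `wait_via_pidfd(fd)` and `os.close(fd)`.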

Aug 12 '20 03:08 nh2

Linux 5.10 added nonblocking pidfds (the PIDFD_NONBLOCK flag to pidfd_open()), which makes them much easier to use without additional threads and probably enables direct integration into the IO manager:

https://news.ycombinator.com/item?id=25413266
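
A small Python sketch of what the nonblocking mode changes, assuming Linux >= 5.10 and Python >= 3.9: waitid() on a nonblocking pidfd fails with EAGAIN instead of blocking while the process is still running. The helper names are hypothetical. O_NONBLOCK is passed because PIDFD_NONBLOCK is defined to the same value and os.PIDFD_NONBLOCK only exists in newer Python versions.

```python
import os
from typing import Optional


def open_nonblocking_pidfd(pid: int) -> int:
    """Open a pidfd in nonblocking mode (PIDFD_NONBLOCK == O_NONBLOCK)."""
    return os.pidfd_open(pid, os.O_NONBLOCK)


def try_reap(pidfd: int) -> Optional[int]:
    """Reap the process if it has exited; never block.

    Returns the raw si_status (exit code or signal number), or None if
    the process is still running (the kernel reports EAGAIN).
    """
    try:
        return os.waitid(os.P_PIDFD, pidfd, os.WEXITED).si_status
    except BlockingIOError:
        return None
```

An event loop can call try_reap() whenever the pidfd is reported readable, so no dedicated waiter thread is needed.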

Dec 15 '20 13:12 nh2

glibc 2.36 now contains wrappers for the pidfd_* functions: https://www.phoronix.com/news/GNU-C-Library-Glibc-2.36

Aug 02 '22 13:08 nh2