[Docker] PID 1 - Zombie reaping bug
There's a known bug with child process reaping in Docker containers. Here's an article that explains in some details what this issue is about:
http://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/
My question is if it would be worth to integrate a small init script in C or Python to fix this bug on the Docker images of this repo. ?
UPDATE: Here's the author's example init script in python used for his Docker base-image project:
https://github.com/phusion/baseimage-docker/blob/rel-0.9.16/image/bin/my_init
I think that article is at least partially wrong when it comes to node because node does reap child processes, at least the ones started through the child_process APIs.
The one instance where I can see it going awry is when grandchildren survive their parents, because then libuv unregisters the SIGCHLD handler. That's relatively straightforward to fix, though.
Would that make it more of a libuv issue rather than a Node.js / IO.js issue ?
In that case, I could maybe post it there ?
Both, probably. io.js should install a SIGCHLD handler and libuv needs to save and restore it. How about we start with libuv? :-)
Great! I'll cross-post the issue there and look at the code this evening.
@bnoordhuis I'm the author of the article. I never disputed that Node.js reaps its own processes, nor did I dispute that all other programs reap their own processes. What the article states is that programs generally do not reap processes that they adopt indirectly through child processes, because outside of Docker it generally does not happen. And this is fine: normal apps should be able to count on an init process that takes care of this, instead of worrying about whether they are themselves the init process.
This is not a bug in Node.js/IO.js or in libuv, and in my opinion you shouldn't implement any solutions for this issue inside Node.js/IO.js/libuv. Because if you do, you will run into problems with libraries that spawn processes. For example, suppose that the developer uses some C library that spawns a process and then waitpid()s on it a few seconds later. If Node.js/IO.js/libuv has a general SIGCHLD handler that reaps all child processes, then that C library will be unable to waitpid() on its own child to collect the exit status.
In my opinion you should just take our my_init script and be done with it.
Because if you do, you will run into problems with libraries that spawn processes. For example, suppose that the developer uses some C library that spawns a process and then waitpid()s on it a few seconds later. If Node.js/IO.js/libuv has a general SIGCHLD handler that reaps all child processes, then that C library will be unable to waitpid() on its own child to collect the exit status.
@FooBarWidget I don't disagree with your point but it's already impossible for add-ons to do so (reliably anyway) due to the SIGCHLD handler that is installed when the child_process API is used. Signal dispositions are effectively owned by io.js and libuv.
In my opinion you should just take our my_init script and be done with it.
I don't disagree with that either (particularly if it means less work for me!), except that your scripts pulls in a python dependency when a basic init is < 50 lines of C. That seems rather unfortunate, especially for the minimal image.
The Python dependency pulls in maybe 20 MB of dependencies on disk, 5 MB in RAM? Sounds like a good price to me if it means you don't have to put more work in it. But I'll leave that to you to decide.
Isn't the behaviour of node and libuv largely irrelevant here? The containers in this repo are intended to be used with versions of node which by definition may not be working properly.
@FooBarWidget python is already a build requirement, so the storage overhead should be well under 20MB ;-)