Introduce a better interface instead of `process_cmdline`

Open giomasce opened this issue 12 years ago • 3 comments

ResourceService checks the state of launched services and servers by comparing the command lines of all running processes against the `process_cmdline` provided in the configuration. This is not a very nice interface, partly because it forces users to adapt to their own setup something that in theory is an internal feature of the system and that they shouldn't control directly.

I propose using the more classical solution of writing a pidfile when a process starts. RS can then check the processes indicated in these pidfiles.
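A minimal sketch of the pidfile idea, assuming a POSIX system. The helper names (`write_pidfile`, `read_pid`, `is_running`) are hypothetical and not part of CMS. Note that a stale pidfile left behind by a crash can point at an unrelated process that reused the PID, so a real implementation would need an extra check.

```python
import os


def write_pidfile(path):
    """Record the current process ID in a pidfile (hypothetical helper)."""
    with open(path, "w") as f:
        f.write(str(os.getpid()))


def read_pid(path):
    """Return the PID stored in the pidfile, or None if missing or invalid."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def is_running(path):
    """Tell whether the process named in the pidfile appears to be alive.

    POSIX only: signal 0 performs the existence and permission checks
    without actually delivering a signal.
    """
    pid = read_pid(path)
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process: the pidfile is stale
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

In this sketch a service would call `write_pidfile` at startup, and RS would call `is_running` on each configured pidfile path instead of scanning command lines.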

giomasce avatar Apr 10 '13 19:04 giomasce

I have a different proposal, that also involves how RS starts other services.

My idea is to have RS use multiprocessing to start the other services. This lets us pass it the module, class and function name of the code we want to execute directly, rather than starting a command. This should also give us a definitive fix for #79 (I'm sure I already wrote this proposal somewhere... but I can't find the place). The API provided by the module also looks good for monitoring the started processes, detecting when they fail, investigating why, etc. Yet, since I don't know the API provided by subprocess (the module we use now), I'm not sure whether it's better than what we already have.

In this scenario, my proposal for auto-detecting running services is to have RS try to acquire the socket the service will be listening on. If this operation fails, RS deduces that someone is already listening (with high probability, the correct service) and does nothing. If it succeeds, RS starts the service using multiprocessing and gives it the socket it just acquired to listen on.
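The detection step could be sketched roughly as follows. The names `try_acquire` and `ensure_service` are hypothetical, and `start` stands in for whatever RS would use to launch the service (for example a wrapper around `multiprocessing.Process`); handing the socket over to the child process is deliberately left out of this sketch.

```python
import socket


def try_acquire(host, port):
    """Return a bound, listening socket, or None if the address is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(5)
    except OSError:
        # Most likely someone (hopefully the right service) is already
        # listening on this address.
        sock.close()
        return None
    return sock


def ensure_service(host, port, start):
    """Start the service only if nobody is listening on (host, port).

    `start` is a hypothetical callable that receives the acquired socket
    and launches the service on it. Returns True if the service was started.
    """
    sock = try_acquire(host, port)
    if sock is None:
        return False
    start(sock)
    return True
```

A failed bind only shows that something holds the address, not that it is the expected service, which matches the "with high probability" caveat above.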

To my eyes this approach looks cleaner than both `process_cmdline` and PID files: it requires no configuration, it doesn't save anything on the filesystem (and so, again, needs no configuration for where to put these files), and it doesn't risk "locking out" the user after a crash (although I expect there are techniques to avoid that with PID files too).

Another advantage is that, as soon as a service dies, the socket is instantly available for RS to hand to the newly spawned process. At the moment, due to a bug (?) in how we handle sockets, it may take some time for the kernel to make the socket available again after a service dies, causing many subsequent restart attempts to fail instantly. @gcampax and I experienced this at our latest training camp, and perhaps he can give further details on what's happening.
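The thread does not identify the cause of the delay. One common cause of this symptom (an assumption here, not something confirmed above) is that connections from the dead service linger in the TCP `TIME_WAIT` state, which blocks a plain `bind` on the same address for a while. The usual remedy is to set `SO_REUSEADDR` before binding, as in this sketch; `make_listener` is a hypothetical name.

```python
import socket


def make_listener(host, port):
    """Create a listening socket that can rebind right after a restart."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind(); allows binding even while old
    # connections on this address are still in TIME_WAIT.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(5)
    return sock
```

With the socket-passing design this matters less, since RS would keep the listening socket itself and pass it to each new spawn.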

From what I've read online, this approach is rather common in the Java world.

lw avatar Apr 10 '13 19:04 lw

I don't dislike your proposal.

BTW, from what I understand, multiprocessing is better than threading because it creates a wholly new process (instead of a thread), which avoids the problems of sharing the Global Interpreter Lock. But I have to admit my ideas are not completely clear yet.

giomasce avatar Apr 10 '13 20:04 giomasce

https://github.com/cms-dev/cms/commit/2c720ae1136e911733e38a28889837ee98ccea16 removed the need for `process_cmdline`, but I still think the whole approach needs to be changed.

lw avatar Jan 02 '14 11:01 lw