
Windows process support

Open · benstroud opened this issue 8 years ago · 6 comments

Part of the attractiveness of SWF is the ability to deploy workflows anywhere. Support for Windows would be useful. In general, I see no reason that simpleflow couldn't be deployed to Windows hosts, but I encountered the following when attempting to run simpleflow workflow/decider long polling on a Windows system via the simpleflow decider.start and simpleflow worker.start command line invocations:

signal.SIGCHLD is not currently supported on Windows.

I looked at Cygwin as a workaround (where I believe SIGCHLD is supported). However, the psutil dependency does not currently work under Cygwin (https://github.com/giampaolo/psutil/issues/82).

Commenting out the signal.SIGCHLD usage in simpleflow.process.supervisor.py results in additional Windows-specific issues with the multiprocessing module (pickling differences described here: https://docs.python.org/2/library/multiprocessing.html#windows).
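For context, a minimal sketch (not simpleflow's actual code; register_child_handler is a hypothetical name) of what a platform guard around the SIGCHLD registration could look like:

```python
# Hypothetical sketch: only register the SIGCHLD handler where the signal
# exists, so the supervisor can at least be imported and started on Windows.
import signal


def register_child_handler(handler):
    if hasattr(signal, "SIGCHLD"):  # POSIX only
        signal.signal(signal.SIGCHLD, handler)
    # On Windows, some other child-monitoring strategy would be needed
    # (e.g. the polling approach discussed later in this thread).
```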

benstroud · Mar 20 '17 18:03

Yes, simpleflow would be nice to have on non-Posix platforms... Using Cygwin may be a solution when psutil supports it; however, it's quite heavy itself. I think I'd try to abstract Supervisor (and random dependencies, such as simpleflow.process.named_mixin.NamedMixin#set_process_name) away from the current Posix-based implementation instead; what's your opinion?
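To illustrate the kind of abstraction meant here (purely hypothetical names, not simpleflow's API), the supervisor could be split into an interface plus platform-specific implementations:

```python
# Hypothetical sketch: a small interface the rest of simpleflow could depend
# on, with the current fork()/SIGCHLD code behind a POSIX implementation and
# a multiprocessing- or polling-based one behind a portable implementation.
import abc


class ProcessManager(abc.ABC):
    @abc.abstractmethod
    def spawn(self, target, *args):
        """Start a child running target(*args) and return a handle."""

    @abc.abstractmethod
    def reap(self):
        """Detect and clean up children that have exited."""

    @abc.abstractmethod
    def set_process_name(self, name):
        """Rename the current process where the platform supports it (no-op otherwise)."""
```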

ybastide · Mar 20 '17 20:03

I'm just starting to dig in, but perhaps instead of detecting lost child processes via the SIGCHLD signal there could be periodic polling (in a thread) of the number of children using psutil? There would be some potential delay in detecting a lost child, but it would be more portable. Since the commands are designed to be long running, I think the tradeoff would be acceptable.
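Roughly, that polling idea could look like this (hypothetical sketch, not simpleflow code):

```python
# Hypothetical sketch: a background thread periodically compares the number
# of live children against the expected count and reports the difference,
# instead of relying on SIGCHLD. psutil is already a simpleflow dependency.
import threading
import time

import psutil


def watch_children(expected_count, on_children_lost, interval=5.0):
    parent = psutil.Process()  # the supervisor process itself

    def loop():
        while True:
            alive = len(parent.children())
            if alive < expected_count:
                on_children_lost(expected_count - alive)
            time.sleep(interval)

    watcher = threading.Thread(target=loop)
    watcher.daemon = True  # don't block interpreter exit
    watcher.start()
    return watcher
```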

benstroud · Mar 20 '17 20:03

Yes, that's a possibility I guess.

Or maybe we could go higher-level and use multiprocessing.Pool and such (waving hands)?

I mean, our needs should be covered by the stdlib already or nearly so :slightly_smiling_face:
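For illustration, a first level built on multiprocessing.Pool might look roughly like this (hypothetical sketch, names are illustrative):

```python
# Hypothetical sketch: let multiprocessing.Pool own the worker processes
# instead of a hand-rolled fork()/SIGCHLD supervisor; the stdlib then handles
# spawning and the platform differences.
import multiprocessing


def poll_forever(worker_id):
    # Placeholder for one decider/activity-worker long-polling loop.
    print("worker %d polling..." % worker_id)


if __name__ == "__main__":
    nb_processes = 4
    pool = multiprocessing.Pool(processes=nb_processes)
    try:
        # Each worker runs its (normally infinite) polling loop.
        pool.map(poll_forever, range(nb_processes))
    finally:
        pool.close()
        pool.join()
```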

ybastide · Mar 20 '17 21:03

Agreed.

From one perspective, simpleflow may be taking on too much responsibility by managing multiple worker processes itself. It may be preferable for simpleflow to provide a simpler single-process command line entry point for long polling, and let the user decide how best to spawn multiple instances and/or supervise children.

Perhaps a simple, short-term fix would be to skip multiprocessing completely if --nb-processes=1. Then, on Windows, a user could pass --nb-processes=1 and separately solve how to launch and manage several worker invocations.
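A rough sketch of that bypass (hypothetical, not simpleflow code; the multi-process branch stands in for the existing POSIX path):

```python
# Hypothetical sketch: with --nb-processes=1, run the polling loop in the
# current process and never touch the POSIX-only supervisor, so the command
# can work on Windows; an external tool (supervisord, a Windows service, ...)
# can then manage multiple invocations.
def start(poll_loop, nb_processes=1):
    if nb_processes == 1:
        # Inline: no fork()/SIGCHLD, works on Windows.
        poll_loop()
    else:
        # The existing POSIX supervisor path would go here.
        raise NotImplementedError("multi-process mode needs the POSIX supervisor")
```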

benstroud · Mar 20 '17 21:03

I agree with what has been said, and I guess we could make simpleflow a lot simpler by removing the first level of multiprocessing and letting people use whatever they want to spawn multiple instances.

I rewrote this part a few months ago so we could remove the internal code that used to do that, and it has been a pain. I had a very hard time testing all of this correctly (as of today we have time-dependent integration tests that run correctly on my machine but not on @ybastide's, I think).

A simple supervisord-based setup would probably do a better job, except maybe for memory consumption (?) and the automatic choice of the number of processes (= number of processors if not set; but I don't think we really need this).

I'd also be fine with a simpler implementation with multiprocessing.Pool for the first level of multiprocessing.

As for the second fork level (poller -> real worker process), it's necessary for now because the heartbeater relies on it. But it may not stay like that forever (#239), and some use cases don't need a heartbeater anyway.

@benstroud: do you want to try implementing an option for disabling multiprocessing? Or use multiprocessing.Pool optionally as proposed by @ybastide?

jbbarth · Mar 21 '17 15:03

I started working towards a PR last night with some minor success. I've managed to get the supervisor multiprocessing seemingly working on Windows (tested on Python 2), with the exception of keeping the number of child processes constant. (I can probably guarantee the children with some more effort.)

There were some hiccups in dealing with Windows' lack of fork() and the related pickling ramifications. Since Windows doesn't fork(), the module is re-imported in each child and the need for pickling is compounded.

I had to remove the @immutable class decorators, where used, to get reliable pickling. Do you know how essential forcing immutable object state is to the overall design? I haven't looked closely, but it seemed somewhat un-Pythonic. I'm unsure what I may have broken by removing these decorators to allow pickling by multiprocessing.
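For readers less familiar with this, a small standalone illustration (not simpleflow code) of the Windows constraints mentioned above:

```python
# Hypothetical illustration: on Windows, multiprocessing spawns a fresh
# interpreter instead of forking, so this module is re-imported in each
# child, module-level code must be guarded by __main__, and the target plus
# its arguments must pickle cleanly. Class decorators that rewrite a class
# (like the @immutable ones mentioned above) can break that pickling.
import multiprocessing


class Task(object):
    def __init__(self, name):
        self.name = name

    def run(self):
        print("running", self.name)


def run_task(task):
    task.run()


if __name__ == "__main__":  # required on Windows to avoid spawning recursively
    p = multiprocessing.Process(target=run_task, args=(Task("example"),))
    p.start()
    p.join()
```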

benstroud · Mar 21 '17 15:03