streamz

Consider supervision in streams

Open pragdave opened this issue 11 years ago • 4 comments

Given that you have no control over what the request process does in terms of spawning other processes, would you be safer creating a supervisor and a process per request, limiting the damage if something does go wrong?

pragdave, Jul 22 '14 02:07

This ties in with https://github.com/hamiltop/streamz/pull/7

Supervision vs just linking is an area that is a little gray for me.

I think specific examples will make this clearer. So in the case of a task stream, what would be the benefit?

Are there other specific examples you can point out?

hamiltop, Jul 22 '14 04:07

One answer came via the mailing list. Supervisors allow clean shutdown. In the case of task streams, a supervisor would allow us to cancel all tasks in a stream once we've enumerated. This might be especially useful in first_completed_of, provided there are no side effects.

How would a supervisor make the stream more resilient to failure? first_completed_of works as an example again. We may not need 100% success. We may only need one task to succeed. A supervisor would allow us to be flexible about that.
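To make the first_completed_of case concrete, here is a minimal sketch in plain Elixir. It is not streamz's implementation: the module name `FirstCompletedSketch`, the 5-second timeout, and the message shape are assumptions made for illustration. It starts each function under a `Task.Supervisor`, takes the first result that arrives, and then stops the supervisor so the remaining tasks are shut down.

```elixir
defmodule FirstCompletedSketch do
  # Hypothetical helper for illustration only, not the streamz API.
  def first_completed_of(funs) do
    {:ok, sup} = Task.Supervisor.start_link()
    parent = self()
    ref = make_ref()

    # Each task reports its result to the caller. Tasks started this way are
    # temporary, so a crashing task is not restarted and does not affect the
    # other tasks.
    Enum.each(funs, fn fun ->
      Task.Supervisor.start_child(sup, fn -> send(parent, {ref, fun.()}) end)
    end)

    result =
      receive do
        {^ref, value} -> {:ok, value}
      after
        5_000 -> :timeout
      end

    # Stopping the supervisor shuts down every task that is still running.
    Supervisor.stop(sup)
    result
  end
end

result =
  FirstCompletedSketch.first_completed_of([
    fn -> raise "boom" end,
    fn ->
      Process.sleep(200)
      :slow
    end,
    fn ->
      Process.sleep(20)
      :fast
    end
  ])

IO.inspect(result)
```

Only one task has to succeed here: the raising task is logged and ignored, and the slow task is terminated when the supervisor stops. A task that finishes just before shutdown may still leave a stray message in the caller's mailbox, which this sketch does not clean up.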

This also makes me wonder about the existing Stream.filter. If a filter throws an exception, does it crash the process? A case could be made that unsuccessful execution of the filter function should be considered falsy and those elements not included in the resulting stream.
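As a rough illustration of the "failure is falsy" idea: `Stream.filter` runs the predicate in whichever process enumerates the stream, so an exception propagates to that process. The wrapper below is a sketch only; `SafeFilterSketch.safe_filter/2` is a hypothetical name, not part of `Stream` or streamz, and it rescues exceptions but not throws or exits.

```elixir
defmodule SafeFilterSketch do
  # Hypothetical helper: treats a raising predicate as a falsy result.
  def safe_filter(enum, fun) do
    Stream.filter(enum, fn element ->
      try do
        fun.(element)
      rescue
        _exception -> false
      end
    end)
  end
end

kept =
  [1, 0, 2]
  |> SafeFilterSketch.safe_filter(fn x -> div(10, x) > 3 end)
  |> Enum.to_list()

# div(10, 0) raises ArithmeticError, so 0 is dropped instead of crashing.
IO.inspect(kept)
# => [1, 2]
```

Whether silently dropping such elements is desirable is a design choice; it can hide real bugs in the predicate.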

Right now I accomplish shutdown by linking everything to the collector process and then exiting. All the stream pids should therefore die. I'm not perfectly clear on how a supervisor would improve that.
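The link-based shutdown described above can be sketched in plain Elixir as follows. This is an illustration under assumptions, not the streamz code: the collector and worker processes here are stand-ins that do no real work.

```elixir
parent = self()

collector =
  spawn(fn ->
    # Stand-in stream processes, each linked to the collector.
    workers = for _ <- 1..3, do: spawn_link(fn -> Process.sleep(:infinity) end)
    send(parent, {:workers, workers})

    receive do
      # A non-normal exit reason propagates over the links.
      :stop -> exit(:shutdown)
    end
  end)

workers =
  receive do
    {:workers, pids} -> pids
  end

monitor = Process.monitor(collector)
send(collector, :stop)

receive do
  {:DOWN, ^monitor, :process, _pid, _reason} -> :ok
end

# Give the exit signals a moment to reach the linked workers.
Process.sleep(50)
all_dead? = not Enum.any?(workers, &Process.alive?/1)
IO.inspect(all_dead?)
```

One caveat with this approach: if the collector exits with reason `:normal`, linked processes that do not trap exits keep running. A supervisor terminates its children on shutdown regardless of the reason, which is one way it could improve on plain linking.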

There seem to be benefits though, without much drawback. I'll look at adding supervisors. More examples would still be helpful though, so if you can think of any good ones, let me know.

hamiltop, Jul 22 '14 11:07

I believe one_for_one supervision would allow for the scenario whereby a non-consumed process crashes and is restarted without affecting the consumed streams.

rps, Aug 01 '14 19:08

If we were to use a supervisor, I feel like most of our processes would be temporary and not restarted.

On Aug 1, 2014 12:13 PM, "Rich Parrish" wrote:

In a nested group_by, I'd anticipate that you'd want to ensure the order of the stream restarts using a rest_for_one strategy. Otherwise the restarted children may call on a crashed parent process.
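The restart ordering that `rest_for_one` gives can be seen in a small sketch. It does not model group_by itself: the two `Agent` children and their ids `:parent` and `:child` are assumptions standing in for an upstream and a downstream stream process.

```elixir
children = [
  # Started first: stands in for the upstream (parent) process.
  Supervisor.child_spec({Agent, fn -> :parent_state end}, id: :parent),
  # Started second: stands in for a downstream process that depends on it.
  Supervisor.child_spec({Agent, fn -> :child_state end}, id: :child)
]

{:ok, sup} = Supervisor.start_link(children, strategy: :rest_for_one)

pids = fn ->
  for {id, pid, _type, _modules} <- Supervisor.which_children(sup), into: %{} do
    {id, pid}
  end
end

before = pids.()

# Crash the parent. With :rest_for_one the supervisor also terminates and
# restarts every child that was started after it.
Process.exit(before.parent, :kill)
Process.sleep(100)

restarted = pids.()
IO.inspect(restarted.parent != before.parent)
IO.inspect(restarted.child != before.child)
```

With `strategy: :one_for_one` only the parent's pid would change, leaving the child holding a reference to a process that no longer exists. Both children here use the default `:permanent` restart; with `restart: :temporary` neither would be restarted at all.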


hamiltop, Aug 01 '14 19:08