TaskEx: parallelLimit

Open bartelink opened this issue 3 years ago • 0 comments

Replaces https://github.com/fsprojects/FSharp.Control.TaskSeq/issues/129. TaskEx top level issue: https://github.com/fsprojects/FSharp.Control.TaskSeq/issues/139

Async.Parallel's optional degree of parallelism parameter was added late in the game, but it is critical: dumping an arbitrary, unbounded number of work items onto the thread pool should not be easy, or the default thing to do, without due consideration for how that will behave under stress.

There are some other shortcomings, which frequently lead to various bespoke helpers proliferating:

  • pipelining is painful, necessitating a lambda with an explicit argument name (e.g. fun computations -> Async.Parallel(computations, maxDegreeOfParallelism=dop)); note this is not the case for Async.Sequential
  • before FSharp.Core v6.0.6, [there was a stack overflow bug that could tear down the process](https://github.com/dotnet/fsharp/issues/13165) if >1200 items were started with a throttle and cancellation was triggered quickly (so having a layer between Async.Parallel and direct consumption within an app might be useful)
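
To make the pipelining point concrete, here is a minimal sketch contrasting the two calls. The `work` list and the throttle value of 2 are illustrative assumptions, not part of the proposal.

```fsharp
// Hypothetical work items, for illustration only.
let work = [ for i in 1 .. 5 -> async { return i * i } ]

// Async.Sequential takes a single argument, so it pipes directly.
let sequentialResults =
    work
    |> Async.Sequential
    |> Async.RunSynchronously

// Async.Parallel with a throttle needs a lambda so that the optional
// argument can be supplied by name.
let throttledResults =
    work
    |> (fun computations -> Async.Parallel(computations, maxDegreeOfParallelism = 2))
    |> Async.RunSynchronously

printfn "%A %A" sequentialResults throttledResults
```

Both calls return results in input order; the only difference the sketch is meant to show is the extra lambda.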

Current proposed APIs (will be updated inline based on any discussion below):

module Async =
    let parallelLimit maxDegreeOfParallelism computations =
        Async.Parallel(computations, maxDegreeOfParallelism = maxDegreeOfParallelism)
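
A usage sketch of the proposed helper follows, showing that it pipes without a lambda. The `work` function, the counters, and the limit of 2 are hypothetical and exist only to make the throttle observable.

```fsharp
module Async =
    // The helper as proposed: the sequence comes last so it can be piped in.
    let parallelLimit maxDegreeOfParallelism computations =
        Async.Parallel(computations, maxDegreeOfParallelism = maxDegreeOfParallelism)

let gate = obj ()
let inFlight = ref 0
let peak = ref 0

// Hypothetical work item: records how many items are running at once,
// then doubles its input.
let work i = async {
    lock gate (fun () ->
        inFlight.Value <- inFlight.Value + 1
        peak.Value <- max peak.Value inFlight.Value)
    do! Async.Sleep 50
    lock gate (fun () -> inFlight.Value <- inFlight.Value - 1)
    return i * 2 }

let results =
    [ for i in 1 .. 10 -> work i ]
    |> Async.parallelLimit 2
    |> Async.RunSynchronously

// With a limit of 2, the recorded peak should not exceed 2.
printfn "results = %A, peak concurrency = %d" results peak.Value
```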

NOTES:

  • the naming aligns with that used in Node https://www.npmjs.com/package/run-parallel-limit
  • A common case is to use this to run multiple Async<unit> computations while still awaiting (and surfacing) any failure. Having to use |> Async.Ignore<unit[]> is ugly for that (and most people probably do |> Async.Ignore, which prevents the compiler from helping you if your computations start to return values where they previously returned unit)
  • How do you swap back and forth between that and Task, considering cancellation tokens (see #142) and unwrapping AggregateExceptions (see #141)? An equivalent of this that works well with task expressions should likely be prototyped alongside any permanent API for this. Example impl. Also, perhaps a Task.sequential might make sense
  • having Throttled in the name is pretty well established in multiple internal library suites, and in posts such as https://www.compositional-it.com/news-blog/improved-asynchronous-support-in-f-4-7
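
The note above about Async.Ignore<unit[]> versus a bare Async.Ignore can be sketched as follows. The `unitWork` and `valueWork` lists are hypothetical; the sketch only illustrates how pinning the type argument lets the compiler catch computations that start returning values.

```fsharp
// Hypothetical unit-returning work items.
let unitWork = [ for _ in 1 .. 3 -> async { do! Async.Sleep 10 } ]

// Pinning the type argument: this compiles only while every
// computation returns unit.
let strict : Async<unit> =
    Async.Parallel(unitWork, maxDegreeOfParallelism = 2)
    |> Async.Ignore<unit[]>

// Hypothetical work items that now return values.
let valueWork = [ for i in 1 .. 3 -> async { return i } ]

// Unconstrained Async.Ignore silently discards results of any type.
let lenient : Async<unit> =
    Async.Parallel(valueWork, maxDegreeOfParallelism = 2)
    |> Async.Ignore

// The following line would be rejected by the compiler, which is the point:
// Async.Parallel(valueWork, maxDegreeOfParallelism = 2) |> Async.Ignore<unit[]>

Async.RunSynchronously strict
Async.RunSynchronously lenient
```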

bartelink · Jan 05 '23 12:01