conc On error callback

Register a callback that is called every time a task returns an error.

Feb 10 '23 14:02 samix73

My current use case for this I want to log the errors as they happen but I don't want to necessarily Stop the pool. Example would be:

func errorCB(ctx context.Context, err error) {
	if err != nil {
		log.Print(err)
	}
}

type handler struct {
	pool *pool.ContextPool
}

func (h *handler) hello(w http.ResponseWriter, req *http.Request) {
	h.pool.Go(func(ctx context.Context) error {
		time.Sleep(time.Second)
		return errors.New("oh no!")
	})
}

func (h *handler) close() error {
	return h.pool.Wait()
}

func main() {
	h := &handler{
		pool: pool.New().ContextPool().WithErrorCallback(errorCB),
	}
	defer func() {
		if err := h.close(); err != nil {
			log.Print(err)
		}
	}()

	http.HandleFunc("/hello", h.hello)

	http.ListenAndServe(":8090", nil)
}

Feb 11 '23 17:02 samix73

Codecov Report

Attention: Patch coverage is 66.66667% with 4 lines in your changes missing coverage. Please review.

Project coverage is 98.37%. Comparing base (daaaa39) to head (fcd160c). Report is 32 commits behind head on main.

Files with missing lines	Patch %	Lines
pool/context_pool.go	0.00%	4 Missing :warning:

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #89      +/-   ##
==========================================
- Coverage   99.28%   98.37%   -0.91%     
==========================================
  Files          12       12              
  Lines         419      431      +12     
==========================================
+ Hits          416      424       +8     
- Misses          3        7       +4

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

Feb 12 '23 14:02 codecov-commenter

Thinking out loud a bit here. Apologies if my comments below aren't clear -- I'm struggling to put words to my thoughts.

In general, I am a little concerned about this change because it encourages a pattern of long-lived pools. The problem with long-lived pools is that 1) errors are not propagated until the pool completes, and 2) panics are not propagated until the pool completes. This PR covers case 1, but it does not cover case 2.

In your example, you spawn a goroutine per request. However, the HTTP response writer is never actually used and the goroutine does not complete before the request completes, so the client will never see result or error. In practice, a HTTP server will parallelize requests, so a pool is not needed here. Additionally, handling requests should almost be done synchronously because you still need to write the response to the http.ResponseWriter when you complete the operation.

I know this probably sounds a bit pedantic, but I think adding streaming capabilities to pool might be a mistake because it suggests usage patterns that really aren't great. An ErrorPool handles a set of tasks where the result of all those tasks might be an error. Using the error early means that this is no longer true. Instead, I'd probably recommend using a Pool (rather than an ErrorPool) and handling the errors in a streaming manner. If you handling the errors as you go, the error return value isn't really useful anyways.

Feb 12 '23 18:02 camdencheek

Excellent points, For the case of handling panics I think it wouldn't be a bad idea to add a PanicCallback as well and once the callback is invoked we can clear the recovered value from the Catcher and if no callback registered to propagate the panic as previously

func (p *Catcher) tryRecover() {
	if val := recover(); val != nil {
		rp := NewRecoveredPanic(1, val)
		if p.onPanicCallback == nil {
			p.recovered.CompareAndSwap(nil, &rp)
		} else {
		 	p.onPanicCallback(&rp)
		}
	}
}

In my example I've kept code short for the sake of brevity, think of the goroutine invocation within the HTTP request would just initiate a background process which shouldn't interrupt other invocations of the same task in case of error but the error should be dealt with in some way (reported/logged, etc.).

Could you please elaborate a bit more on "I think adding streaming capabilities to pool might be a mistake", since I believe the Error/Panic Callbacks would/should not really promote any streaming patterns (and why streaming patterns are bad in this case?)?

Moreover, I have been thinking what if we extend these callbacks to Pre/Post hooks which would be respectively invoked before and after the each task. Let me know what you think about it?

Feb 13 '23 17:02 samix73

@camdencheek chiming in here on behalf of Cloud - we recently had a similar use case, where specifically we needed to do error handling after the pool becomes aware of the error (adds it to the underling errPool) - we were also tempted by the ability to have pool jobs error to add streaming handling of errors, and ended up getting bitten by the gotcha that cancellation will not happen until we actually return it. I pondered an error-callback addition for that use case briefly, but we eventually switched to using a stream instead, and building our own error handling - this worked much better because stream offers a more explicit hook for handling the results of each job.

[...] I think adding streaming capabilities to pool might be a mistake because it suggests usage patterns that really aren't great. An ErrorPool handles a set of tasks where the result of all those tasks might be an error. Using the error early means that this is no longer true. Instead, I'd probably recommend using a Pool (rather than an ErrorPool) and handling the errors in a streaming manner. If you handling the errors as you go, the error return value isn't really useful anyways.

+1 on this - IMO if you want to expose "job lifecycle" concepts, it probably should live in stream instead. This also takes care of issues like "does my callback need to be concurrency-safe" (for stream, no, but we could add new types of streams that do everything concurrently), and keeps pool as a purely "I have a group of work, do it all together and then give me aggregated results" model

(aside - now that I think of it, I think I like the old naming of lib/group better now, since it more directly implies a "bounded group of work" whereas pools in Go are often long-lived resource pools, in the context of which the attempted usage in https://github.com/sourcegraph/conc/pull/89#issuecomment-1426828622 kind of makes sense, even though the design of pool is not for this kind of long-lived-pool-of-resources use case)

Feb 13 '23 18:02 bobheadxi

since I believe the Error/Panic Callbacks would/should not really promote any streaming patterns (and why streaming patterns are bad in this case?)?

I think a callback that happens as soon as something happens is what Camden means by "streaming", and as I note in my previous comment I think the intended model is the pool package lets you do a bounded set of work and collect the results, whereas the stream package lets you execute on those results as they come in (and this is where we might want to add extended lifecycle/streaming hooks, instead of having it in pool types)

Feb 13 '23 18:02 bobheadxi