On error callback
Register a callback that is called every time a task returns an error.
My current use case for this I want to log the errors as they happen but I don't want to necessarily Stop the pool. Example would be:
func errorCB(ctx context.Context, err error) {
if err != nil {
log.Print(err)
}
}
type handler struct {
pool *pool.ContextPool
}
func (h *handler) hello(w http.ResponseWriter, req *http.Request) {
h.pool.Go(func(ctx context.Context) error {
time.Sleep(time.Second)
return errors.New("oh no!")
})
}
func (h *handler) close() error {
return h.pool.Wait()
}
func main() {
h := &handler{
pool: pool.New().ContextPool().WithErrorCallback(errorCB),
}
defer func() {
if err := h.close(); err != nil {
log.Print(err)
}
}()
http.HandleFunc("/hello", h.hello)
http.ListenAndServe(":8090", nil)
}
Codecov Report
Attention: Patch coverage is 66.66667% with 4 lines in your changes missing coverage. Please review.
Project coverage is 98.37%. Comparing base (
daaaa39) to head (fcd160c). Report is 32 commits behind head on main.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| pool/context_pool.go | 0.00% | 4 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #89 +/- ##
==========================================
- Coverage 99.28% 98.37% -0.91%
==========================================
Files 12 12
Lines 419 431 +12
==========================================
+ Hits 416 424 +8
- Misses 3 7 +4
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Thinking out loud a bit here. Apologies if my comments below aren't clear -- I'm struggling to put words to my thoughts.
In general, I am a little concerned about this change because it encourages a pattern of long-lived pools. The problem with long-lived pools is that 1) errors are not propagated until the pool completes, and 2) panics are not propagated until the pool completes. This PR covers case 1, but it does not cover case 2.
In your example, you spawn a goroutine per request. However, the HTTP response writer is never actually used and the goroutine does not complete before the request completes, so the client will never see result or error. In practice, a HTTP server will parallelize requests, so a pool is not needed here. Additionally, handling requests should almost be done synchronously because you still need to write the response to the http.ResponseWriter when you complete the operation.
I know this probably sounds a bit pedantic, but I think adding streaming capabilities to pool might be a mistake because it suggests usage patterns that really aren't great. An ErrorPool handles a set of tasks where the result of all those tasks might be an error. Using the error early means that this is no longer true. Instead, I'd probably recommend using a Pool (rather than an ErrorPool) and handling the errors in a streaming manner. If you handling the errors as you go, the error return value isn't really useful anyways.
Excellent points, For the case of handling panics I think it wouldn't be a bad idea to add a PanicCallback as well and once the callback is invoked we can clear the recovered value from the Catcher and if no callback registered to propagate the panic as previously
func (p *Catcher) tryRecover() {
if val := recover(); val != nil {
rp := NewRecoveredPanic(1, val)
if p.onPanicCallback == nil {
p.recovered.CompareAndSwap(nil, &rp)
} else {
p.onPanicCallback(&rp)
}
}
}
In my example I've kept code short for the sake of brevity, think of the goroutine invocation within the HTTP request would just initiate a background process which shouldn't interrupt other invocations of the same task in case of error but the error should be dealt with in some way (reported/logged, etc.).
Could you please elaborate a bit more on "I think adding streaming capabilities to pool might be a mistake", since I believe the Error/Panic Callbacks would/should not really promote any streaming patterns (and why streaming patterns are bad in this case?)?
Moreover, I have been thinking what if we extend these callbacks to Pre/Post hooks which would be respectively invoked before and after the each task. Let me know what you think about it?
@camdencheek chiming in here on behalf of Cloud - we recently had a similar use case, where specifically we needed to do error handling after the pool becomes aware of the error (adds it to the underling errPool) - we were also tempted by the ability to have pool jobs error to add streaming handling of errors, and ended up getting bitten by the gotcha that cancellation will not happen until we actually return it. I pondered an error-callback addition for that use case briefly, but we eventually switched to using a stream instead, and building our own error handling - this worked much better because stream offers a more explicit hook for handling the results of each job.
[...] I think adding streaming capabilities to pool might be a mistake because it suggests usage patterns that really aren't great. An ErrorPool handles a set of tasks where the result of all those tasks might be an error. Using the error early means that this is no longer true. Instead, I'd probably recommend using a Pool (rather than an ErrorPool) and handling the errors in a streaming manner. If you handling the errors as you go, the error return value isn't really useful anyways.
+1 on this - IMO if you want to expose "job lifecycle" concepts, it probably should live in stream instead. This also takes care of issues like "does my callback need to be concurrency-safe" (for stream, no, but we could add new types of streams that do everything concurrently), and keeps pool as a purely "I have a group of work, do it all together and then give me aggregated results" model
(aside - now that I think of it, I think I like the old naming of lib/group better now, since it more directly implies a "bounded group of work" whereas pools in Go are often long-lived resource pools, in the context of which the attempted usage in https://github.com/sourcegraph/conc/pull/89#issuecomment-1426828622 kind of makes sense, even though the design of pool is not for this kind of long-lived-pool-of-resources use case)
since I believe the Error/Panic Callbacks would/should not really promote any streaming patterns (and why streaming patterns are bad in this case?)?
I think a callback that happens as soon as something happens is what Camden means by "streaming", and as I note in my previous comment I think the intended model is the pool package lets you do a bounded set of work and collect the results, whereas the stream package lets you execute on those results as they come in (and this is where we might want to add extended lifecycle/streaming hooks, instead of having it in pool types)