wip: start toying with wrapping requests with RedF metrics

Open ggilmore opened this issue 4 years ago • 0 comments

In addition to the metrics covered by RED, the "Monitoring Distributed Systems" chapter of Google's SRE book recommends also capturing the duration of failed operations. Tracking that separately seems important:

- the duration of failed operations can differ wildly from that of successful ones, so we shouldn't conflate the two in an average
- failed operations that are slow can still cause a bad UX

Zoekt seems to collect some info on the duration of failed requests, but it seems like it'd be nice to standardize that.

For now, this PR is toying with implementing the same pattern for bundling the RED metrics (+ the failed operation latency) that sourcegraph/sourcegraph's metrics package and InfluxDB use.
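
To make the bundling idea concrete, here is a minimal stdlib-only sketch. It is not the code in this PR, and it is not the API of sourcegraph/sourcegraph's metrics package or InfluxDB. The names `REDF`, `Observe`, and `errBackend` are hypothetical, and a real version would presumably record durations in histograms rather than plain sums.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// REDF is a hypothetical bundle of the RED metrics (rate, errors, duration)
// plus a separate duration for failed operations.
type REDF struct {
	mu              sync.Mutex
	Requests        int           // total operations observed (rate)
	Errors          int           // operations that returned an error
	SuccessDuration time.Duration // summed duration of successful operations
	FailureDuration time.Duration // summed duration of failed operations
}

// Observe runs op, times it, and records the result under the matching
// outcome so that failed and successful timings are never mixed.
func (m *REDF) Observe(op func() error) error {
	start := time.Now()
	err := op()
	elapsed := time.Since(start)

	m.mu.Lock()
	defer m.mu.Unlock()
	m.Requests++
	if err != nil {
		m.Errors++
		m.FailureDuration += elapsed
	} else {
		m.SuccessDuration += elapsed
	}
	return err
}

// errBackend is a stand-in failure used only for this illustration.
var errBackend = errors.New("backend unavailable")

func main() {
	var m REDF

	// One fast success and one slow failure.
	_ = m.Observe(func() error { return nil })
	_ = m.Observe(func() error {
		time.Sleep(5 * time.Millisecond)
		return errBackend
	})

	fmt.Printf("requests=%d errors=%d success=%s failure=%s\n",
		m.Requests, m.Errors, m.SuccessDuration, m.FailureDuration)
}
```

Wrapping each request in a single `Observe` call keeps the four measurements together, which is the main appeal of the pattern: callers can't forget to record the failure latency.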

It doesn't attempt to tie tracing/logging into the same module the way sourcegraph/sourcegraph's metrics package does; I thought it'd be better to keep it simpler for now.

Dec 10 '21 04:12 ggilmore