zoekt
wip: start toying with wrapping requests with RED metrics
In addition to the metrics covered by RED (rate, errors, duration), the "Monitoring Distributed Systems" chapter of Google's SRE book recommends also capturing the duration of failed operations. Tracking that separately seems important:
- the duration of failed operations can have wildly different timings to successful ones, so we shouldn't conflate the two in an average
- failed operations that are slow can still cause a bad UX
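To make the first point concrete, here is a small sketch with made-up numbers (nothing here is measured from Zoekt, and `mean` and `exampleMeans` are hypothetical helpers): four fast successes and one failure that hits a 5s timeout.

```go
package main

import (
	"fmt"
	"time"
)

// mean returns the arithmetic mean of ds, or 0 for an empty slice.
func mean(ds []time.Duration) time.Duration {
	if len(ds) == 0 {
		return 0
	}
	var sum time.Duration
	for _, d := range ds {
		sum += d
	}
	return sum / time.Duration(len(ds))
}

// exampleMeans uses invented timings: successes are fast, the single
// failure is a slow timeout.
func exampleMeans() (okMean, failedMean, combinedMean time.Duration) {
	ok := []time.Duration{
		20 * time.Millisecond,
		30 * time.Millisecond,
		25 * time.Millisecond,
		25 * time.Millisecond,
	}
	failed := []time.Duration{5 * time.Second}

	// Copy before appending so ok's backing array is not reused.
	all := append(append([]time.Duration{}, ok...), failed...)
	return mean(ok), mean(failed), mean(all)
}

func main() {
	okMean, failedMean, combinedMean := exampleMeans()
	fmt.Println("successes:", okMean)
	fmt.Println("failures: ", failedMean)
	fmt.Println("combined: ", combinedMean)
}
```

With these numbers the successes average 25ms and the failure takes 5s, but the combined mean is 1.02s, which describes neither group.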
Zoekt seems to collect some info on the duration of failed requests, but it seems like it'd be nice to standardize that.
For now, this PR is toying with implementing the same pattern for bundling the RED metrics (+ the failed operation latency) that sourcegraph/sourcegraph's metrics package and InfluxDB use.
It's not attempting to tie tracing/logging into the same module like sourcegraph/sourcegraph's metrics package does - I thought it would be better to keep it simpler for now.
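As a rough illustration of the bundling idea (not the code in this PR, and not the API of sourcegraph/sourcegraph's metrics package or InfluxDB), here is a stdlib-only sketch. `REDMetrics`, `Observe`, `Wrap`, and `errExample` are hypothetical names, and a real version would presumably hold Prometheus counters and histograms instead of plain fields.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errExample is a placeholder error used only for this illustration.
var errExample = errors.New("example failure")

// REDMetrics is a hypothetical bundle of the RED metrics plus the
// duration of failed operations, tracked separately from successes.
type REDMetrics struct {
	mu              sync.Mutex
	Count           int           // rate: total operations observed
	Errors          int           // errors: operations that returned a non-nil error
	SuccessDuration time.Duration // total time spent in successful operations
	FailureDuration time.Duration // total time spent in failed operations
}

// Observe records one finished operation and its outcome.
func (m *REDMetrics) Observe(d time.Duration, err error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.Count++
	if err != nil {
		m.Errors++
		m.FailureDuration += d
		return
	}
	m.SuccessDuration += d
}

// Wrap times f, records the result, and returns f's error unchanged.
func (m *REDMetrics) Wrap(f func() error) error {
	start := time.Now()
	err := f()
	m.Observe(time.Since(start), err)
	return err
}

func main() {
	var m REDMetrics
	_ = m.Wrap(func() error { return nil })
	_ = m.Wrap(func() error { return errExample })
	fmt.Printf("count=%d errors=%d\n", m.Count, m.Errors)
}
```

Keeping the two durations in separate fields means a dashboard can show failure latency on its own instead of folding it into one average.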