java-grpc-prometheus

Pre-Register Server & Client Metrics

Open cameronbriar opened this issue 4 years ago • 0 comments

Following the guidance of Prometheus' best practices (https://prometheus.io/docs/practices/instrumentation/#avoid-missing-metrics), this library should pre-register its client/server interceptor metrics.

The Go version already does this successfully (see InitializeMetrics here: https://github.com/grpc-ecosystem/go-grpc-prometheus/blob/master/server_metrics.go#L132).

The problem caused by not pre-registering is illustrated by the following query:

grpc_server_handled_total{code="OK"}

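In this example, even if zero requests have been handled with code=OK, the query should return 0 as long as the server interceptor metrics are being scraped. Without pre-registration, the series does not exist until the first such request is handled, so the query returns nothing. Another example: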

sum by(instance)(rate(grpc_server_handled_total{code!~"OK"}[5m]))
/
sum by(instance)(rate(grpc_server_handled_total[5m]))

Here we're dividing "bad requests" by "total requests". If the service were 100% healthy, no series with a non-OK code would exist yet, so the numerator would be empty and the query would return no data rather than 0. With pre-registration, the result should be NaN if and only if there have been exactly 0 requests in total, since we'd still be dividing 0 by 0. In any other case, once at least one request has been handled, the result would be a real number regardless of which codes were returned.
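The arithmetic behind this ratio can be sketched in Java with plain doubles. This is an illustration only: `ErrorRatioSketch` and `errorRatio` are hypothetical names, and the sketch assumes both rates are already available as numbers, as they would be once the counters are pre-registered at 0.

```java
// Illustrative only: mirrors the division in the query above using plain doubles.
// ErrorRatioSketch and errorRatio are hypothetical names, not part of any library.
final class ErrorRatioSketch {

    /** Ratio of the non-OK request rate to the total request rate for one instance. */
    static double errorRatio(double nonOkRate, double totalRate) {
        // Floating-point division: 0.0 / 0.0 is NaN, and 0.0 / x is 0.0 for any x > 0.
        return nonOkRate / totalRate;
    }

    public static void main(String[] args) {
        // No requests at all: still 0 / 0, so the result is NaN.
        System.out.println(errorRatio(0.0, 0.0));
        // Fully healthy service with traffic: the ratio is a plain 0.0.
        System.out.println(errorRatio(0.0, 2.5));
        // Some failing requests: an ordinary fraction.
        System.out.println(errorRatio(0.5, 2.5));
    }
}
```

Only the first call yields NaN, which matches the expectation that NaN should appear solely when no requests have been handled.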

While there are workarounds, pre-registration seems like the ideal fix here.
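To make the idea concrete, here is a minimal sketch of what pre-registration means for a labeled counter. It deliberately avoids any real client library: `PreRegisteredCounterSketch`, `inc`, and `scrape` are hypothetical names, the counter carries only a `code` label (the real interceptor metrics carry more), and the list of codes assumes the standard gRPC status code names.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for a labeled counter. This is NOT the java-grpc-prometheus
// or Prometheus client API; it only illustrates the effect of pre-registration.
final class PreRegisteredCounterSketch {

    // Assumed label values: the standard gRPC status code names.
    static final String[] GRPC_CODES = {
        "OK", "CANCELLED", "UNKNOWN", "INVALID_ARGUMENT", "DEADLINE_EXCEEDED",
        "NOT_FOUND", "ALREADY_EXISTS", "PERMISSION_DENIED", "RESOURCE_EXHAUSTED",
        "FAILED_PRECONDITION", "ABORTED", "OUT_OF_RANGE", "UNIMPLEMENTED",
        "INTERNAL", "UNAVAILABLE", "DATA_LOSS", "UNAUTHENTICATED"
    };

    private final String name;
    private final Map<String, LongAdder> children = new ConcurrentHashMap<>();

    PreRegisteredCounterSketch(String name, boolean preRegister) {
        this.name = name;
        if (preRegister) {
            // Create every series up front so each one is exposed with value 0.
            for (String code : GRPC_CODES) {
                children.put(code, new LongAdder());
            }
        }
    }

    /** Records one handled request; creates the series lazily if it does not exist. */
    void inc(String code) {
        children.computeIfAbsent(code, c -> new LongAdder()).increment();
    }

    /** Renders one line per existing series, roughly as a scrape would see it. */
    String scrape() {
        StringBuilder out = new StringBuilder();
        new TreeMap<>(children).forEach((code, value) ->
            out.append(name).append("{code=\"").append(code).append("\"} ")
               .append(value.sum()).append('\n'));
        return out.toString();
    }

    public static void main(String[] args) {
        PreRegisteredCounterSketch lazy =
            new PreRegisteredCounterSketch("grpc_server_handled_total", false);
        PreRegisteredCounterSketch eager =
            new PreRegisteredCounterSketch("grpc_server_handled_total", true);
        System.out.println("without pre-registration: " + lazy.scrape().length() + " chars");
        System.out.print(eager.scrape());
    }
}
```

Without pre-registration the scrape is empty until the first request arrives, so a query for `code="OK"` has nothing to match. With pre-registration every code is present at 0 from the first scrape onward.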

Additional resources:

  • https://github.com/grpc-ecosystem/go-grpc-prometheus/issues/2
  • https://www.robustperception.io/existential-issues-with-metrics

cameronbriar, Feb 17 '21 00:02