Pre-Register Server & Client Metrics
Following the guidance of Prometheus' best practices (https://prometheus.io/docs/practices/instrumentation/#avoid-missing-metrics), this library should pre-register its client/server interceptor metrics.
The Go version already does this (see InitializeMetrics here: https://github.com/grpc-ecosystem/go-grpc-prometheus/blob/master/server_metrics.go#L132).
The following query illustrates the problem caused by not pre-registering:
grpc_server_handled_total{code="OK"}
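Even if zero requests have been handled with code=OK, this query should return 0 as long as the server interceptor metrics are being scraped. Without pre-registration, the series does not exist until the first such request is handled, so the query returns nothing at all. Another example: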
sum by(instance)(rate(grpc_server_handled_total{code!~"OK"}[5m]))
/
sum by(instance)(rate(grpc_server_handled_total[5m]))
Here we're dividing "bad requests" by "total requests". If the service were 100% healthy, no series with a non-OK code would exist yet, so the numerator would be empty and the whole expression would return no data rather than 0. With pre-registration, the result should be NaN iff there have been exactly 0 requests total in the rate window (as we'd still be dividing by 0). In any other case, when there's been at least one request, the result would be a real number (regardless of the values for code).
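The three outcomes of that ratio can be modeled with a small Python sketch. This is only a toy model of how the query behaves, not PromQL itself: the function name `error_ratio` and the input shape (a dict mapping each existing `code` series to its per-second rate) are assumptions made for illustration.

```python
import math
from typing import Dict, Optional


def error_ratio(rates: Dict[str, float]) -> Optional[float]:
    """Toy model of sum(rate(non-OK)) / sum(rate(all)).

    `rates` holds one entry per series that currently exists.
    Returns None to represent an empty query result ("no data").
    """
    bad = [rate for code, rate in rates.items() if code != "OK"]
    total = list(rates.values())
    # If either side matches no series, the division yields no result.
    if not bad or not total:
        return None
    numerator, denominator = sum(bad), sum(total)
    # Rates are non-negative, so a zero denominator implies a zero
    # numerator: floating-point 0/0, which is NaN.
    if denominator == 0:
        return math.nan
    return numerator / denominator


# Without pre-registration, a healthy service has no non-OK series yet.
print(error_ratio({"OK": 2.0}))                  # None (no data)
# With pre-registration, the non-OK series exists with a rate of 0.
print(error_ratio({"OK": 2.0, "UNKNOWN": 0.0}))  # 0.0
# Pre-registered but no traffic at all: still 0/0.
print(error_ratio({"OK": 0.0, "UNKNOWN": 0.0}))  # nan
```

Only the second case changes with pre-registration, which is the case dashboards and alerts usually care about.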
While there are workarounds, pre-registration seems like the ideal fix here.
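As a rough illustration of what pre-registration means, the sketch below uses a toy labelled counter written in plain Python. It is not this library's API or any Prometheus client's API: the class `HandledCounter`, the `preregister` method, the `method` label name, and the shortened `CODES` list are all hypothetical. The idea is simply to create every method/code combination with a value of 0 before any request arrives.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical subset of gRPC status codes, for illustration only.
CODES = ["OK", "UNKNOWN", "UNAVAILABLE"]


class HandledCounter:
    """Toy stand-in for a labelled counter (not a real client library)."""

    def __init__(self) -> None:
        self.samples: Dict[Tuple[str, str], int] = {}

    def preregister(self, methods: Iterable[str],
                    codes: Iterable[str] = CODES) -> None:
        # Create each label combination at 0 so it is exposed immediately.
        codes = list(codes)
        for method in methods:
            for code in codes:
                self.samples.setdefault((method, code), 0)

    def inc(self, method: str, code: str) -> None:
        key = (method, code)
        self.samples[key] = self.samples.get(key, 0) + 1

    def expose(self) -> List[str]:
        # Render samples in a Prometheus-like text form, sorted for stability.
        return [
            f'grpc_server_handled_total{{method="{m}",code="{c}"}} {v}'
            for (m, c), v in sorted(self.samples.items())
        ]


counter = HandledCounter()
counter.preregister(["SayHello"])
counter.inc("SayHello", "OK")
for line in counter.expose():
    print(line)
```

After `preregister`, a scrape would already contain the UNKNOWN and UNAVAILABLE series at 0, so queries that filter on non-OK codes have something to match.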
Additional resources:
- https://github.com/grpc-ecosystem/go-grpc-prometheus/issues/2
- https://www.robustperception.io/existential-issues-with-metrics