[CELEBORN-1977] Add help/type on prometheus exposed metrics

Open ashangit opened this issue 9 months ago • 0 comments

What changes were proposed in this pull request?

Add HELP/TYPE annotations to the metrics exposed for Prometheus, for example:

# HELP metrics_UpdateResourceConsumptionTime_Count
# TYPE metrics_UpdateResourceConsumptionTime_Count counter
metrics_UpdateResourceConsumptionTime_Count{instance="192.168.192.143:9098",role="master"} 1 1745390288743
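To illustrate the output shape above, here is a minimal sketch of rendering one sample with its HELP/TYPE lines. It is not the code in this patch: `render_metric` and its parameters are hypothetical names chosen for illustration, and label-value escaping is left out.

```python
def render_metric(name, metric_type, labels, value, timestamp_ms, help_text=""):
    """Render one sample in Prometheus text exposition format.

    Hypothetical helper for illustration only; it does not escape
    backslashes, quotes, or newlines in label values.
    """
    # Labels are emitted in the insertion order of the dict.
    label_str = ",".join(f'{key}="{val}"' for key, val in labels.items())
    # An empty help text yields a bare "# HELP <name>" line, as in the example above.
    help_line = f"# HELP {name} {help_text}".rstrip()
    return "\n".join([
        help_line,
        f"# TYPE {name} {metric_type}",
        f"{name}{{{label_str}}} {value} {timestamp_ms}",
    ])


if __name__ == "__main__":
    print(render_metric(
        "metrics_UpdateResourceConsumptionTime_Count",
        "counter",
        {"instance": "192.168.192.143:9098", "role": "master"},
        1,
        1745390288743,
    ))
```

Called with the values from the example, the sketch prints the same three lines shown above.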

Why are the changes needed?

The Datadog agent relies on the TYPE annotation to determine the type of each exposed Prometheus metric: https://docs.datadoghq.com/integrations/openmetrics/#missing-untyped-metrics
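As a rough sketch of why the annotation matters to a scraper: in the Prometheus text format, a metric with no TYPE line is treated as untyped. The snippet below shows that lookup in a generic form. `metric_types` and `type_of` are hypothetical names, and this is not the Datadog agent's actual logic.

```python
def metric_types(exposition_text):
    """Collect the types declared by '# TYPE <name> <type>' lines."""
    declared = {}
    for line in exposition_text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == "#" and parts[1] == "TYPE":
            declared[parts[2]] = parts[3]
    return declared


def type_of(name, declared):
    """Return the declared type, or 'untyped' when no TYPE line was seen."""
    return declared.get(name, "untyped")


if __name__ == "__main__":
    text = "# TYPE metrics_UpdateResourceConsumptionTime_Count counter\n"
    declared = metric_types(text)
    print(type_of("metrics_UpdateResourceConsumptionTime_Count", declared))
    print(type_of("some_metric_without_type", declared))
```

A consumer that skips or mishandles untyped metrics would therefore miss every metric that is exposed without a TYPE line.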

Does this PR introduce any user-facing change?

No

How was this patch tested?

Started one Celeborn master instance and one worker instance with the following metrics.properties config:

*.sink.prometheusServlet.class=org.apache.celeborn.common.metrics.sink.PrometheusServlet
*.sink.jsonServlet.class=org.apache.celeborn.common.metrics.sink.JsonServlet

Then connected to the Prometheus metrics endpoints of the master and the worker. All metrics now carry the HELP/TYPE annotations.
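The manual check could also be scripted. The sketch below takes the text returned by a metrics endpoint and lists sample names that lack a HELP or TYPE line. `find_unannotated` is a hypothetical helper, not part of this patch, and it assumes each sample name matches its annotated name exactly, as in the example above. Histogram or summary suffixes such as `_bucket` or `_sum` are not handled.

```python
import re

# Leading metric name of a sample line, per the Prometheus naming rules.
SAMPLE_NAME = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")


def find_unannotated(exposition_text):
    """Return sample names that lack a preceding HELP or TYPE line."""
    helped, typed, missing = set(), set(), []
    for raw in exposition_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# HELP "):
            helped.add(line.split()[2])
        elif line.startswith("# TYPE "):
            typed.add(line.split()[2])
        elif not line.startswith("#"):
            match = SAMPLE_NAME.match(line)
            if match is None:
                continue
            name = match.group(1)
            annotated = name in helped and name in typed
            if not annotated and name not in missing:
                missing.append(name)
    return missing


if __name__ == "__main__":
    scraped = (
        "# HELP metrics_UpdateResourceConsumptionTime_Count\n"
        "# TYPE metrics_UpdateResourceConsumptionTime_Count counter\n"
        'metrics_UpdateResourceConsumptionTime_Count{role="master"} 1 1745390288743\n'
    )
    print(find_unannotated(scraped))
```

An empty result for both the master and the worker output would match the observation above.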

ashangit, Apr 23 '25 06:04