prometheus emitter extension doesn't emit any metrics

Open divick opened this issue 3 years ago • 1 comments

Affected Version

0.22.1

Description

I have a small Druid cluster with 3 nodes: one running historical/middleManager, the second running broker/router, and the third running the coordinator.

I wanted to get metrics from my cluster on segments, compaction tasks, ingestion tasks, etc. I initially tried the druid-exporter plugin, but that lumps all the metrics into a single metric called druid_emitted_metrics. I then tried the prometheus-emitter extension, which probably doesn't require running any extra utility like druid-exporter. I ran into several hurdles just getting this extension to work. The pull-deps utility doesn't seem to download the jars for this plugin, so I had to download all the jars manually and place them in the libs and extensions folders. With this I was able to get a bit further: the middleManager and historical seem to be running now, and I can see that port 9999 is being listened on. This is the port I have configured:

druid.extensions.loadList=["mysql-metadata-storage", .....,  "prometheus-emitter"]
druid.emitter.prometheus.port=9999
druid.emitter=prometheus
druid.monitoring.monitors=["org.apache.druid.java.util.metrics.SysMonitor","org.apache.druid.java.util.metrics.JvmMonitor"]

But when I try to fetch metrics from port 9999, the request just hangs indefinitely:

curl -X GET http://localhost:9999/metrics

The command above just hangs and never returns a response.
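To tell "nothing is listening" apart from "something accepts the connection but never answers", a bounded probe can help. The following is a minimal sketch, not part of Druid: `probe_metrics` is a hypothetical helper name, and the 3-second timeout is an arbitrary choice. With curl, the `--max-time` option gives a similar bound.

```python
import socket


def probe_metrics(host: str, port: int, timeout: float = 3.0) -> str:
    """Send a plain HTTP GET /metrics and classify how the attempt ends.

    Returns 'response' if any bytes come back, 'closed' if the peer closes
    without sending anything, 'refused' if nothing is listening, and
    'timeout' if connecting or reading exceeds `timeout` seconds.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            request = f"GET /metrics HTTP/1.0\r\nHost: {host}:{port}\r\n\r\n"
            sock.sendall(request.encode("ascii"))
            data = sock.recv(1024)
            return "response" if data else "closed"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
```

A result of 'timeout' on localhost:9999 would match the hang described above: the port is open, but whatever owns it is not serving the request.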

divick, Aug 10 '22 08:08

Hello.

Did you check the emitter's log? Try adding the druid.emitter.logging.logLevel=debug config. In my case, the node running both the broker and the router did not show metrics because the two processes each start their own Prometheus HTTP server, so a "port already in use" error occurred.
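A quick way to check for this kind of conflict before starting a second process on the same host is to try binding the port yourself. This is only a sketch under assumptions: `port_is_free` is a hypothetical helper, and it tests a plain TCP bind, which is not necessarily how the emitter opens its server.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically "address already in use": another process owns the port.
            return False
        return True
```

If `port_is_free(9999)` returns False while only one Druid process is up, a second process configured with the same port on that host would be expected to hit the error described above.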

holyachon, Aug 12 '22 07:08

@holyachon No, I have not tried this, but I can try it and see.

divick, Aug 19 '22 07:08

@holyachon To make it easier to debug and change configuration, I have provisioned Druid in k8s using druid-operator, instead of logging in to each server separately (master, data, query), changing the configuration, and restarting. With that I can see that metrics are being emitted. My guess is that they were not emitted earlier because metricsDimensions.json was not specified. In any case, now that some metrics are being emitted, I see another issue: the emitter complains about unmapped metrics:

[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [query/timeout/count]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/threads/started]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/threads/finished]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/threads/live]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/threads/liveDaemon]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/threads/livePeak]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/max]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/capacity]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/used]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/init]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/max]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/capacity]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/used]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/init]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/max]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/capacity]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/used]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/init]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/max]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/capacity]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/used]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/init]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/heapAlloc/bytes]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [avatica/server/AvaticaProtobufHandler/Handler/RequestTimings]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [avatica/server/AvaticaJsonHandler/Handler/RequestTimings]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [avatica/remote/ProtobufHandler/Handler/Serialization]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [avatica/remote/JsonHandler/Handler/Serialization]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/cpu/user]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/cpu/total]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/cpu/sys]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/cpu/percent]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [query/cache/delta/put/ok]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [query/cache/delta/put/error]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [query/cache/delta/put/oversized]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [query/cache/total/put/ok]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [query/cache/total/put/error]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [query/cache/total/put/oversized]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [query/cache/caffeine/delta/requests]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [query/cache/caffeine/total/requests]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [query/cache/caffeine/delta/loadTime]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [query/cache/caffeine/total/loadTime]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [query/cache/caffeine/delta/evictionBytes]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [query/cache/caffeine/total/evictionBytes]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jetty/numOpenConnections]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jetty/threadPool/total]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jetty/threadPool/idle]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jetty/threadPool/isLowOnThreads]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jetty/threadPool/min]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jetty/threadPool/max]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jetty/threadPool/queueSize]
[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jetty/threadPool/busy]
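Since the same names repeat many times, it can help to reduce the log to the distinct unmapped metrics and generate starter entries for the mapping file. The sketch below assumes the log line format shown above and the entry shape used later in this thread; `unmapped_metrics` and `skeleton` are hypothetical helper names, and "gauge" is only a placeholder type that would need reviewing per metric.

```python
import json
import re

# Matches the "Unmapped metric [name]" part of the emitter's log lines.
UNMAPPED = re.compile(r"Unmapped metric \[([^\]]+)\]")


def unmapped_metrics(log_lines):
    """Return distinct unmapped metric names in first-seen order."""
    seen = {}
    for line in log_lines:
        match = UNMAPPED.search(line)
        if match:
            seen.setdefault(match.group(1), None)
    return list(seen)


def skeleton(names, metric_type="gauge"):
    """Build placeholder mapping entries; the type is a guess to be reviewed."""
    return {name: {"dimensions": [], "type": metric_type} for name in names}


sample = [
    "[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/max]",
    "[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/gc/mem/max]",
    "[MonitorScheduler-0] org.a.d.e.p.PrometheusEmitter - Unmapped metric [jvm/cpu/user]",
]
print(json.dumps(skeleton(unmapped_metrics(sample)), indent=2))
```

Feeding the full log through `unmapped_metrics` gives a short list to compare against the mapping file, instead of scanning repeated lines by eye.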

This is seen on the broker and other nodes too, even though I have extended metricsDimensions.json as per the docs so that these metrics are mapped. As an example, to enable the JVM thread metrics I added org.apache.druid.java.util.metrics.JvmThreadsMonitor to the druid.monitoring.monitors list and extended metricsDimensions.json as below:

      "jvm/threads/started" : { "dimensions" : [], "type" : "gauge" },
      "jvm/threads/finished" : { "dimensions" : [], "type" : "gauge" },
      "jvm/threads/live" : { "dimensions" : [], "type" : "gauge" },

But even then I see these metrics reported as unmapped in the logs. Any thoughts? Should I create a separate issue on GitHub for this?
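One thing worth ruling out is a problem in the mapping file itself, or the emitter reading a different file than the one that was edited (as far as I recall, the extension takes the file location from the druid.emitter.prometheus.dimensionMapPath property). The sketch below checks that the file content parses as JSON and that each wanted metric has an entry of the shape shown above. `check_dimension_map` is a hypothetical helper, and the set of allowed types is an assumption to verify against the extension docs.

```python
import json

# Assumption: the metric types the emitter accepts; verify against the docs.
ALLOWED_TYPES = {"count", "gauge", "timer"}


def check_dimension_map(text, wanted):
    """Return a list of problems found for the `wanted` metric names."""
    try:
        mapping = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg} (line {exc.lineno})"]
    if not isinstance(mapping, dict):
        return ["top level is not a JSON object"]
    problems = []
    for name in wanted:
        entry = mapping.get(name)
        if not isinstance(entry, dict):
            problems.append(f"{name}: no entry")
        elif not isinstance(entry.get("dimensions"), list):
            problems.append(f"{name}: 'dimensions' is not a list")
        elif entry.get("type") not in ALLOWED_TYPES:
            problems.append(f"{name}: unexpected type {entry.get('type')!r}")
    return problems
```

For example, passing the file's text together with ["jvm/threads/started", "jvm/threads/finished", "jvm/threads/live"] returns an empty list when all three entries look well-formed. A trailing comma after the last entry in the file would show up as a "not valid JSON" problem.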

divick, Oct 10 '22 15:10

Closing this; I will reopen it if I get back to running Druid manually instead of on a k8s cluster. Opened #13202 to keep track of the "unmapped" metric logs seen in Druid.

divick, Oct 10 '22 15:10