client_python

gauge.set_function() doesn't work in multiprocess mode

Open MatthewMaclean opened this issue 5 years ago • 6 comments

Multiprocess mode's collect() reads the registry files and aggregates metrics that have been written to prometheus_multiproc_dir.

This doesn't work with gauge.set_function(), which does not record its value anywhere. The provided function is only called during collection, in the process that registered it.

That means with the following code:

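from prometheus_client import CollectorRegistry, Gauge, multiprocess

registry = CollectorRegistry()
# The gauge's value comes from a callback; nothing is written to prometheus_multiproc_dir.
Gauge("test", "test", registry=registry).set_function(lambda: 100)
multiprocess.MultiProcessCollector(registry)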

The output will be:

# HELP test test
# TYPE test gauge
test 100.0
# HELP test Multiprocess metric
# TYPE test gauge
test{pid="10705"} 0.0
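
To make the mechanism concrete, here is a toy model using only the standard library. It is not how prometheus_client is implemented: ToyGauge, collect_from_files, and the JSON file layout are hypothetical names invented for this sketch. It only assumes what is described above, namely that collection reads recorded values from files and never calls the callback.

```python
import json
import os
import tempfile


class ToyGauge:
    """Hypothetical stand-in for a multiprocess gauge: only set() touches the file."""

    def __init__(self, name, directory):
        self._path = os.path.join(directory, f"{name}_{os.getpid()}.json")
        self._function = None
        self._write(0.0)  # a default value is recorded as soon as the gauge exists

    def _write(self, value):
        with open(self._path, "w") as fh:
            json.dump({"value": float(value)}, fh)

    def set(self, value):
        self._write(value)

    def set_function(self, function):
        # Kept in this process's memory only; nothing is written to disk.
        self._function = function


def collect_from_files(directory):
    """Toy collector: reads recorded values only and never calls callbacks."""
    samples = {}
    for filename in sorted(os.listdir(directory)):
        with open(os.path.join(directory, filename)) as fh:
            samples[filename] = json.load(fh)["value"]
    return samples


with tempfile.TemporaryDirectory() as directory:
    gauge = ToyGauge("test", directory)
    gauge.set_function(lambda: 100)
    callback_only = list(collect_from_files(directory).values())

    gauge.set(100)  # recording the value explicitly makes it visible
    after_set = list(collect_from_files(directory).values())

print(callback_only, after_set)  # [0.0] [100.0]
```

In this model the callback-backed gauge collects as 0.0 because the file-based collector has no way to reach the function, while a value recorded with set() is picked up.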

Current side effects:

  • If any mode other than all or liveall is used, the pid tag won't be included. This results in duplicate metrics being reported to Prometheus. Prometheus currently only uses the first metric it reads, and which one comes first is non-deterministic due to iteration over the registry's dictionary.
  • If registry=None is passed to avoid double reporting, only the default value of 0.0 is reported.
  • The current workaround is to use the mode all and, in Prometheus, to ignore the gauges that carry the pid tag.

Proposal: I'm not sure how you could incorporate set_function into the multiprocess registry, and I'm not sure how useful a feature it would be. Is it reasonable to add a new multiprocess_mode, exclude, which would prevent the incorrect 0.0 value from being reported? Or would it be better to just add documentation recommending two independent registries?

MatthewMaclean, Jan 29 '20 20:01

This counts as a custom collector, and custom collectors cannot be made to work with multiprocess mode.

brian-brazil, Jan 29 '20 21:01

Makes sense. I'll look for ways to work around that and ask on the mailing list.

MatthewMaclean, Jan 30 '20 15:01

Given that multiprocess mode is getting some traction, is it worth re-opening this?

hdost, Dec 06 '22 13:12

I'm still not sure how this could be done in multiprocess mode. You would need to run the function for each process, which is not a pattern we have today, and I believe it would need to integrate with whatever is starting the processes. If there is a simpler way I am all ears, but otherwise I think this should probably stay closed.

csmarchbanks, Dec 14 '22 20:12

Perhaps just documentation, or a logged failure. Right now you just have to wonder.

hdost, Dec 18 '22 13:12

:+1: Documentation on the function is a good idea to avoid confusion.

csmarchbanks, Dec 19 '22 19:12