
Async collector/gather interface

Open robo-corg opened this issue 5 years ago • 3 comments

This might be a bad fit, since async metrics don't come up much, but I figured I would file it here: I ended up writing something like this for work, and it might be nice to upstream it.

Is your feature request related to a problem? Please describe.

It would be nice to be able to easily write gauges backed by futures or streams, such as a sqlx query.

Describe the solution you'd like

If gather and/or Collector were (optionally?) async, that might allow for async metrics. Alternatively, some notion of an AsyncRegistry, which could possibly live in another crate entirely.
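To make the idea concrete, here is a std-only sketch of one possible shape for an async Collector, where `collect` returns a boxed future so implementations can await a query inside. `MetricFamily`, `AsyncCollector`, `ConstGauge`, and `poll_once` are all illustrative stand-ins, not rust-prometheus API:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Stand-in for prometheus::proto::MetricFamily (illustrative only).
#[derive(Debug, Clone, PartialEq)]
pub struct MetricFamily {
    pub name: String,
    pub value: f64,
}

// One possible shape for an async collector: `collect` returns a boxed
// future instead of a Vec directly, so implementations can await inside.
pub trait AsyncCollector {
    fn collect(&self) -> Pin<Box<dyn Future<Output = Vec<MetricFamily>> + Send + '_>>;
}

pub struct ConstGauge;

impl AsyncCollector for ConstGauge {
    fn collect(&self) -> Pin<Box<dyn Future<Output = Vec<MetricFamily>> + Send + '_>> {
        Box::pin(async {
            // A real implementation would `.await` a DB query here.
            vec![MetricFamily { name: "up".into(), value: 1.0 }]
        })
    }
}

// Minimal single-poll executor so the sketch runs without an async runtime.
// It only handles futures that are immediately ready, which is all we need here.
pub fn poll_once<T>(mut fut: Pin<Box<dyn Future<Output = T> + Send + '_>>) -> T {
    fn noop(_: *const ()) {}
    fn clone(p: *const ()) -> RawWaker { RawWaker::new(p, &VTABLE) }
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    let waker = unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) };
    let mut cx = Context::from_waker(&waker);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(v) => v,
        Poll::Pending => panic!("sketch executor only handles ready futures"),
    }
}

fn main() {
    let families = poll_once(ConstGauge.collect());
    assert_eq!(families[0].name, "up");
    println!("{:?}", families);
}
```

In a real design, `gather` on an AsyncRegistry would await each collector's future (possibly concurrently) before merging the results.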

Describe alternatives you've considered

  1. Implementing collect and using futures::executor::block_on inside seems to work, but this potentially ties up the thread gather runs on. Since that same thread is likely serving a future if the metrics scrape endpoint uses actix, warp, etc., this is at best wasteful and could tie up important worker threads.
  2. Perform a pass before Registry::gather is called that updates normal synchronous metrics. I went with this and wrapped the registry with a type I ended up calling AsyncRegistry. It seems to work reasonably well, but all the machinery for doing this needs to be built.
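Alternative 2 can be sketched with the standard library alone. `Gauge` stands in for prometheus::IntGauge, and the updater closures stand in for awaited async work (in the real crate they would be futures run on an executor before delegating to the wrapped Registry's gather):

```rust
use std::sync::{Arc, Mutex};

// Std-only stand-in for prometheus::IntGauge (illustrative only).
#[derive(Clone, Default)]
pub struct Gauge(Arc<Mutex<i64>>);

impl Gauge {
    pub fn set(&self, v: i64) { *self.0.lock().unwrap() = v; }
    pub fn get(&self) -> i64 { *self.0.lock().unwrap() }
}

// Hypothetical wrapper: holds updater callbacks that refresh plain
// synchronous gauges right before the wrapped registry is gathered.
#[derive(Default)]
pub struct AsyncRegistry {
    updaters: Vec<Box<dyn Fn() + Send + Sync>>,
}

impl AsyncRegistry {
    pub fn register_async(&mut self, update: impl Fn() + Send + Sync + 'static) {
        self.updaters.push(Box::new(update));
    }

    // Run every updater, then the wrapped prometheus Registry's gather()
    // (elided here) would read the freshly written gauges.
    pub fn update_then_gather(&self) {
        for update in &self.updaters {
            update();
        }
    }
}

fn main() {
    let mut registry = AsyncRegistry::default();
    let gauge = Gauge::default();

    let g = gauge.clone();
    // Stand-in for an awaited DB count; a real updater would run the query.
    registry.register_async(move || g.set(42));

    registry.update_then_gather();
    assert_eq!(gauge.get(), 42);
    println!("gauge = {}", gauge.get());
}
```

The upside of this design is that the existing synchronous Collector machinery stays untouched; the cost is the extra wrapper layer and the need to remember to call the update pass before every scrape.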

Additional context

Here is an example pulled from the crate I wrote, which tacks the concept of an AsyncRegistry on top of the normal prometheus::Registry:

let registry = AsyncRegistry::new();

let workflow_state_counts_opts = Opts::new(
    "workflow_state_counts",
    "Number of workflows in state",
);

// `register_async` needs a DbConnectionSource. This should be backed by your application's
// database pool.
let pool = PgPool::new("postgresql://localhost/postgres")
    .await
    .expect("Create db pool");

// Register the async metric labeled by `workflow_state`. See the docs for DbIntGauge
// on the conventions for writing queries.
registry
    .register_async(DbMetric::new(
        pool,
        DbIntGauge::new(
            r#"
            SELECT workflow_state, count(uuid) as value
            FROM workflows
            GROUP BY workflow_state
            "#,
            &["workflow_state"],
            IntGaugeVec::new(workflow_state_counts_opts, &["workflow_state"])
                .expect("Error creating workflow_state_counts metric"),
        ),
    ))
    .expect("Error registering workflow_state_counts metric");

I will probably need to split these more-expensive-to-collect metrics into a separate scrape eventually, but other than that it seems to work nicely.

I understand if async is out of the scope of this crate or considered an anti-feature. I figured I would check in here first, since first-class async support would probably be a lot cleaner than anything tacked on.

robo-corg avatar Dec 15 '20 21:12 robo-corg

First of all thanks for the detailed feature request @robo-corg.

I understand if async is out of the scope of this crate or considered an anti-feature.

As a first thought, I find supporting async collectors too specific for this general-purpose library, though I would not consider it an anti-feature. If there are multiple users interested in async collectors, I think it is worth considering and working on a design.

Supporting async collectors is especially interesting for rust-prometheus users building exporters instead of natively instrumenting their logic.

I figured I would check in here first since first class async support would probably a lot cleaner than anything tacked on.

Agreed. I would estimate this to require some larger design work.

What do others think?

mxinden avatar Dec 16 '20 16:12 mxinden

I think as a low-hanging fruit you should mention something about this in the README.md, so that when a newbie comes around (like me), they will have a hint. My question is simply whether it is safe to call this library from async code or not. I imagine it is, but my thread will do the calculation, so I should be aware of that. Anyway, great library; it took me 5 minutes to set up an endpoint with a counter.
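For what it's worth, gather is a plain synchronous call, so calling it from async code works; it just does its encoding work on the calling thread. If that's a concern, the usual pattern is to offload it to a thread that is allowed to block (tokio users would typically reach for `tokio::task::spawn_blocking`). A std-only sketch of the idea, where `gather_text` is a hypothetical stand-in for encoding the output of `Registry::gather()`:

```rust
use std::thread;

// Stand-in for running a text encoder over Registry::gather();
// the real call is synchronous and CPU-bound on the calling thread.
fn gather_text() -> String {
    "# HELP requests_total Total requests.\nrequests_total 1\n".to_string()
}

fn main() {
    // Offload the synchronous gather so an async worker thread stays free.
    let body = thread::spawn(gather_text).join().unwrap();
    assert!(body.contains("requests_total 1"));
    println!("{}", body);
}
```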

zsedem avatar Dec 24 '21 06:12 zsedem

It's been a while since this discussion was last updated, but for people building exporters this would really be helpful. It would allow tying the (async) collection to the actual scrapes, which I'd say is more reliable than collecting the data in the background and letting Prometheus just scrape the internal "cache".

Having to maintain two scrape intervals usually results in unreliable data or lost precision.

kwiesmueller avatar Sep 02 '22 04:09 kwiesmueller