fluent-plugin-prometheus

`output_status_retry_wait` means the elapsed time from the first retry

Open daipom opened this issue 2 years ago • 2 comments

  • From https://github.com/fluent/fluentd/issues/4233.

The README says `output_status_retry_wait` means "current retry_wait computed from last retry time and next retry time".

https://github.com/fluent/fluent-plugin-prometheus/blob/41fa2df366ceef7b46de154859c852926cba48a9/README.md#L122-L123

However, it actually reports the elapsed time from the first retry. We need to fix either the value or the documentation.

https://github.com/fluent/fluent-plugin-prometheus/blob/41fa2df366ceef7b46de154859c852926cba48a9/lib/fluent/plugin/in_prometheus_output_monitor.rb#L194-L207

It is calculated as `next_time - start`. `start` is the time of the first retry, so this value is the elapsed time from the first retry to the next scheduled retry, not the wait between the last retry and the next one.

  • https://github.com/fluent/fluentd/blob/aaa40f71758eb5fa7b3a89084e69a00c0910fb3b/lib/fluent/plugin/in_monitor_agent.rb#L368-L384
  • https://github.com/fluent/fluentd/blob/aaa40f71758eb5fa7b3a89084e69a00c0910fb3b/lib/fluent/plugin_helper/retry_state.rb
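To make the difference concrete, here is a minimal sketch in plain Ruby. `RetrySnapshot`, `simulate_retries`, `elapsed_since_start`, and `current_wait` are hypothetical names made up for illustration; they are not Fluentd's `RetryState` API, and the backoff waits of 1, 2, 4, and 8 seconds are assumed values. It only contrasts `next_time - start` with the interval the README wording suggests (`next_time` minus the last retry time).

```ruby
# Hypothetical stand-in for the retry fields discussed in this issue.
# This is NOT Fluentd's actual RetryState class.
RetrySnapshot = Struct.new(:start, :last_time, :next_time)

# Build one snapshot per retry step. `waits` holds the assumed backoff
# interval (in seconds) before each retry.
def simulate_retries(start, waits)
  last_time = start
  waits.map do |wait|
    next_time = last_time + wait
    snapshot = RetrySnapshot.new(start, last_time, next_time)
    last_time = next_time
    snapshot
  end
end

# The calculation described in this issue: next_time - start.
def elapsed_since_start(snapshot)
  snapshot.next_time - snapshot.start
end

# The reading suggested by the README wording: next_time - last retry time.
def current_wait(snapshot)
  snapshot.next_time - snapshot.last_time
end

snapshots = simulate_retries(0, [1, 2, 4, 8])
snapshots.each do |s|
  puts "next_time - start = #{elapsed_since_start(s)}, " \
       "next_time - last_time = #{current_wait(s)}"
end
```

Under these assumptions the two values agree only on the first step and diverge afterwards: the first keeps accumulating while the second tracks the individual backoff interval. A dashboard built on the README's description would therefore misread the metric.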

daipom avatar Jul 09 '23 15:07 daipom

Thanks for your effort, @daipom. This fix/clarification is important IMO, as many dashboard metrics depend on Prometheus and this value can cause noise if someone interprets it the wrong way.

raulgupto avatar Jul 09 '23 16:07 raulgupto
