`output_status_retry_wait` means the elapsed time from the first retry
- From https://github.com/fluent/fluentd/issues/4233.
The document says `output_status_retry_wait` means the current retry_wait, computed from the last retry time and the next retry time.
https://github.com/fluent/fluent-plugin-prometheus/blob/41fa2df366ceef7b46de154859c852926cba48a9/README.md#L122-L123
However, it actually means the elapsed time from the first retry. We need to fix the value or the document.
https://github.com/fluent/fluent-plugin-prometheus/blob/41fa2df366ceef7b46de154859c852926cba48a9/lib/fluent/plugin/in_prometheus_output_monitor.rb#L194-L207
It is calculated as `next_time - start`.
`start` is the time of the first retry, so this value is the elapsed time from the first retry to the next scheduled retry, not the wait between the last retry and the next one.
- https://github.com/fluent/fluentd/blob/aaa40f71758eb5fa7b3a89084e69a00c0910fb3b/lib/fluent/plugin/in_monitor_agent.rb#L368-L384
- https://github.com/fluent/fluentd/blob/aaa40f71758eb5fa7b3a89084e69a00c0910fb3b/lib/fluent/plugin_helper/retry_state.rb
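
To make the difference concrete, here is a minimal sketch in plain Ruby. It does not call any Fluentd code: the helper names `elapsed_from_first_retry` and `wait_until_next_retry` and all timestamps are made up for illustration, and the "last retry" reading is only my interpretation of the README wording.

```ruby
# Hypothetical helpers; neither exists in Fluentd or fluent-plugin-prometheus.

# The calculation described in this issue: next_time - start,
# where start is the time of the first retry.
def elapsed_from_first_retry(start, next_time)
  (next_time - start).to_i
end

# The reading the README wording suggests (assumption):
# next retry time minus last retry time.
def wait_until_next_retry(last_retry, next_time)
  (next_time - last_retry).to_i
end

# Made-up timestamps (seconds since the epoch), for illustration only.
start      = Time.at(1_000) # first retry
last_retry = Time.at(1_014) # most recent retry
next_time  = Time.at(1_030) # next scheduled retry

puts "next_time - start      = #{elapsed_from_first_retry(start, next_time)}"
puts "next_time - last_retry = #{wait_until_next_retry(last_retry, next_time)}"
```

With these sample values the two readings give different numbers, and the gap grows with every retry, which is why a dashboard built on the README's description could be misleading.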
Thanks for your effort, @daipom. This fix/clarification is important IMO, as many dashboards depend on Prometheus metrics, and misinterpreting this value can cause noise.