airflow icon indicating copy to clipboard operation
airflow copied to clipboard

Simplify metrics handling

Open ferruzzi opened this issue 1 year ago • 6 comments

What happened?

When OTel support was added, it enabled adding tags to metrics. Metrics such as ti.start.<dag_id>.<task_id> became ti.start with the tags for dag_id and task_id. Statsd does not support tagging, so all metrics had to be emitted twice; once with tagging and once assembled as a single long name. Over time, we've forgotten this and some newer metrics have been added which only emit one way or the other.

What you think should happen instead?

I propose that airflow.metrics.statd_logger should be modified to assemble the name and tags into a long name, then any and all double-emitted metrics can be consolidated and future metrics can be easier and more consistently implemented.

How to reproduce

Example of some "double-emitted" metrics: image

in this case, statsd.incr() accepts the tags but does nothing with them. If it did a ("_").join(name, tags) before emitting, then we could drop the redundant metric and simplify future additions.

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

ferruzzi avatar Apr 01 '24 17:04 ferruzzi

Yes. Good idea. Let's do it.

potiuk avatar Apr 01 '24 21:04 potiuk

Not sure it really qualifies as a bug, but it didn't really feel like a "feature request" either. Feel free to adjust my tagging.

I've added "good first issue" because this ins one of those "easy to implement but time consuming" ones IMHO.

If anyone wants to take it on and needs some guidance, hit me up on Slack

ferruzzi avatar Apr 01 '24 22:04 ferruzzi

please assign me

Jack-R-lantern avatar Apr 04 '24 04:04 Jack-R-lantern

I think the easiest answer would be to convert the existing metrics tags dicts to ordereddicts, then in the statsd_logger module you can join them and know they'll always be the same order.

ferruzzi avatar Apr 04 '24 05:04 ferruzzi

Hi I would be willing to work on this.

Ibrahim-Mukherjee avatar Apr 06 '24 14:04 Ibrahim-Mukherjee

@ferruzzi Hi. I came across this article while troubleshooting an issue. It looks like it's okay to combine metric names using dict instead of using ordereddict, what do you think?

Jack-R-lantern avatar Apr 28 '24 12:04 Jack-R-lantern

hmmmmmmm. yeah, I think maybe you are right. As of 3.8, dicts are guaranteed to retain the order they were inserted and we require 3.8+ for Airflow so.... yeah, that should work and make things easier. Sorry about that.

ferruzzi avatar Apr 29 '24 16:04 ferruzzi