
[AIP-49] Airflow support for OTEL traces

Open howardyoo opened this issue 1 year ago • 1 comments

Description

We have successfully released OTEL support for Airflow covering the metrics part. Now the time is right to move into the next chapter: adding tracing support for Airflow. Tracing will supply the additional detail that is not present in metrics. Metrics are lightweight and useful for detecting the overall health of Airflow. However, when you dig deeper to debug a root cause, metrics often do not contain that detail, and you have to drill down with things like logs and traces.

OTEL traces also cover the interactions between the different Airflow components and show a holistic picture of how each DAG run flowed through schedulers, DAG processors, workers, and eventually executors. They provide the linkage needed to understand how a single DAG run executed, which components were involved, and how each task ultimately ended up.

The last part of the OTEL implementation would be logging support: logging handlers that emit log messages using the OTEL standard to an OTEL-compatible endpoint such as the OTEL Collector. However, that is a separate initiative outside the scope of this work, and it should follow after tracing support is done.

For more details, please see AIP-49.

Use case/motivation

For each DAG run, Airflow should keep track of its progress through the full cycle (from the start of the DAG run to its finish) and emit traces and spans conforming to the OpenTelemetry specification, using the OTEL SDK and API. The Airflow configuration should also provide options to control how tracing works.
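To make the intended shape of the data concrete, here is a minimal sketch of one trace per DAG run with one child span per task. It deliberately uses only the Python standard library and is not the OTEL SDK or the implementation proposed in AIP-49. The names `Span`, `new_span`, `trace_dag_run`, and the `dag_run.` / `task.` span naming are all hypothetical. The only parts taken from the OpenTelemetry data model are the ID sizes (16-byte trace ID, 8-byte span ID) and the parent/child linkage.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Toy stand-in for an OpenTelemetry span (hypothetical, not the SDK class)."""
    name: str
    trace_id: str             # 32 hex chars, shared by every span of one DAG run
    span_id: str              # 16 hex chars, unique per span
    parent_id: Optional[str]  # None for the root (DAG run) span
    attributes: dict = field(default_factory=dict)
    start_ns: int = 0
    end_ns: int = 0


def new_span(name, trace_id, parent_id=None, **attributes):
    """Create a span with a fresh span ID and the current time as its start."""
    return Span(
        name=name,
        trace_id=trace_id,
        span_id=secrets.token_hex(8),
        parent_id=parent_id,
        attributes=attributes,
        start_ns=time.time_ns(),
    )


def trace_dag_run(dag_id, task_ids):
    """Return a root span for the DAG run followed by one child span per task."""
    trace_id = secrets.token_hex(16)
    root = new_span(f"dag_run.{dag_id}", trace_id, dag_id=dag_id)
    spans = [root]
    for task_id in task_ids:
        child = new_span(
            f"task.{task_id}", trace_id, parent_id=root.span_id, task_id=task_id
        )
        # A real task would do its work here before the span is closed.
        child.end_ns = time.time_ns()
        spans.append(child)
    root.end_ns = time.time_ns()
    return spans


if __name__ == "__main__":
    for span in trace_dag_run("example_dag", ["extract", "transform", "load"]):
        print(span.name, span.trace_id, span.span_id, span.parent_id)
```

Every span in the run shares the same trace ID, and each task span points back at the DAG run span, which is the linkage a tracing backend needs to reconstruct the full cycle of a run.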

Related issues

No response

Are you willing to submit a PR?

  • [X] Yes I am willing to submit a PR!

Code of Conduct

howardyoo avatar Feb 27 '24 18:02 howardyoo

We need better monitoring in Astronomer, and this would help a lot.

KimboTodd avatar May 01 '24 20:05 KimboTodd

Hi @howardyoo

Just wanted to check if this work is still planned?

Thanks!

woody1872 avatar May 22 '24 21:05 woody1872

Yes, of course, @woody1872! I have a PR ready for this and am waiting for any able Airflow contributors to review and approve it. Still waiting, though. https://github.com/apache/airflow/pull/37948

howardyoo avatar May 22 '24 23:05 howardyoo