
OpenLineage run start time can be set multiple times

Open collado-mike opened this issue 3 years ago • 0 comments

In some cases, an OpenLineage run can have multiple START events sent. For example, a Spark SQL job execution triggers both the SQL start event and the Spark job event, and an Airflow async operator may fire one start event when the task begins and another when the async task completes. Marquez updates the start time based on the last START event received (see code here), so we end up with an inaccurate picture of the actual execution time of the OpenLineage job run. We should update the code to capture the actual start time of the run rather than the time of the last START event received.
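One possible shape for the fix, as a minimal sketch only: keep the earliest START time seen for a run instead of overwriting it on every START event. `RunRecord` and `apply_start_event` are hypothetical names used for illustration and are not Marquez's actual classes or methods. The sketch also assumes that the earliest event time is the "actual" start, which may need discussion.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class RunRecord:
    """Hypothetical stand-in for a stored run row; not Marquez's real model."""
    run_id: str
    started_at: Optional[datetime] = None


def apply_start_event(run: RunRecord, event_time: datetime) -> RunRecord:
    """Record a START event, keeping the earliest start time seen so far."""
    # Only overwrite when no start is stored yet or the new event is earlier,
    # so later (or out-of-order) START events cannot move the start forward.
    if run.started_at is None or event_time < run.started_at:
        run.started_at = event_time
    return run


# Two START events for the same run, five minutes apart.
first_start = datetime(2023, 2, 6, 20, 0, 0, tzinfo=timezone.utc)
second_start = datetime(2023, 2, 6, 20, 5, 0, tzinfo=timezone.utc)

run = RunRecord(run_id="example-run")
apply_start_event(run, first_start)
apply_start_event(run, second_start)
```

With this approach the stored start time stays at the first event's time even when the second START arrives, and the result is the same if the two events are received in the opposite order.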

collado-mike, Feb 06 '23 20:02