Alan Ma
Yep! I linked the [GitHub Discussion](https://github.com/celery/celery/discussions/7276) from the Celery GitHub repository. This is mainly a record to keep track of this behaviour.
The new liveness probe on the Celery Worker (#25561) doesn't fix the root issue, but it does put in a safeguard that resets workers when they become catatonic.
Any updates on this issue?
I was able to reproduce this issue and narrow it down to the handler's close function as well. `mlflow` is clearing existing handlers, which closes the S3 handler prematurely. To trace the source of the...
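For anyone trying to picture the failure mode, here is a minimal stdlib-only sketch. It assumes the handlers are cleared through `logging.config.dictConfig`, which closes every registered handler when it is called non-incrementally; whether that is the exact call path inside `mlflow` is an assumption. `UploadOnCloseHandler` is a hypothetical stand-in for a remote handler that ships its logs once, on `close()`. It is not the real S3 handler.

```python
import logging
import logging.config


class UploadOnCloseHandler(logging.Handler):
    """Hypothetical stand-in for a remote log handler that uploads once, on close()."""

    def __init__(self):
        super().__init__()
        self.buffer = []        # every message emitted to this handler
        self.uploaded = None    # snapshot taken at the first close()
        self.close_calls = 0

    def emit(self, record):
        self.buffer.append(record.getMessage())

    def close(self):
        self.close_calls += 1
        if self.uploaded is None:
            # Pretend this is the one-time upload to remote storage.
            self.uploaded = list(self.buffer)
        super().close()


task_logger = logging.getLogger("demo.task")
task_logger.setLevel(logging.INFO)
task_logger.propagate = False
handler = UploadOnCloseHandler()
task_logger.addHandler(handler)

task_logger.info("before reconfiguration")

# A library reconfiguring logging this way closes all existing handlers,
# including ones it does not own.
logging.config.dictConfig({"version": 1, "disable_existing_loggers": False})

# The handler is still attached, so this message is emitted,
# but the "upload" has already happened without it.
task_logger.info("after reconfiguration")

print("uploaded:", handler.uploaded)
print("buffered:", handler.buffer)
```

In this sketch the handler is closed by the reconfiguration rather than by the task finishing, so anything logged afterwards never reaches the uploaded snapshot.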
Treating deferred as a "running" (non-terminal) state would let the LocalTaskJobRunner and StandardTaskRunner run indefinitely (disregarding the trigger) when behaviours like #40435 manifest.
Instead of setting the state to SKIPPED, I propose calling [handle_failure](https://github.com/apache/airflow/blob/2.5.1/airflow/models/taskinstance.py#L1753) so that the callbacks are executed.
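To illustrate the difference, here is a toy model rather than Airflow code. `ToyTaskInstance`, `mark_skipped`, and the callback payload are hypothetical names made up for this sketch. Only the idea that a failure path runs the callbacks while a plain state change does not comes from the proposal.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToyTaskInstance:
    """Hypothetical, simplified task instance; not the Airflow model."""

    task_id: str
    state: str = "running"
    on_failure_callback: Optional[Callable[[dict], None]] = None

    def mark_skipped(self) -> None:
        # Plain state change: no callback is invoked.
        self.state = "skipped"

    def handle_failure(self, error: str) -> None:
        # Failure path: set the state, then run the user's callback if any.
        self.state = "failed"
        if self.on_failure_callback is not None:
            self.on_failure_callback({"task_id": self.task_id, "error": error})


alerts = []  # collects whatever the failure callback receives

ti_skipped = ToyTaskInstance("task_a", on_failure_callback=alerts.append)
ti_failed = ToyTaskInstance("task_b", on_failure_callback=alerts.append)

ti_skipped.mark_skipped()                      # silent: nobody is notified
ti_failed.handle_failure("stopped externally")  # callback fires

print(ti_skipped.state, ti_failed.state)
print(alerts)
```

Only `task_b` produces an alert in this sketch, which is the point of the proposal: a task that is stopped through the failure path can still notify its owner.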
I believe the question is what it means when a dagrun times out. If a dagrun timeout means "I need everything to stop, including the task instances", then forcing task...