Task fails and cannot read logs. Invalid URL 'http://:8793/log/...': No host supplied
Apache Airflow version
Other Airflow 2 version (please specify below)
If "Other Airflow 2 version" selected, which one?
2.10.1
What happened?
I'm having an issue with an Airflow instance where a task fails and I cannot read the logs.
Logs:
*** Could not read served logs: Invalid URL 'http://:8793/log/dag_id=my_dag/run_id=dynamic__apple_3_my_dag_cb353081__2024-09-09T14:41:22.596199__f73c5571719e4f35bf195ded40e5e25b/task_id=cleanup_temporary_directory/attempt=1.log': No host supplied
Event logs:
Executor CeleryExecutor(parallelism=128) reported that the task instance <TaskInstance: my_dag.cleanup_temporary_directory dynamic__apple_3_my_dag_cb353081__2024-09-09T14:41:22.596199__f73c5571719e4f35bf195ded40e5e25b [queued]> finished with state failed, but the task instance's state attribute is queued. Learn more: https://airflow.apache.org/docs/apache-airflow/stable/troubleshooting.html#task-state-changed-externally
Additionally I checked the logs directory for the dag_id/run_id and it's missing the respective task_id folder.
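The empty host in that URL suggests the task instance row never recorded a `hostname`. A quick check against the metadata DB (just a sketch, run from an environment with the Airflow ORM configured) would be:

# Sketch: confirm whether the failed task instance ever recorded a hostname.
from airflow.models import TaskInstance
from airflow.utils.session import create_session

with create_session() as session:
    tis = (
        session.query(TaskInstance)
        .filter(
            TaskInstance.dag_id == "my_dag",
            TaskInstance.task_id == "cleanup_temporary_directory",
        )
        .all()
    )
    for ti in tis:
        # An empty hostname here is what produces 'http://:8793/...' with no host.
        print(ti.run_id, ti.try_number, ti.state, repr(ti.hostname))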
What you think should happen instead?
I should be able to access the logs.
How to reproduce
Not sure how to.
Operating System
Ubuntu 24.04 LTS
Versions of Apache Airflow Providers
No response
Deployment
Other Docker-based deployment
Deployment details
Deployed with docker-compose on Docker Swarm setup on 2 VMs.
Anything else?
Additionally I checked the logs directory for the dag_id/run_id and it's missing the respective task_id folder.
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Having the same issue with 2.10.1 in k8s, using the CeleryKubernetesExecutor.
Could this be related to the inheritance issue that was discussed in https://github.com/apache/airflow/issues/41891?
Additionally I checked the logs directory for the dag_id/run_id and it's missing the respective task_id folder.
Having the same issue on 2.10.0 through a podman-compose
We upgraded to 2.10.1 like @andrew-stein-sp and can reproduce the same behavior.
Got the same behavior since upgrading from 2.9.3 to 2.10.1. We are using the LocalExecutor.
I have the same issue with 2.10.0, using the CeleryExecutor. It worked before I upgraded from 2.9.0 to 2.10.0.
*** Could not read served logs: Invalid URL 'http://:8793/log/dag_id=service_stop/run_id=manual__2024-09-18T09:42:54+09:00/task_id=make_accountlist_task/attempt=1.log': No host supplied
Event log:
Executor CeleryExecutor(parallelism=6) reported that the task instance <TaskInstance: service_stop.make_accountlist_task manual__2024-09-18T09:42:54+09:00 [queued]> finished with state failed, but the task instance's state attribute is queued. Learn more: https://airflow.apache.org/docs/apache-airflow/stable/troubleshooting.html#task-state-changed-externally
The scheduler has an error log at the same time as the event log:
[2024-09-18T00:43:18.036+0000] {celery_executor.py:291} ERROR - Error sending Celery task: module 'redis' has no attribute 'client'
Celery Task ID: TaskInstanceKey(dag_id='service_stop', task_id='make_accountlist_task', run_id='manual__2024-09-18T09:42:54+09:00', try_number=1, map_index=-1)
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/celery/executors/celery_executor_utils.py", line 220, in send_task_to_executor
result = task_to_run.apply_async(args=[command], queue=queue)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/celery/app/task.py", line 594, in apply_async
return app.send_task(
^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/celery/app/base.py", line 797, in send_task
with self.producer_or_acquire(producer) as P:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/celery/app/base.py", line 932, in producer_or_acquire
producer, self.producer_pool.acquire, block=True,
^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/celery/app/base.py", line 1354, in producer_pool
return self.amqp.producer_pool
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/celery/app/amqp.py", line 591, in producer_pool
self.app.connection_for_write()]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/celery/app/base.py", line 829, in connection_for_write
return self._connection(url or self.conf.broker_write_url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/celery/app/base.py", line 880, in _connection
return self.amqp.Connection(
^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/kombu/connection.py", line 201, in __init__
if not get_transport_cls(transport).can_parse_url:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/kombu/transport/__init__.py", line 91, in get_transport_cls
_transport_cache[transport] = resolve_transport(transport)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/kombu/transport/__init__.py", line 76, in resolve_transport
return symbol_by_name(transport)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/kombu/utils/imports.py", line 59, in symbol_by_name
module = imp(module_name, package=package, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/importlib/__init__.py", line 90, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 995, in exec_module
File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
File "/home/airflow/.local/lib/python3.12/site-packages/kombu/transport/redis.py", line 282, in <module>
class PrefixedRedisPipeline(GlobalKeyPrefixMixin, redis.client.Pipeline):
^^^^^^^^^^^^
AttributeError: module 'redis' has no attribute 'client'
Same issue for us when upgrading to 2.10.2.
We’re encountering the same issue as well.
We switched from the Bitnami docker-compose to the official Apache docker-compose and got it running successfully :star_struck:
Try checking that the DAGs exist on the worker, scheduler, and webserver. I deploy Airflow in K8s and got this error when I put my DAGs only into the scheduler (expecting that they would replicate to the other pods), but when I checked the DAGs folder on the worker it was empty.
At the time (https://github.com/apache/airflow/issues/42136#issuecomment-2357283522), I used the `airflow db upgrade` command, but I realized it has been deprecated.
I retried the upgrade using the `airflow db migrate -n "2.10.2"` command, and it works for me now.
https://airflow.apache.org/docs/apache-airflow/2.10.0/installation/upgrading.html#offline-sql-migration-scripts
We encountered the same problem in Airflow 2.9.3. Here are the worker logs at the time of the error:
[2024-10-11 10:45:38,544: WARNING/ForkPoolWorker-16] Failed operation _store_result. Retrying 2 more times.
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1910, in _execute_context
self.dialect.do_execute(
File "/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
cursor.execute(statement, parameters)
psycopg2.OperationalError: could not receive data from server: Connection timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.12/site-packages/celery/backends/database/__init__.py", line 47, in _inner
return fun(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/celery/backends/database/__init__.py", line 117, in _store_result
task = list(session.query(self.task_cls).filter(self.task_cls.task_id == task_id))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/orm/query.py", line 2901, in __iter__
result = self._iter()
^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/orm/query.py", line 2916, in _iter
result = self.session.execute(
^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 1717, in execute
result = conn._execute_20(statement, params or {}, execution_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1710, in _execute_20
return meth(self, args_10style, kwargs_10style, execution_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/sql/elements.py", line 334, in _execute_on_connection
return connection._execute_clauseelement(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1577, in _execute_clauseelement
ret = self._execute_context(
^^^^^^^^^^^^^^^^^^^^^^
File "/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1953, in _execute_context
self._handle_dbapi_exception(
File "/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 2134, in _handle_dbapi_exception
util.raise_(
File "/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
raise exception
File "/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1910, in _execute_context
self.dialect.do_execute(
File "/home/airflow/.local/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) could not receive data from server: Connection timed out
[SQL: SELECT celery_taskmeta.id AS celery_taskmeta_id, celery_taskmeta.task_id AS celery_taskmeta_task_id, celery_taskmeta.status AS celery_taskmeta_status, celery_taskmeta.result AS celery_taskmeta_result, celery_taskmeta.date_done AS celery_taskmeta_date_done, celery_taskmeta.traceback AS celery_taskmeta_traceback
FROM celery_taskmeta
WHERE celery_taskmeta.task_id = %(task_id_1)s]
[parameters: {'task_id_1': '5d1bef21-fbf4-4feb-9f2c-a54c95b4d738'}]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
I can also note that increasing the sql_alchemy_pool_size parameter to 50 reduced the number of such errors, but did not eliminate them completely.
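(In case it helps others: that parameter lives in the `[database]` section as `sql_alchemy_pool_size`, and can also be set via the `AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_SIZE` environment variable.)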
The same issue in Airflow 2.10.2
TL;DR: look for invalid Python scripts on the malfunctioning worker. Try creating a DagBag on the worker and see what happens.
# Ensure the AIRFLOW_HOME points to the right location, then run on the worker
>>> from airflow.models import DagBag
>>> DagBag(include_examples=False)
I had this issue too; it turns out I had edited one of the files in vim, pasted some code, and it pasted tabs instead of spaces, so the file became an invalid Python script due to `TabError: inconsistent use of tabs and spaces in indentation`. After I fixed that, it all went back to normal.
Note that the problematic file doesn't have to be imported by the failing DAG/task. If I understand the issue correctly, a DagBag cannot be created if one of the DAG definition files or their imports isn't a valid python file. Then the issue manifests as DAGs supposedly not being found. In my case, the filesystem isn't shared between the scheduler and the malfunctioning celery worker, and the affected file was unmodified on the scheduler (or modified in a correct way) - so no "big red import error" was displayed in the webserver UI.
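A small extension of the snippet above (just a sketch along the same lines) that surfaces the parse errors directly instead of eyeballing the console output:

from airflow.models import DagBag

bag = DagBag(include_examples=False)
# import_errors maps each offending file path to its parse error (e.g. the TabError above)
for path, err in bag.import_errors.items():
    print(path)
    print(err)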
Hi @quack39 and all, I am getting the same error. I have deployed Airflow [2.9.3] in AKS, but when executing the DAGs I get the error below. I don't have any clue what needs to be updated. I am using Helm [1.15.0] for deployment with the "KubernetesExecutor".
Error
Could not read served logs: HTTPConnectionPool(host='test-dag-config-nlp8suol', port=8793): Max retries exceeded with url: /log/dag_id=test_dag/run_id=manual__2024-11-06T06:43:18.256272+00:00/task_id=config/attempt=1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8ef9eaecd0>: Failed to establish a new connection: [Errno -2] Name or service not known'))
I had the same problem when changing from the SequentialExecutor to the LocalExecutor. After some tests, I found I had to make parallelism equal to the CPU core count.
- With t3.large (2 CPU cores): parallelism = 32 (default) → NG; parallelism = 4 → most tasks NG, some OK; parallelism = 2 → all OK
- With t3.xlarge (4 CPU cores): parallelism = 4 → all OK
But is this the expected behavior? I'm not sure.
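(For anyone testing this: the setting in question is `parallelism` in the `[core]` section of airflow.cfg, also settable via the `AIRFLOW__CORE__PARALLELISM` environment variable.)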
Same issue for us when upgrading to 2.10.3, using k8s.
The same issue in Airflow 2.10.2 with KubernetesExecutor.
I was just able to get the following to work:
# Imports assumed from the surrounding DAG file
from datetime import timedelta

from airflow.operators.bash import BashOperator

task = BashOperator(
    task_id="bash_command",
    bash_command=bash_command,  # bash_command is defined elsewhere in the original DAG
    retries=2,
    retry_delay=timedelta(minutes=1),
    do_xcom_push=False,
    env={
        'PYTHONUNBUFFERED': '1',
        'PYTHONFAULTHANDLER': '1',  # Helps debug crashes
        'FORCE_COLOR': '1'  # Preserves color output in logs
    },
    cwd='/tmp',
    append_env=True
)
Initially it failed with the same error, then succeeded on retry, I believe because the log stream was still in 'create' and not yet available for the first attempt.
Same issue here, even if the task is successful
We have the same problem on 2.10.3. Please help :)
I ran into this issue accidentally when I defined a custom volume mount in my docker-compose.yml. At first, I defined the mount as part of the airflow-worker service, which apparently overrode the volume mounts imported from x-airflow-common, and that eventually led to exactly this issue. I resolved it by defining my custom mount in x-airflow-common instead.

That is how I ended up with this issue; there might of course be different, completely unrelated causes. But if you've encountered this and work with custom mounts in a Docker environment, double-check your docker-compose.yml and the imports done by the `<<: *` operator. Just a quick heads-up in the hope that it helps somebody.
The same issue here with 2.10.3 and CeleryExecutor. Worker log:
Nov 29 01:04:05 ubuntu-s-4vcpu-8gb-amd-fra1-01 airflow[2929312]: [2024-11-29T01:04:05.161+0000] {scheduler_job_runner.py:910} ERROR - Executor CeleryExecutor(parallelism=64) reported that the task instance <TaskInstance: my_dag.task_id scheduled__2024-11-29T00:30:00+00:00 [queued]> finished with state failed, but the task instance's state attribute is queued. Learn more: https://airflow.apache.org/docs/apache-airflow/stable/troubleshooting.html#task-state-changed-externally
Same issue deploying on Docker with the tutorial supplied on the Airflow website. Running test_dags.py as follows:

import datetime

import pendulum

from airflow.models.dag import DAG
from airflow.operators.empty import EmptyOperator

now = pendulum.now(tz="UTC")
now_to_the_hour = (now - datetime.timedelta(0, 0, 0, 0, 0, 3)).replace(minute=0, second=0, microsecond=0)
START_DATE = now_to_the_hour
DAG_NAME = "test_dag_v2"

dag = DAG(
    DAG_NAME,
    schedule="*/10 * * * *",
    default_args={"depends_on_past": True},
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
    catchup=False,
)

run_this_1 = EmptyOperator(task_id="run_this_1", dag=dag)
run_this_2 = EmptyOperator(task_id="run_this_2", dag=dag)
run_this_2.set_upstream(run_this_1)
run_this_3 = EmptyOperator(task_id="run_this_3", dag=dag)
run_this_3.set_upstream(run_this_2)
The DAG succeeds but I got the following log message:
*** Could not read served logs: Invalid URL 'http://:8793/log/dag_id=test_dag_v1/run_id=manual__2024-11-29T10:15:31.097211+00:00/task_id=run_this_1/attempt=1.log': No host supplied
I checked my logs folder and I do have the log of test_dag_v2 being written.
Had the same issue when migrating to 2.10.3. I found this post very useful for troubleshooting: https://github.com/apache/airflow/discussions/32234
From my perspective, this also occurs with Dummy and EmptyOperators, as they don't have any output to log. I think there was some discussion on GitHub about this behaviour.
Hope it helps :-)
I got the same issue on 2.10.3.
Can someone add `?keepalives=1&keepalives_idle=30&keepalives_interval=10&keepalives_count=5` to the database connection string and test the behavior?
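For reference, with the default PostgreSQL metadata database that would look something like `sql_alchemy_conn = postgresql+psycopg2://<user>:<password>@<db-host>/airflow?keepalives=1&keepalives_idle=30&keepalives_interval=10&keepalives_count=5` in the `[database]` section (user, password, and host here are placeholders).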
> add ?keepalives=1&keepalives_idle=30&keepalives_interval=10&keepalives_count=5 to the database connection string and test the behavior?
Tried that now and still got the same issue; I also downgraded to 2.9.3.
Having the same issue on 2.10.2
We are facing the same issue on 2.10.4