Celery worker throws an exception
Self Checks
- [X] This is only for bug report, if you would like to ask a question, please head to Discussions.
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [X] [FOR CHINESE USERS] Please make sure to submit issues in English, otherwise they will be closed. Thank you! :)
- [X] Please do not modify this template :) and fill in all the required fields.
Dify version
0.7.2
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
INFO:matplotlib.font_manager:generated new fontManager
2024-08-27 15:20:28,976.976 WARNING [MainThread] [ext_mail.py:51] - MAIL_TYPE is not set
Usage: celery worker [OPTIONS]
Try 'celery worker --help' for help.
Error: Invalid value for '-l' / '--loglevel': '-Q' is not one of 'DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL', 'FATAL'.
What is the reason for this error?
✔️ Expected Behavior
1. Use the dataset API to PUT a segment
2. Data size: 100
3. Data format: JSON
❌ Actual Behavior
1. The dataset PUT request fails
2. The Docker api worker container restarts automatically
No response
The Celery worker is throwing an exception because the `-Q` option is being parsed as the value of `--loglevel`. The `-Q` option and its queue list should be placed before the `--loglevel` option.
The correct command should be:
```shell
poetry run python -m celery -A app.celery worker -P gevent -c 1 -Q dataset,generation,mail,ops_trace,app_deletion --loglevel INFO
```
Additionally, according to the Dify documentation, the correct command for starting the Worker service on Linux or macOS is:
```shell
celery -A app.celery worker -P gevent -c 1 --loglevel INFO -Q dataset,generation,mail,ops_trace
```
For Windows systems, the command is:
```shell
celery -A app.celery worker -P solo --without-gossip --without-mingle -Q dataset,generation,mail,ops_trace --loglevel INFO
```
This ensures that the -Q option is correctly interpreted as specifying the queues and not as an invalid log level [1][2].
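For reference, this error message typically means `--loglevel` received no value at all. A minimal shell sketch (the command is simplified, not the exact Dify entrypoint) of how an empty, unquoted variable makes `-Q` land in the position Celery reads as the log level:

```shell
#!/bin/sh
# Simulate the worker's command line when LOG_LEVEL is unset or empty.
LOG_LEVEL=""

# Unquoted, an empty variable expands to nothing, so the word that ends
# up right after --loglevel in the final argv is "-Q" -- exactly the
# token Celery then rejects as an invalid log level.
set -- celery -A app.celery worker --loglevel ${LOG_LEVEL} -Q dataset

prev=""
for arg in "$@"; do
  # Report which word --loglevel actually receives as its value.
  if [ "$prev" = "--loglevel" ]; then
    echo "value seen by --loglevel: $arg"
  fi
  prev="$arg"
done
# prints: value seen by --loglevel: -Q
```

This is why the failure appears only in environments where the log-level variable is missing, even though the command template itself looks correct.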
api-1 | Preparing database migration...
worker-1 | Preparing database migration...
api-1 | Start database migration.
worker-1 | Database migration skipped
api-1 | INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
api-1 | INFO [alembic.runtime.migration] Will assume transactional DDL.
api-1 | Database migration successful!
api-1 | None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
worker-1 | None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
api-1 | [2024-08-27 07:31:50 +0000] [1] [INFO] Starting gunicorn 22.0.0
api-1 | [2024-08-27 07:31:50 +0000] [1] [INFO] Listening at: http://0.0.0.0:5001 (1)
api-1 | [2024-08-27 07:31:50 +0000] [1] [INFO] Using worker: gevent
api-1 | [2024-08-27 07:31:51 +0000] [46] [INFO] Booting worker with pid: 46
worker-1 | /app/api/.venv/lib/python3.10/site-packages/celery/platforms.py:829: SecurityWarning: You're running the worker with superuser privileges: this is
worker-1 | absolutely not recommended!
worker-1 |
worker-1 | Please specify a different user using the --uid option.
worker-1 |
worker-1 | User information: uid=0 euid=0 gid=0 egid=0
worker-1 |
worker-1 | warnings.warn(SecurityWarning(ROOT_DISCOURAGED.format(
worker-1 |
worker-1 | -------------- celery@b0a23f957e77 v5.3.6 (emerald-rush)
worker-1 | --- ***** -----
worker-1 | -- ******* ---- Linux-6.10.0-linuxkit-aarch64-with-glibc2.39 2024-08-27 07:31:51
worker-1 | - *** --- * ---
worker-1 | - ** ---------- [config]
worker-1 | - ** ---------- .> app: app:0xffff61e289d0
worker-1 | - ** ---------- .> transport: redis://:**@redis:6379/1
worker-1 | - ** ---------- .> results: postgresql://postgres:**@db:5432/dify
worker-1 | - *** --- * --- .> concurrency: 1 (gevent)
worker-1 | -- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
worker-1 | --- ***** -----
worker-1 | -------------- [queues]
worker-1 | .> app_deletion exchange=app_deletion(direct) key=app_deletion
worker-1 | .> dataset exchange=dataset(direct) key=dataset
worker-1 | .> generation exchange=generation(direct) key=generation
worker-1 | .> mail exchange=mail(direct) key=mail
worker-1 | .> ops_trace exchange=ops_trace(direct) key=ops_trace
worker-1 |
worker-1 | [tasks]
worker-1 | . schedule.clean_embedding_cache_task.clean_embedding_cache_task
worker-1 | . schedule.clean_unused_datasets_task.clean_unused_datasets_task
worker-1 | . tasks.add_document_to_index_task.add_document_to_index_task
worker-1 | . tasks.annotation.add_annotation_to_index_task.add_annotation_to_index_task
worker-1 | . tasks.annotation.batch_import_annotations_task.batch_import_annotations_task
worker-1 | . tasks.annotation.delete_annotation_index_task.delete_annotation_index_task
worker-1 | . tasks.annotation.disable_annotation_reply_task.disable_annotation_reply_task
worker-1 | . tasks.annotation.enable_annotation_reply_task.enable_annotation_reply_task
worker-1 | . tasks.annotation.update_annotation_to_index_task.update_annotation_to_index_task
worker-1 | . tasks.batch_create_segment_to_index_task.batch_create_segment_to_index_task
worker-1 | . tasks.clean_dataset_task.clean_dataset_task
worker-1 | . tasks.clean_document_task.clean_document_task
worker-1 | . tasks.clean_notion_document_task.clean_notion_document_task
worker-1 | . tasks.deal_dataset_vector_index_task.deal_dataset_vector_index_task
worker-1 | . tasks.delete_segment_from_index_task.delete_segment_from_index_task
worker-1 | . tasks.disable_segment_from_index_task.disable_segment_from_index_task
worker-1 | . tasks.document_indexing_sync_task.document_indexing_sync_task
worker-1 | . tasks.document_indexing_task.document_indexing_task
worker-1 | . tasks.document_indexing_update_task.document_indexing_update_task
worker-1 | . tasks.duplicate_document_indexing_task.duplicate_document_indexing_task
worker-1 | . tasks.enable_segment_to_index_task.enable_segment_to_index_task
worker-1 | . tasks.mail_invite_member_task.send_invite_member_mail_task
worker-1 | . tasks.mail_reset_password_task.send_reset_password_mail_task
worker-1 | . tasks.ops_trace_task.process_trace_tasks
worker-1 | . tasks.recover_document_indexing_task.recover_document_indexing_task
worker-1 | . tasks.remove_app_and_related_data_task.remove_app_and_related_data_task
worker-1 | . tasks.remove_document_from_index_task.remove_document_from_index_task
worker-1 | . tasks.retry_document_indexing_task.retry_document_indexing_task
worker-1 | . tasks.sync_website_document_indexing_task.sync_website_document_indexing_task
worker-1 |
worker-1 | [2024-08-27 07:31:51,303: INFO/MainProcess] Connected to redis://:**@redis:6379/1
worker-1 | [2024-08-27 07:31:51,306: INFO/MainProcess] mingle: searching for neighbors
worker-1 | [2024-08-27 07:31:52,325: INFO/MainProcess] mingle: all alone
worker-1 | [2024-08-27 07:31:52,341: INFO/MainProcess] pidbox: Connected to redis://:**@redis:6379/1.
worker-1 | [2024-08-27 07:31:52,344: INFO/MainProcess] celery@b0a23f957e77 ready.
worker-1 | [2024-08-27 07:31:52,363: INFO/MainProcess] Task tasks.clean_dataset_task.clean_dataset_task[8957e5d3-e074-4dd2-ad6a-f8a9cf1f4bf2] received
worker-1 | [2024-08-27 07:31:52,365: INFO/MainProcess] Start clean dataset when dataset deleted: 2b96bfa4-d934-4cb0-8752-9d470086d79a
worker-1 | [2024-08-27 07:31:52,430: INFO/MainProcess] Cleaning documents for dataset: 2b96bfa4-d934-4cb0-8752-9d470086d79a
worker-1 | [2024-08-27 07:31:52,444: INFO/MainProcess] Task tasks.clean_dataset_task.clean_dataset_task[ef8e70ef-06d3-455c-a1eb-dd7fb35ab57a] received
I checked out the main branch and it seems to work for me.
Running in Docker with the env MODE set to worker, the container always fails with "Back-off restarting failed container".
Are you using windows?
> Are you using windows?

Linux; it's a custom image built from source with the Dockerfile.
> Are you using windows?

The issue happens on 0.7.2; it works normally after changing back to 0.7.1.
I see now: the entrypoint.sh script passes `${LOG_LEVEL}` without a default value, so when the LOG_LEVEL env variable is not set in Docker, the command errors out.
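A common fix for that pattern is POSIX default-value parameter expansion. A minimal sketch (the variable name matches the thread; the surrounding command fragment is illustrative, not the actual entrypoint.sh):

```shell
#!/bin/sh
unset LOG_LEVEL   # simulate a container started without -e LOG_LEVEL

# ${LOG_LEVEL:-INFO} falls back to INFO when the variable is unset or
# empty, so --loglevel always receives a valid value.
echo "--loglevel ${LOG_LEVEL:-INFO}"   # prints: --loglevel INFO

LOG_LEVEL=DEBUG
echo "--loglevel ${LOG_LEVEL:-INFO}"   # prints: --loglevel DEBUG
```

With this expansion the worker starts with a sane log level whether or not the environment variable is provided.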