Celery worker throws exception

Open kenneth-bro opened this issue 1 year ago • 10 comments

Self Checks

  • [X] This is only for bug report, if you would like to ask a question, please head to Discussions.
  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • [X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.7.2

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

INFO:matplotlib.font_manager:generated new fontManager
2024-08-27 15:20:28,976.976 WARNING [MainThread] [ext_mail.py:51] - MAIL_TYPE is not set
Usage: celery worker [OPTIONS]
Try 'celery worker --help' for help.

Error: Invalid value for '-l' / '--loglevel': '-Q' is not one of 'DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL', 'FATAL'.

What is the cause of this error?

✔️ Expected Behavior

1. Use the dataset API to put segments.
2. Data size: 100
3. Data format: JSON

❌ Actual Behavior

1. The dataset put request fails.
2. The Docker api-worker container restarts automatically.

No response

kenneth-bro avatar Aug 27 '24 07:08 kenneth-bro

The Celery worker is throwing an exception because the -Q option is being used incorrectly in the command. The -Q option should be placed before the --loglevel option.

The correct command should be:

poetry run python -m celery -A app.celery worker -P gevent -c 1 -Q dataset,generation,mail,ops_trace,app_deletion --loglevel INFO

Additionally, according to the Dify documentation, the correct command for starting the Worker service on Linux or macOS is:

celery -A app.celery worker -P gevent -c 1 --loglevel INFO -Q dataset,generation,mail,ops_trace

For Windows systems, the command is:

celery -A app.celery worker -P solo --without-gossip --without-mingle -Q dataset,generation,mail,ops_trace --loglevel INFO

This ensures that the -Q option is correctly interpreted as specifying the queues and not as an invalid log level [1][2].

To continue talking to Dosu, mention @dosu.

dosubot[bot] avatar Aug 27 '24 07:08 dosubot[bot]

api-1         | Preparing database migration...
worker-1      | Preparing database migration...
api-1         | Start database migration.
worker-1      | Database migration skipped
api-1         | INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
api-1         | INFO  [alembic.runtime.migration] Will assume transactional DDL.
api-1         | Database migration successful!
api-1         | None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
worker-1      | None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
api-1         | [2024-08-27 07:31:50 +0000] [1] [INFO] Starting gunicorn 22.0.0
api-1         | [2024-08-27 07:31:50 +0000] [1] [INFO] Listening at: http://0.0.0.0:5001 (1)
api-1         | [2024-08-27 07:31:50 +0000] [1] [INFO] Using worker: gevent
api-1         | [2024-08-27 07:31:51 +0000] [46] [INFO] Booting worker with pid: 46
worker-1      | /app/api/.venv/lib/python3.10/site-packages/celery/platforms.py:829: SecurityWarning: You're running the worker with superuser privileges: this is
worker-1      | absolutely not recommended!
worker-1      | 
worker-1      | Please specify a different user using the --uid option.
worker-1      | 
worker-1      | User information: uid=0 euid=0 gid=0 egid=0
worker-1      | 
worker-1      |   warnings.warn(SecurityWarning(ROOT_DISCOURAGED.format(
worker-1      |  
worker-1      |  -------------- celery@b0a23f957e77 v5.3.6 (emerald-rush)
worker-1      | --- ***** ----- 
worker-1      | -- ******* ---- Linux-6.10.0-linuxkit-aarch64-with-glibc2.39 2024-08-27 07:31:51
worker-1      | - *** --- * --- 
worker-1      | - ** ---------- [config]
worker-1      | - ** ---------- .> app:         app:0xffff61e289d0
worker-1      | - ** ---------- .> transport:   redis://:**@redis:6379/1
worker-1      | - ** ---------- .> results:     postgresql://postgres:**@db:5432/dify
worker-1      | - *** --- * --- .> concurrency: 1 (gevent)
worker-1      | -- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
worker-1      | --- ***** ----- 
worker-1      |  -------------- [queues]
worker-1      |                 .> app_deletion     exchange=app_deletion(direct) key=app_deletion
worker-1      |                 .> dataset          exchange=dataset(direct) key=dataset
worker-1      |                 .> generation       exchange=generation(direct) key=generation
worker-1      |                 .> mail             exchange=mail(direct) key=mail
worker-1      |                 .> ops_trace        exchange=ops_trace(direct) key=ops_trace
worker-1      | 
worker-1      | [tasks]
worker-1      |   . schedule.clean_embedding_cache_task.clean_embedding_cache_task
worker-1      |   . schedule.clean_unused_datasets_task.clean_unused_datasets_task
worker-1      |   . tasks.add_document_to_index_task.add_document_to_index_task
worker-1      |   . tasks.annotation.add_annotation_to_index_task.add_annotation_to_index_task
worker-1      |   . tasks.annotation.batch_import_annotations_task.batch_import_annotations_task
worker-1      |   . tasks.annotation.delete_annotation_index_task.delete_annotation_index_task
worker-1      |   . tasks.annotation.disable_annotation_reply_task.disable_annotation_reply_task
worker-1      |   . tasks.annotation.enable_annotation_reply_task.enable_annotation_reply_task
worker-1      |   . tasks.annotation.update_annotation_to_index_task.update_annotation_to_index_task
worker-1      |   . tasks.batch_create_segment_to_index_task.batch_create_segment_to_index_task
worker-1      |   . tasks.clean_dataset_task.clean_dataset_task
worker-1      |   . tasks.clean_document_task.clean_document_task
worker-1      |   . tasks.clean_notion_document_task.clean_notion_document_task
worker-1      |   . tasks.deal_dataset_vector_index_task.deal_dataset_vector_index_task
worker-1      |   . tasks.delete_segment_from_index_task.delete_segment_from_index_task
worker-1      |   . tasks.disable_segment_from_index_task.disable_segment_from_index_task
worker-1      |   . tasks.document_indexing_sync_task.document_indexing_sync_task
worker-1      |   . tasks.document_indexing_task.document_indexing_task
worker-1      |   . tasks.document_indexing_update_task.document_indexing_update_task
worker-1      |   . tasks.duplicate_document_indexing_task.duplicate_document_indexing_task
worker-1      |   . tasks.enable_segment_to_index_task.enable_segment_to_index_task
worker-1      |   . tasks.mail_invite_member_task.send_invite_member_mail_task
worker-1      |   . tasks.mail_reset_password_task.send_reset_password_mail_task
worker-1      |   . tasks.ops_trace_task.process_trace_tasks
worker-1      |   . tasks.recover_document_indexing_task.recover_document_indexing_task
worker-1      |   . tasks.remove_app_and_related_data_task.remove_app_and_related_data_task
worker-1      |   . tasks.remove_document_from_index_task.remove_document_from_index_task
worker-1      |   . tasks.retry_document_indexing_task.retry_document_indexing_task
worker-1      |   . tasks.sync_website_document_indexing_task.sync_website_document_indexing_task
worker-1      | 
worker-1      | [2024-08-27 07:31:51,303: INFO/MainProcess] Connected to redis://:**@redis:6379/1
worker-1      | [2024-08-27 07:31:51,306: INFO/MainProcess] mingle: searching for neighbors
worker-1      | [2024-08-27 07:31:52,325: INFO/MainProcess] mingle: all alone
worker-1      | [2024-08-27 07:31:52,341: INFO/MainProcess] pidbox: Connected to redis://:**@redis:6379/1.
worker-1      | [2024-08-27 07:31:52,344: INFO/MainProcess] celery@b0a23f957e77 ready.
worker-1      | [2024-08-27 07:31:52,363: INFO/MainProcess] Task tasks.clean_dataset_task.clean_dataset_task[8957e5d3-e074-4dd2-ad6a-f8a9cf1f4bf2] received
worker-1      | [2024-08-27 07:31:52,365: INFO/MainProcess] Start clean dataset when dataset deleted: 2b96bfa4-d934-4cb0-8752-9d470086d79a
worker-1      | [2024-08-27 07:31:52,430: INFO/MainProcess] Cleaning documents for dataset: 2b96bfa4-d934-4cb0-8752-9d470086d79a
worker-1      | [2024-08-27 07:31:52,444: INFO/MainProcess] Task tasks.clean_dataset_task.clean_dataset_task[ef8e70ef-06d3-455c-a1eb-dd7fb35ab57a] received

I checked out the main branch, and it seems to work for me.

crazywoola avatar Aug 27 '24 07:08 crazywoola

I checked out the main branch, and it seems to work for me.

I run it in Docker with the env mode set to worker, and the container always ends up in "Back-off restarting failed container".

kenneth-bro avatar Aug 27 '24 08:08 kenneth-bro

Are you using Windows?

crazywoola avatar Aug 27 '24 08:08 crazywoola

Are you using Windows?

Linux. It's a custom image built from the source code and the Dockerfile.

kenneth-bro avatar Aug 27 '24 08:08 kenneth-bro

Are you using Windows?

The problem happens on 0.7.2. After switching back to 0.7.1, it works normally.

kenneth-bro avatar Aug 27 '24 08:08 kenneth-bro

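I found the cause: the entrypoint.sh script now passes ${LOG_LEVEL} to the worker command without a default value, so when the Docker container has no LOG_LEVEL environment variable, the worker fails with this error.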
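
A minimal sketch of this failure mode, not the actual contents of entrypoint.sh: the function names build_args_unsafe and build_args_safe are hypothetical, and the argument list is copied from the commands quoted earlier in this thread. It only builds and prints the argument list, so it can be run without Celery installed. When LOG_LEVEL is unset, the unquoted expansion disappears and -Q becomes the value of --loglevel, which matches the reported error. Supplying a default with ${LOG_LEVEL:-INFO} is one possible way to avoid it.

```shell
#!/bin/sh
# Hypothetical helpers that only assemble and print the worker arguments.

# Unquoted ${LOG_LEVEL} with no default: if the variable is unset or empty,
# the word vanishes and "-Q" ends up right after "--loglevel".
build_args_unsafe() {
    set -- celery -A app.celery worker -P gevent -c 1 \
        --loglevel ${LOG_LEVEL} -Q dataset,generation,mail,ops_trace,app_deletion
    echo "$*"
}

# Same arguments, but falling back to INFO when LOG_LEVEL is unset or empty.
build_args_safe() {
    set -- celery -A app.celery worker -P gevent -c 1 \
        --loglevel "${LOG_LEVEL:-INFO}" -Q dataset,generation,mail,ops_trace,app_deletion
    echo "$*"
}

unset LOG_LEVEL
build_args_unsafe   # "--loglevel" is followed directly by "-Q"
build_args_safe     # "--loglevel INFO" comes before "-Q"
```

Until the script has such a default, setting LOG_LEVEL explicitly in the container environment (for example LOG_LEVEL=INFO) should have the same effect.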

kenneth-bro avatar Aug 27 '24 09:08 kenneth-bro