
[Bug] Documents stuck in "waiting" status when RAG Pipeline is enabled (runtime_mode = 'rag_pipeline')

Open JackyX1996 opened this issue 6 months ago • 3 comments

Self Checks

  • [x] I have read the Contributing Guide and Language Policy.
  • [x] This is only for bug report, if you would like to ask a question, please head to Discussions.
  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report, otherwise it will be closed.
  • [x] [Chinese & non-English users] Please submit in English, otherwise the issue will be closed :)
  • [x] Please do not modify this template :) and fill in all the required fields.

Dify version

1.9.2

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

Steps

  1. Create a new knowledge base
  2. Enable Pipeline / Retrieval Settings for the knowledge base
  3. Upload any document to this knowledge base
  4. Document gets stuck in "Queuing" status forever

Root Cause

File: /app/api/services/dataset_service.py, line ~559:

```python
if dataset.runtime_mode != "rag_pipeline":
    document_indexing_task.delay(dataset_id, [document.id])
```

When the pipeline is enabled, runtime_mode is set to 'rag_pipeline', so the condition evaluates to False and the indexing task is never triggered.
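The guard can be reduced to a one-line predicate, which makes the failure mode obvious (a minimal sketch; `should_dispatch_indexing` is a hypothetical name, not an actual Dify function):

```python
def should_dispatch_indexing(runtime_mode: str) -> bool:
    # Mirrors the guard at dataset_service.py ~559: the indexing task
    # is queued only when the dataset is NOT in pipeline mode.
    return runtime_mode != "rag_pipeline"

print(should_dispatch_indexing("general"))       # True  -> task queued
print(should_dispatch_indexing("rag_pipeline"))  # False -> document stays "waiting"
```

Since no other code path enqueues the task for pipeline datasets, the False branch is a dead end for the document.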

Database Evidence

All waiting documents belong to datasets with runtime_mode = 'rag_pipeline':

```sql
SELECT d.runtime_mode, doc.indexing_status, COUNT(*)
FROM datasets d
JOIN documents doc ON d.id = doc.dataset_id
WHERE doc.indexing_status = 'waiting'
GROUP BY d.runtime_mode, doc.indexing_status;
```

Result: every stuck document's dataset has runtime_mode = 'rag_pipeline'. Successfully indexed documents either have runtime_mode = 'general' or were uploaded before the pipeline was enabled.
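The diagnostic query can be reproduced against a toy schema to show what the grouping returns (a self-contained sketch using SQLite; the table layouts are simplified assumptions, not Dify's actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE datasets  (id TEXT PRIMARY KEY, runtime_mode TEXT);
    CREATE TABLE documents (id TEXT, dataset_id TEXT, indexing_status TEXT);
    INSERT INTO datasets  VALUES ('d1', 'rag_pipeline'), ('d2', 'general');
    INSERT INTO documents VALUES ('a', 'd1', 'waiting'),
                                 ('b', 'd1', 'waiting'),
                                 ('c', 'd2', 'completed');
""")
# Same query as in the report: count waiting documents per runtime_mode.
rows = conn.execute("""
    SELECT d.runtime_mode, doc.indexing_status, COUNT(*)
    FROM datasets d
    JOIN documents doc ON d.id = doc.dataset_id
    WHERE doc.indexing_status = 'waiting'
    GROUP BY d.runtime_mode, doc.indexing_status
""").fetchall()
print(rows)  # [('rag_pipeline', 'waiting', 2)]
```

Only pipeline-mode datasets show up in the waiting group, matching the evidence above.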

✔️ Expected Behavior

Documents should be indexed automatically after upload, regardless of whether Pipeline is enabled or not.

❌ Actual Behavior

  • Documents remain in "Queuing/Waiting" status forever
  • No indexing task is sent to Celery worker (verified by checking Redis queues and worker logs)
  • Documents cannot be used for retrieval
  • No error message shown to user

Verification

The worker itself is functioning correctly; manually triggering the task processes documents successfully:

```python
document_indexing_task.apply_async(
    args=[dataset_id, [document_id]],
    queue='dataset'
)
```

This works, which confirms the problem is that the task is never dispatched in the first place.

Suggested Fix

Remove the conditional check at line ~559:

```python
# Change from:
if dataset.runtime_mode != "rag_pipeline":
    document_indexing_task.delay(dataset_id, [document.id])

# To:
document_indexing_task.delay(dataset_id, [document.id])
```

Alternatively, implement proper pipeline-specific indexing logic if different handling is intended.
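If pipeline datasets are meant to take a different indexing path, the dispatch could branch on mode but always fall back to the general task, so no document is silently dropped (a hypothetical sketch; the helper name and task callables are assumptions, not Dify's API):

```python
def dispatch_indexing(runtime_mode, dataset_id, document_ids,
                      general_task, pipeline_task=None):
    # Hypothetical mode-aware dispatcher: pipeline datasets use a
    # pipeline-specific task when one exists, otherwise fall back to
    # the general task so documents never stay queued forever.
    if runtime_mode == "rag_pipeline" and pipeline_task is not None:
        pipeline_task(dataset_id, document_ids)
    else:
        general_task(dataset_id, document_ids)

# Usage sketch: with no pipeline task registered, the fallback fires.
calls = []
dispatch_indexing("rag_pipeline", "ds-1", ["doc-1"],
                  general_task=lambda d, ids: calls.append(("general", d, ids)))
print(calls)  # [('general', 'ds-1', ['doc-1'])]
```

The key property is that every branch enqueues *something*; the current code has a branch that enqueues nothing.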

Workaround

Manually trigger indexing via a Python script:

```python
from app_factory import create_app

app = create_app()
with app.app_context():
    from tasks.document_indexing_task import document_indexing_task
    from models.dataset import Document
    from extensions.ext_database import db

    waiting_docs = db.session.query(Document).filter(
        Document.indexing_status == 'waiting'
    ).all()

    for doc in waiting_docs:
        document_indexing_task.apply_async(
            args=[str(doc.dataset_id), [str(doc.id)]],
            queue='dataset'
        )
```

Impact

  • Severity: High - the Pipeline feature is completely unusable
  • Affected: all users who enable Pipeline/Retrieval Settings
  • Workaround: requires database access

JackyX1996 · Oct 27 '25 10:10

Hi @JackyX1996! I'm Dosu and I’m helping the dify team.

This bug is caused by a conditional check that prevents document indexing tasks from being triggered when runtime_mode is set to 'rag_pipeline', leaving documents stuck in "waiting" status and unusable for retrieval. There is no alternative indexing logic for pipeline mode in the affected code section, so removing the check will allow standard indexing to trigger and resolve the issue; just be mindful of possible side effects if pipeline-specific logic is added elsewhere in the future. As a workaround, manually triggering the indexing task via Python works.

A recent merged pull request has addressed a related queuing issue by updating Celery worker queue configurations to include priority_pipeline, which should help resolve stuck tasks in newer versions. For persistent session errors in pipeline indexing, upgrading to Dify 2.0.0 beta or newer is recommended, as it includes fixes for session management and pipeline document processing. If you cannot upgrade, ensure you only pass primary key IDs between pipeline steps and always re-query ORM objects inside a new session context.
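The queue-configuration fix mentioned above matters because Celery workers only consume queues they are explicitly told about; a task routed to an unconsumed queue waits forever. A minimal illustration of the idea (the queue names are assumptions based on the comment, not Dify's verified config):

```python
# Worker side: celery's -Q/--queues flag must list every queue that
# tasks get routed to, e.g. (hypothetical invocation):
#   celery -A app.celery worker -Q dataset,priority_pipeline
# If 'priority_pipeline' is absent from -Q, tasks routed there are
# never picked up -- which looks identical to this bug from outside.
consumed_queues = {"dataset"}          # what the worker listens on
routed_queue = "priority_pipeline"     # where the task was sent
print(routed_queue in consumed_queues)  # False -> task never consumed
```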

For more details, see:

To reply, just mention @dosu.



dosubot[bot] · Oct 27 '25 10:10

It seems this issue has been resolved in the pull request at https://github.com/langgenius/dify/pull/27416/files#diff-c7b1c23e00a1fd0378f6ea41a7dfe1c9e043d766f738dba26860c783b82e8c6a

water-in-stone · Nov 09 '25 10:11

> It seems this issue has been resolved in the pull request at https://github.com/langgenius/dify/pull/27416/files#diff-c7b1c23e00a1fd0378f6ea41a7dfe1c9e043d766f738dba26860c783b82e8c6a

What is the release cadence for Docker images? I pulled an image and deployed a service today, but found that this PR hasn't been included in the Docker image yet.

huixiangyang · Nov 13 '25 02:11