An error occurred while deleting the dataset
Self Checks
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [X] Please do not modify this template :) and fill in all the required fields.
Dify version
0.5.8
Cloud or Self Hosted
Self Hosted (Source)
Steps to reproduce
An error occurred while deleting the dataset:
The dataset does not have a doc_form field:
✔️ Expected Behavior
No error occurs.
❌ Actual Behavior
error
Hey @wangrg, great to see you back! Hope everything's been going well since our last chat.
Based on the information you've provided, it seems like there are two potential issues at play here.
The first issue is the ValueError that's being raised when trying to instantiate an object from the IndexProcessorFactory class. This error is likely due to the index_type not being specified or being specified incorrectly. You can refer to the IndexProcessorFactory class for more information on how to properly specify the index_type.
The second issue seems to be a typo in your code. The string 'dataset.done.fom' should likely be 'dataset.done.from'. This could be causing the clean_dataset function to fail, as it might be trying to access a non-existent attribute of the dataset object.
To resolve these issues, you should first ensure that the index_type is being correctly specified when instantiating an object from the IndexProcessorFactory class. Then, correct the typo in your code and try running the clean_dataset_test.py file again.
If you continue to experience issues, please provide more detailed information about your setup and the steps you're taking so that I can better assist you.
This has been fixed by the PR https://github.com/langgenius/dify/pull/2789. Thanks for your feedback, @wangrg.
In addition, in previous versions the documents in a dataset could have different doc_form types (qa_model, text_model). Relying on the doc_form field value of the first document in the dataset is therefore inaccurate.
The use of doc_form in the following source files is probably inaccurate:
- api/tasks/clean_notion_document_task.py
- api/tasks/add_document_to_index_task.py
- api/tasks/create_segment_to_index_task.py
- api/tasks/clean_dataset_task.py
- api/schedule/clean_unused_datasets_task.py
- api/core/indexing_runner.py -> batch_add_segments
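To illustrate the doc_form point, here is a minimal sketch of handling each document by its own doc_form instead of taking the value from the first document. The `Document` class and `group_by_doc_form` helper are hypothetical stand-ins written for this example, not the actual Dify models or task code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical stand-in for a dataset document row (not the Dify model)."""
    id: str
    doc_form: str  # e.g. "text_model" or "qa_model"


def group_by_doc_form(documents):
    """Group document ids by each document's own doc_form.

    This avoids assuming that the first document's doc_form applies
    to every document in the dataset.
    """
    groups = defaultdict(list)
    for doc in documents:
        groups[doc.doc_form].append(doc.id)
    return dict(groups)


docs = [
    Document("d1", "text_model"),
    Document("d2", "qa_model"),
    Document("d3", "text_model"),
]
grouped = group_by_doc_form(docs)
```

With a grouping like this, each group could be passed to the index processor that matches its doc_form, so a dataset that mixes qa_model and text_model documents would be handled consistently.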
This has been fixed by the PR #2789. Thanks for your feedback, @wangrg.
The check for whether the document list is empty runs too early. If the task returns at that point, the data in the DatasetProcessRule, DatasetQuery, and AppDatasetJoin tables may not be deleted.
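A rough sketch of the ordering I have in mind, assuming an in-memory dict of tables in place of the real database session. The `clean_dataset` function and the row layout here are hypothetical and only show the control flow, not the actual task in api/tasks/clean_dataset_task.py.

```python
DATASET_LEVEL_TABLES = ("DatasetProcessRule", "DatasetQuery", "AppDatasetJoin")


def clean_dataset(dataset_id, tables):
    """Delete all rows that belong to dataset_id.

    `tables` is a hypothetical in-memory store: table name -> list of row dicts,
    where every row carries a "dataset_id" key.
    """
    documents = [r for r in tables["Document"] if r["dataset_id"] == dataset_id]

    # Document-level cleanup is only needed when documents exist,
    # but an empty list must not end the whole task.
    if documents:
        tables["Document"] = [
            r for r in tables["Document"] if r["dataset_id"] != dataset_id
        ]

    # Dataset-level cleanup always runs, even for a dataset without documents.
    for name in DATASET_LEVEL_TABLES:
        tables[name] = [r for r in tables[name] if r["dataset_id"] != dataset_id]

    return tables


tables = {
    "Document": [],  # the dataset being deleted has no documents
    "DatasetProcessRule": [{"dataset_id": "ds1"}],
    "DatasetQuery": [{"dataset_id": "ds1"}, {"dataset_id": "ds2"}],
    "AppDatasetJoin": [{"dataset_id": "ds1"}],
}
clean_dataset("ds1", tables)
```

In this sketch the empty-list check guards only the document-level work, so the rows in the three dataset-level tables are removed even when the dataset has no documents.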
Hi, @wangrg,
I'm helping the team manage their backlog and am marking this issue as stale. The issue "An error occurred while deleting the dataset" was reported by you. It seems that the problem has been resolved by pull request (PR) #2789. Additionally, you mentioned that the documents in the previous version of the data set supported different doc_form types, and the use of doc_form in several files in the source code should be inaccurate.
Could you please confirm if this issue is still relevant to the latest version of the repository? If it is, please let the team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.
Thank you for your understanding and contribution to the project.
I, Dosu
Thank you for your question. We have optimized the logic here: https://github.com/langgenius/dify/pull/3354 @wangrg