Can you introduce the process of vector search in the project?
Self Checks
- [X] This is only for bug reports; if you would like to ask a question, please head to Discussions.
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [X] Please do not modify this template :) and fill in all the required fields.
Dify version
0.5.9
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
I vectorized (embedded) the same file and then searched for a piece of text. When my own project (rust+qdrant+openai) connects to the Qdrant vector database used by the Dify project and searches it directly, the scores are very low, but when I run the recall test through the Dify front end, the scores are very high. Why? My own project embeds the text with OpenAI and then searches Qdrant directly, with no processing in between. Could you describe the flow of the recall test in the Dify project? Thank you.
✔️ Expected Behavior
No response
❌ Actual Behavior
No response
The process of the recall test in the Dify project involves configuring the Knowledge Base Retrieval Node. This process includes three main steps:
- Selecting the Query Variable: The query variable typically represents the user's input question. In the "Start" node of conversational applications, the system pre-sets "sys.query" as the user input variable, which is used to query the knowledge base for relevant text segments.
- Choosing the Knowledge Base for Query: Within the knowledge base retrieval node, you can add an existing knowledge base from Dify. Instructions for creating a knowledge base are available in the Dify documentation.
- Configuring the Retrieval Strategy: This involves modifying the indexing strategy and retrieval mode for the knowledge base. Dify offers two recall strategies:
  - N-choose-1 Recall: Queries are executed through function calling, requiring a system reasoning model.
  - Multi-way Recall: A Rerank model is configured for result re-ranking.
For detailed explanations of these settings and strategies, refer to the Dify help documentation [1][2].
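The two recall strategies described above can be illustrated with a minimal Python sketch. This is not Dify's implementation: `Segment`, `Retriever`, `n_choose_1`, `multi_way`, `overlap_rerank`, and `make_retriever` are hypothetical names, the `choose` callback merely stands in for the model's function call, and the word-overlap scorer stands in for a real Rerank model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Segment:
    text: str
    score: float  # score attached to the segment at its current stage


# Hypothetical type: a retriever maps (query, top_k) to scored segments.
Retriever = Callable[[str, int], List[Segment]]


def n_choose_1(query: str, retrievers: Dict[str, Retriever],
               choose: Callable[[str, List[str]], str], top_k: int = 3) -> List[Segment]:
    """Pick ONE knowledge base, then query only that one.

    `choose` is an assumption standing in for the reasoning model's function call.
    """
    name = choose(query, list(retrievers))
    return retrievers[name](query, top_k)


def multi_way(query: str, retrievers: Dict[str, Retriever],
              rerank: Callable[[str, str], float], top_k: int = 3) -> List[Segment]:
    """Query ALL knowledge bases, merge the candidates, and re-score them."""
    candidates: List[Segment] = []
    for retrieve in retrievers.values():
        candidates.extend(retrieve(query, top_k))
    # The rerank score replaces the vector-store score in this sketch.
    rescored = [Segment(seg.text, rerank(query, seg.text)) for seg in candidates]
    rescored.sort(key=lambda s: s.score, reverse=True)
    return rescored[:top_k]


def overlap_rerank(query: str, text: str) -> float:
    """Toy reranker (assumption): fraction of query words found in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def make_retriever(segments: List[Segment]) -> Retriever:
    """Toy retriever (assumption): returns its fixed segments, best score first."""
    def retrieve(query: str, top_k: int) -> List[Segment]:
        return sorted(segments, key=lambda s: s.score, reverse=True)[:top_k]
    return retrieve


retrievers = {
    "docs": make_retriever([Segment("qdrant stores vectors", 0.42),
                            Segment("dify recall test", 0.40)]),
    "faq": make_retriever([Segment("how to run a recall test", 0.35)]),
}

one = n_choose_1("recall test", retrievers, choose=lambda q, names: names[0], top_k=1)
many = multi_way("recall test", retrievers, overlap_rerank, top_k=2)
```

In this sketch the final scores in `many` come from the rerank step rather than from the vector store, so they are not comparable to raw similarity scores from a direct vector search. Whether that accounts for the score difference reported in this issue is not confirmed here.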