About Retrieval Setting Top K problem

Open quaggalinux opened this issue 10 months ago • 3 comments

Self Checks

  • [x] This is only for bug report, if you would like to ask a question, please head to Discussions.
  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [x] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • [x] Please do not modify this template :) and fill in all the required fields.

Dify version

1.1.3

Cloud or Self Hosted

Self Hosted (Source)

Steps to reproduce

I have deployed the TEI rerank model bge-reranker-v2-m3. While loading a new PDF file into a new knowledge base, I enabled the rerank model "bge-reranker-v2-m3" in the "Vector Search" section and set top-k to 10. The Chunk Settings above are set to "Parent-child", and the Index Method is "High Quality".

✔️ Expected Behavior

After the file has finished processing and shows "available", the "Chunking settings" of that file should show a top-k value of 10.

❌ Actual Behavior

After the file has finished processing and shows "available", the top-k value shown in the "Chunking settings" is only 2.

quaggalinux avatar Apr 01 '25 17:04 quaggalinux

I am not sure whether this automatic adjustment of the top-k value is an intended Dify feature or the result of a mistake in my settings.

quaggalinux avatar Apr 01 '25 17:04 quaggalinux

The auto-adjustment of the top-k value in the "Chunking settings" of Dify is an intended feature. The top-k value automatically adjusts based on the chosen model's context window to optimize the retrieval process by selecting the most relevant text chunks according to the model's capabilities [1][2].


dosubot[bot] avatar Apr 01 '25 17:04 dosubot[bot]

I believe I have found the root cause of this issue. The steps above have nothing to do with the top-k value changing; renaming the Knowledge Base does. After the rename, all top-k settings are reset to a "default number". To change it back, I had to click the Knowledge Base name -> Settings and set the top-k value again under Retrieval Setting. I have not found any other settings that revert to defaults after a Knowledge Base rename.
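
To make the suspected failure mode concrete, here is a minimal sketch of how a rename could reset retrieval settings. It is not Dify's actual code: `KnowledgeBase`, `RetrievalSetting`, `rename_resetting`, `rename_preserving`, and the default of 2 are all hypothetical names and values chosen to mirror the behavior described above.

```python
from dataclasses import dataclass, field, replace

# Hypothetical default, chosen only because the report observed top-k = 2.
DEFAULT_TOP_K = 2


@dataclass(frozen=True)
class RetrievalSetting:
    top_k: int = DEFAULT_TOP_K
    reranking_enabled: bool = False


@dataclass(frozen=True)
class KnowledgeBase:
    name: str
    retrieval: RetrievalSetting = field(default_factory=RetrievalSetting)


def rename_resetting(kb: KnowledgeBase, new_name: str) -> KnowledgeBase:
    # Assumed buggy pattern: the record is rebuilt from the submitted
    # fields only, so every omitted field falls back to its default.
    return KnowledgeBase(name=new_name)


def rename_preserving(kb: KnowledgeBase, new_name: str) -> KnowledgeBase:
    # Partial update: only the name changes, all other fields are kept.
    return replace(kb, name=new_name)


kb = KnowledgeBase("manuals", RetrievalSetting(top_k=10, reranking_enabled=True))
reset = rename_resetting(kb, "manuals-v2")
kept = rename_preserving(kb, "manuals-v2")
print(reset.retrieval.top_k, kept.retrieval.top_k)
```

If the rename path behaves like `rename_resetting`, a top-k of 10 would silently fall back to the default, which matches what I see; a partial update like `rename_preserving` would keep it.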

quaggalinux avatar Apr 01 '25 18:04 quaggalinux

Hi, @quaggalinux. I'm Dosu, and I'm helping the Dify team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • You reported an issue with the top-k value setting for a rerank model in Dify version 1.1.3.
  • The top-k value set to 10 was automatically adjusted to 2, which is an intended feature for optimization.
  • Changing the Knowledge Base name resets the top-k value to a default setting.
  • This reset can be manually corrected in the Retrieval Settings.

Next Steps:

  • Please confirm if this issue is still relevant to the latest version of the Dify repository. If so, you can keep the discussion open by commenting on the issue.
  • Otherwise, this issue will be automatically closed in 15 days.

Thank you for your understanding and contribution!

dosubot[bot] avatar May 02 '25 16:05 dosubot[bot]

I have upgraded to 1.3.1, but the issue persists.

quaggalinux avatar May 02 '25 17:05 quaggalinux

Hi, @quaggalinux. I'm Dosu, and I'm helping the Dify team manage their backlog and am marking this issue as stale.

Issue Summary:

  • You reported that setting top-k to 10 for the bge-reranker-v2-m3 model does not persist after processing a new PDF, reverting to 2.
  • The auto-adjustment of top-k based on the model's context window is an intended feature.
  • Changing the Knowledge Base name resets top-k to a default value, requiring manual correction in Retrieval Settings.
  • The issue remains unresolved and was last noted as still present in version 1.3.1.

Next Steps:

  • Please confirm if this issue is still relevant with the latest version of Dify by commenting below.
  • If no further updates are provided, I will automatically close this issue in 15 days.

Thank you for your understanding and contribution!

dosubot[bot] avatar Aug 27 '25 16:08 dosubot[bot]