Can we use Reciprocal rank fusion (RRF) as an alternative to Rerank models to save costs?

Open jamie0725 opened this issue 1 year ago • 0 comments

Self Checks

  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • [X] Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell me about your story.

Hi, currently, if we want to use multi-path retrieval or retrieve from multiple knowledge bases, we have to configure a Rerank API. This not only increases costs but also raises security concerns, because we need to send our data to API vendors.

I can see the value of Rerank models, but I see them more as an advanced setting than a necessity.

For example, can we add a feature to support RRF in the reranking stage?

score = 0.0
for q in queries:
    if d in result(q):
        score += 1.0 / ( k + rank( result(q), d ) )
return score

# where
# k is a ranking constant
# q is a query in the set of queries
# d is a document in the result set of q
# result(q) is the result set of q
# rank( result(q), d ) is d's rank within the result(q) starting from 1
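
To make the pseudocode above concrete, here is a minimal Python sketch of how the fusion step could look. It is only an illustration, not Dify's implementation: the function name `rrf_fuse`, the document IDs, and the default `k = 60` (a commonly used value for the ranking constant) are all assumptions of mine. It also assumes each result list contains a given document at most once.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def rrf_fuse(
    result_lists: Sequence[Sequence[Hashable]], k: int = 60
) -> List[Tuple[Hashable, float]]:
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    `rrf_fuse` is a hypothetical helper name; `k` is the ranking constant.
    Each inner list is one retrieval path's results, best match first.
    """
    scores: Dict[Hashable, float] = defaultdict(float)
    for results in result_lists:
        # Ranks start at 1, matching the pseudocode above.
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID so the
    # output order is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], str(item[0])))


# Hypothetical results from two retrieval paths (e.g. keyword and vector search).
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

for doc_id, score in rrf_fuse([keyword_hits, vector_hits]):
    print(f"{doc_id}: {score:.6f}")
```

Since this only needs the rank positions from each retrieval path, no data has to leave the deployment and no extra model call is required, which is the cost and security benefit I am after.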

2. Additional context or comments

No response

3. Can you help us with this feature?

  • [ ] I am interested in contributing to this feature.

jamie0725, Jul 11 '24 02:07