FastGPT: I would like the Rerank model API to be aligned with xinference

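Routine checks

[x] I have confirmed that there is no similar feature request at present
[x] I have confirmed that I have upgraded to the latest version
[x] I have read the project README in full and determined that the current version cannot meet my needs
[x] I understand and am willing to follow up on this feature, help with testing, and provide feedback
[x] I understand and accept the above, and I understand that the maintainers have limited time; feature requests that do not follow the rules may be ignored or closed directly

Feature description

The Rerank model's API input and output are currently defined as follows: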

export type PostReRankProps = {
  query: string;
  inputs: { id: string; text: string }[];
};
export type PostReRankResponse = { id: string; score?: number }[];

xinference also provides a Rerank interface, but its input and output are defined as:

// Input
{
  "model": "bge-reranker",
  "query": "一次测试",
  "documents": [
    "测试一下", "测试两下"
  ],
  "return_documents": false
}
// Output
{
  "id": "ede8b4cc-add2-11ee-948c-0242ac110008",
  "results": [
    {
      "index": 0,
      "relevance_score": 0.9969996809959412,
      "document": null
    },
    {
      "index": 1,
      "relevance_score": 0.6141782402992249,
      "document": null
    }
  ]
}

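The input and output parameters defined by xinference look more complete. I would like FastGPT's rerank model API to be aligned with xinference, and I am willing to submit a PR for this.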
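
To make the difference between the two formats concrete, here is a minimal sketch of how one could be mapped onto the other. The FastGPT types are copied from the issue text; the xinference shapes are only inferred from the sample request and response above, and the type and function names (`XinferenceReRankRequest`, `XinferenceReRankResponse`, `toXinferenceRequest`, `fromXinferenceResponse`) are assumptions made for illustration, not part of either project.

```typescript
// FastGPT's current rerank types, as quoted in the issue.
type PostReRankProps = {
  query: string;
  inputs: { id: string; text: string }[];
};
type PostReRankResponse = { id: string; score?: number }[];

// Shapes inferred from the xinference sample payloads (assumed, not official).
type XinferenceReRankRequest = {
  model: string;
  query: string;
  documents: string[];
  return_documents: boolean;
};
type XinferenceReRankResponse = {
  id: string;
  results: { index: number; relevance_score: number; document: unknown }[];
};

// Build an xinference-style request body from FastGPT-style props.
function toXinferenceRequest(props: PostReRankProps, model: string): XinferenceReRankRequest {
  return {
    model,
    query: props.query,
    documents: props.inputs.map((input) => input.text),
    return_documents: false
  };
}

// Map an xinference-style response back to FastGPT's shape.
// `index` refers to the position of the document in the request.
function fromXinferenceResponse(
  inputs: PostReRankProps['inputs'],
  res: XinferenceReRankResponse
): PostReRankResponse {
  return res.results
    .filter((r) => r.index >= 0 && r.index < inputs.length) // skip out-of-range indexes
    .map((r) => ({ id: inputs[r.index].id, score: r.relevance_score }));
}
```

The key point is that xinference identifies documents by their position in `documents`, so the caller has to keep the original `inputs` array around to recover each `id`.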

Application scenario

Private deployment

Related examples

Jan 08 '24 03:01 zhanghx0905

Jan 08 '24 03:01 c121914yu

Unlike GPT, xinference is not going to become a standard specification, so there is no need to align with it.

Jan 08 '24 03:01 c121914yu

Thanks for the reply. I have written my own patch that makes the rerank call compatible with the xinference API format.

// Note: `instance`, `global.reRankModels`, and `XinferenceReRankResponse` come
// from the surrounding project code and are not shown in this thread.
export function reRankRecall({ query, inputs }: PostReRankProps) {
  const model = global.reRankModels[0];

  if (!model || !model?.requestUrl) {
    return Promise.reject('no rerank model');
  }

  // Convert inputs to an array containing only the text
  const documents = inputs.map((input) => input.text);

  const start = Date.now();
  return instance
    .post(
      model.requestUrl,
      {
        model: 'bge-reranker', // Model name (hard-coded here)
        query,
        documents,
        return_documents: false
      },
      { timeout: 120000 }
    )
    .then((res) => res.data)
    .then((data: XinferenceReRankResponse) => {
      console.log('rerank time:', Date.now() - start);

      // Convert the response to match PostReRankResponse
      return data.results.map((result) => ({
        id: inputs[result.index].id, // Map index back to the id of the original input
        score: result.relevance_score
      }));
    })
    .catch((err) => {
      // On any error, log it and return an empty result
      console.log(err);
      return [];
    });
}

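I have now run into another problem: the scores after reranking are generally very low. If I set a filter threshold based on the rerank score, no results are returned at all. My knowledge base is a set of crawled intranet news articles, and I have not done any question completion.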

Jan 08 '24 08:01 zhanghx0905
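
The filtering problem described above can be illustrated with a small sketch. It shows why an absolute score threshold returns nothing when every rerank score is low, and one possible workaround: falling back to the top few results when nothing passes. This is only an illustration under stated assumptions; the `Scored` type, the `filterWithFallback` function, and its parameters are hypothetical and are not taken from FastGPT.

```typescript
// One reranked item; mirrors the element shape of PostReRankResponse.
type Scored = { id: string; score?: number };

// Keep items scoring at least `minScore`. If none qualify, fall back to the
// `fallbackK` highest-scoring items instead of returning an empty list.
// Items without a score are treated as scoring 0 (an assumption).
function filterWithFallback(items: Scored[], minScore: number, fallbackK: number): Scored[] {
  // Sort a copy by descending score so the input array is left untouched.
  const sorted = [...items].sort((a, b) => (b.score ?? 0) - (a.score ?? 0));
  const passed = sorted.filter((item) => (item.score ?? 0) >= minScore);
  return passed.length > 0 ? passed : sorted.slice(0, fallbackK);
}
```

With scores such as 0.02, 0.05, and 0.01, a threshold of 0.5 on its own would drop every item; with the fallback, the best `fallbackK` items are still returned in ranked order.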

Jan 08 '24 08:01 c121914yu

Yes, reranking gives very low scores to content whose sentences are incomplete.

Jan 08 '24 08:01 c121914yu

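Also, the rerank module's response time is over 1 s. I am a bit worried that it will become a system bottleneck as the number of users grows.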

Jan 08 '24 09:01 zhanghx0905

Jan 08 '24 09:01 c121914yu

I run bge on a 3090, and the average response time is 300 ms.

Jan 10 '24 07:01 c121914yu
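
The two latency figures mentioned in this thread (over 1 s and 300 ms) can be turned into a rough capacity estimate. The sketch below assumes the reranker processes a fixed number of requests at a time with constant latency, and ignores batching and queueing effects; the function name `maxRequestsPerSecond` is hypothetical.

```typescript
// Rough upper bound on rerank requests per second for a service that handles
// `concurrency` requests at a time, each taking `latencyMs` on average.
function maxRequestsPerSecond(latencyMs: number, concurrency = 1): number {
  if (latencyMs <= 0 || concurrency <= 0) {
    throw new RangeError('latencyMs and concurrency must be positive');
  }
  return (concurrency * 1000) / latencyMs;
}

// A single serial worker at the two latencies mentioned in the thread.
const atOneSecond = maxRequestsPerSecond(1000);
const atThreeHundredMs = maxRequestsPerSecond(300);
console.log(atOneSecond, atThreeHundredMs.toFixed(2));
```

Under these assumptions, a single serial worker handles 1 request per second at 1000 ms and roughly 3.3 per second at 300 ms, so whether reranking becomes a bottleneck depends on both the hardware and the expected number of concurrent users.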