The language setting for segmenting in Question & Answer format doesn't work

Open libukai opened this issue 1 year ago • 2 comments

Self Checks

  • [X] This is only for bug reports; if you would like to ask a question, please head to Discussions.
  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • [X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.9.2

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

[Screenshot: CleanShot 2024-10-18 at 15 47 19@2x]

Even when the language is set to Chinese Simplified, the resulting index segments are in English.

[Screenshot: CleanShot 2024-10-18 at 15 49 46@2x]

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

libukai, Oct 18 '24 07:10

There are similar issues and discussions related to the language setting in the Question & Answer segmentation format not working as expected:

  1. An open issue titled "Q&A format choose Chinese. Shows 0 paragraphs. Other languages work normally." suggests that the problem might be caused by the _format_split_text and format_split_text functions. If the input text contains special characters like "*", these functions may return an empty list, resulting in an empty qa_preview. Additionally, if the output is not in the expected format, the qa_preview will also be empty.

  2. A closed issue titled "Adopt Q&A segmentation mode, segmentation use option: simplified Chinese, or a large number of English segments will be generated." indicates that testing with OpenAI GPT-3.5 worked fine. It was suggested to emphasize the use of language above the prompt and to check out the latest code for improvements.
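The empty-preview behavior described in the first item can be illustrated with a minimal sketch. It assumes the model is asked to reply with `Q1: ... A1: ...` pairs and that the reply is parsed with a regular expression; the names `QA_PATTERN` and `parse_qa_pairs` are hypothetical and this is not Dify's actual `_format_split_text` implementation.

```python
import re

# Hypothetical pattern (an assumption, not taken from the Dify source):
# expects "Q<n>: question" followed by "A<n>: answer" in the model output.
QA_PATTERN = re.compile(r"Q\d+:\s*(.*?)\s*A\d+:\s*([\s\S]*?)(?=Q\d+:|$)")


def parse_qa_pairs(text: str) -> list[dict[str, str]]:
    """Return question/answer pairs, or an empty list if nothing matches."""
    pairs = []
    for question, answer in QA_PATTERN.findall(text):
        question, answer = question.strip(), answer.strip()
        # Skip pairs where either side is empty after trimming.
        if question and answer:
            pairs.append({"question": question, "answer": answer})
    return pairs


well_formed = "Q1: What is Dify?\nA1: An LLM app platform.\nQ2: Is it open source?\nA2: Yes."
free_form = "Here are some questions and answers about the text."

print(parse_qa_pairs(well_formed))  # two pairs
print(parse_qa_pairs(free_form))    # empty list, so a preview built from it would be empty
```

Under this assumption, any reply that does not follow the expected labels, for example one that uses translated labels instead of `Q1:` and `A1:`, yields no pairs at all.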

These references might provide insights or solutions to the issue you're experiencing with the language setting in the segmentation format [1][2].
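The suggestion from the closed issue, stating the output language at the top of the prompt, could look roughly like the sketch below. The function name `build_qa_prompt` and the prompt wording are assumptions made for illustration; they are not the prompt Dify ships.

```python
def build_qa_prompt(language: str) -> str:
    """Build a hypothetical Q&A generation prompt with the language stated first."""
    return (
        # The language requirement comes first so it is not buried in the instructions.
        f"IMPORTANT: Write every question and answer in {language}, "
        "regardless of the language of the source text.\n\n"
        "Read the text provided by the user and generate question-answer pairs "
        "that cover its key points.\n"
        "Use exactly this format:\n"
        "Q1: <question>\nA1: <answer>\nQ2: <question>\nA2: <answer>\n"
    )


print(build_qa_prompt("Chinese Simplified"))
```

Whether this ordering actually changes the output language depends on the model, as the follow-up comments in this thread suggest.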


dosubot[bot], Oct 18 '24 07:10

None of those issues provides useful information. I changed the model and it still generated English segments.

Also, the language setting automatically reverts to English after the process finishes.

libukai, Oct 18 '24 10:10

I'm having the same problem.

rudnypc, Nov 05 '24 18:11

Hi, @libukai. I'm Dosu, and I'm helping the Dify team manage their backlog and am marking this issue as stale.

Issue Summary

  • Bug in Dify version 0.9.2: Language setting for segmenting defaults to English instead of Chinese Simplified.
  • Occurs in a self-hosted Docker environment.
  • I referenced similar issues, but solutions were not applicable.
  • Another user, "rudnypc," confirmed the same issue.

Next Steps

  • Please confirm if this issue is still relevant to the latest version of the Dify repository by commenting here.
  • If there is no further activity, this issue will be automatically closed in 15 days.

Thank you for your understanding and contribution!

dosubot[bot], Dec 06 '24 16:12