[Bug] A text file contains 135 prescriptions, separated from each other by one or two blank lines. When it is imported with advanced segmentation set to split on blank lines, the first two prescriptions are segmented correctly, but the ones after them are not.

Open 35plus opened this issue 11 months ago • 2 comments

Contact Information

No response

MaxKB Version

v.1.10.2

Problem Description

中医名方135个经典处方.txt

The file is attached above. Below is a screenshot taken during import:

Image

Steps to Reproduce

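  1. Import the attachment from the problem description.
  2. Select advanced segmentation, split on blank lines, and increase the segment length.
  3. Click Preview to see the problem.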

The expected correct result

No response

Related log output


Additional Information

No response

35plus avatar Mar 14 '25 02:03 35plus

Thanks for the feedback. We will verify this first.

baixin513 avatar Mar 15 '25 02:03 baixin513

After investigating, we found that the entries in this document are separated by multiple consecutive blank lines, while our default blank-line separator matches only a single blank line. In this case you need to write your own regular expression to match them.

Image

zyyfit avatar Apr 03 '25 06:04 zyyfit
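
For anyone hitting the same problem, a custom pattern of the kind suggested above might look like the sketch below. It is plain Python using the standard `re` module, not MaxKB's own code; the helper name `split_on_blank_lines` and the sample text are made up for illustration, and how MaxKB applies a user-supplied pattern internally is an assumption you should verify in the preview.

```python
import re

# Matches a run of one or more blank (or whitespace-only) lines,
# with either \n or \r\n line endings.
BLANK_LINES = re.compile(r"(?:\r?\n[ \t]*)+\r?\n")


def split_on_blank_lines(text):
    """Split text into segments separated by one or more blank lines.

    Hypothetical helper for illustration only; not part of MaxKB.
    """
    return [seg.strip() for seg in BLANK_LINES.split(text) if seg.strip()]


# Made-up sample: entries separated by one, two, and several blank lines.
sample = (
    "Prescription A\nline 2\n\n"
    "Prescription B\n\n\n"
    "Prescription C\n \n\n\n"
    "Prescription D"
)

segments = split_on_blank_lines(sample)
for number, segment in enumerate(segments, start=1):
    print(number, repr(segment))
```

The pattern string itself, `(?:\r?\n[ \t]*)+\r?\n`, is what you would try in the segmentation rule field. Because it consumes the whole run of blank lines, entries separated by one blank line and entries separated by two are treated the same way.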