drunkpig

Results: 91 comments by drunkpig

Too many files changed.

@Jamly7 After saving the path of the equation screenshot, you should modify the functions `pipe_mk_markdown` and `pipe_mk_uni_format` in `magic_pdf/pipe/XXPipe.py`.

@kevinwei1975 We are preparing a video tutorial; it should be ready in about 3 weeks.

@beiluo We will test this solution.

@yiyibooks We deliberately removed the superscript citations from the paper because we felt they hurt readability.

@yiyibooks We cannot extract information such as text color and bold formatting from scanned PDFs, but we can obtain this information from text-based PDFs. This work deviates somewhat from our...

Sharing a single model instance across multiple processes is basically out of the question; it is hard to control.
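A common alternative to sharing one instance is to give each worker process its own private copy of the model. The sketch below is only an illustration under stated assumptions: `DummyModel`, `init_worker`, `predict`, and `run_pool` are hypothetical names, not part of `magic_pdf`, and the stand-in model does no real inference.

```python
import multiprocessing as mp
import os


class DummyModel:
    """Hypothetical stand-in for an expensive model (not a magic_pdf class)."""

    def __init__(self):
        # Record which process built this instance.
        self.owner_pid = os.getpid()

    def predict(self, text):
        return f"{text}:{self.owner_pid}"


_model = None  # one instance per worker process, set by the initializer


def init_worker():
    """Runs once in each worker process: build that worker's private model."""
    global _model
    _model = DummyModel()


def predict(text):
    """Runs in a worker and uses only that worker's own model instance."""
    return _model.predict(text)


def run_pool(texts, workers=2):
    # Each worker calls init_worker() once, so nothing is shared across processes.
    with mp.Pool(processes=workers, initializer=init_worker) as pool:
        return pool.map(predict, texts)


if __name__ == "__main__":
    print(run_pool(["page1", "page2", "page3"]))
```

The trade-off is memory: every worker pays the full cost of loading the model, in exchange for not having to coordinate access to shared state.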

An HTML table is a good choice @shudct

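@WyHy The document types the layout model currently handles well are, roughly, Chinese and English academic papers, science and technology magazines, Chinese financial reports, and formally published Chinese and English books. The medical checkup report you provided looks quite different in style from the data used most heavily in our model training, so the results are poor. Would you be willing to provide a batch of such data so we can improve the model in this area?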

![a5b9aafe868c4165f180fb4d64ad6b4](https://github.com/user-attachments/assets/0c8fc80d-7c30-4e63-9653-08a118147922) My WeChat @WyHy