
Conversion logs 'Ignore Line "<image>" due to overlap', and table text from the PDF is missing in the resulting docx file

Open xxentropy opened this issue 3 years ago • 4 comments

Hello, when I convert a PDF to a docx file, I get the following output:

mupdf: expected object number
[INFO] Start to convert 581529124_40000_5783.pdf
[INFO] [1/4] Opening document...
[INFO] [2/4] Analyzing document...
[WARNING] Ignore hidden text checking due to UnicodeDecodeError in upstream library.
[WARNING] Ignore Line "" due to overlap
[WARNING] Ignore Line "" due to overlap
[WARNING] Ignore Line "" due to overlap
[WARNING] Ignore Line "" due to overlap
[WARNING] Ignore Line "" due to overlap
[WARNING] Ignore Line "" due to overlap
[WARNING] Ignore Line "" due to overlap
[WARNING] Ignore Line "" due to overlap
[WARNING] Ignore Line "" due to overlap
[WARNING] Ignore Line "" due to overlap
[WARNING] Ignore Line "" due to overlap
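When reporting a case like this, it can help to state how many lines were skipped. The following is only a minimal sketch, assuming the log entries are shaped like the ones above (a `[LEVEL]` tag followed by a message); the `summarize_log` helper is hypothetical and not part of pdf2docx.

```python
import re
from collections import Counter

# Assumption: each entry starts with "[INFO]", "[WARNING]" or "[ERROR]" and
# runs until the next such tag (or the end of the text). This also handles
# logs that were pasted onto a single line.
LOG_ENTRY = re.compile(
    r"\[(INFO|WARNING|ERROR)\]\s*(.*?)(?=\s*\[(?:INFO|WARNING|ERROR)\]|$)",
    re.S,
)


def summarize_log(text):
    """Count identical (level, message) pairs in a captured log."""
    counts = Counter()
    for level, message in LOG_ENTRY.findall(text):
        counts[(level, message.strip())] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "[INFO] [2/4] Analyzing document... "
        '[WARNING] Ignore Line "" due to overlap '
        '[WARNING] Ignore Line "" due to overlap'
    )
    for (level, message), n in summarize_log(sample).most_common():
        print(f"{n:3d}  [{level}] {message}")
```

Text before the first tag (such as the `mupdf:` line) is ignored by this sketch.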

Some of the text in the tables is missing from the converted docx file. The original PDF: image. After conversion: image.

How can I fix this?

xxentropy avatar Dec 09 '22 08:12 xxentropy

Sorry for taking so long to get to this. Could you share the PDF page shown in the screenshot? Thanks.

dothinking avatar Apr 05 '23 07:04 dothinking

581550066_50001_2252.pdf Hello, I can no longer find the original PDF, but this one has a similar problem. Thank you very much.

xxentropy avatar Apr 14 '23 06:04 xxentropy

Has this been resolved? I have the same problem.

Nancis1130 avatar Jun 20 '23 09:06 Nancis1130

I ran into the same problem. Converting with this website did not produce the issue. image

MiRaIOMeZaSu avatar Nov 26 '23 12:11 MiRaIOMeZaSu