NOEXIST comments


Some discussion on multi-column layout / layout analysis

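> > Hello developers, I am the author of [Umi-OCR](https://github.com/hiroi-sora/Umi-OCR).
> >
> > Umi-OCR is an open-source OCR program, and I am currently developing support for recognizing scanned PDFs. One difficulty is that the order of the text blocks returned by OCR often does not match the actual reading order, especially in documents with a multi-column layout. I need to tell the columns apart correctly based on the document's layout and sort the text blocks in actual reading order.
> >
> > pdf2docx also includes some rule-based layout parsing. I skimmed part of the code, and it gave me some ideas.
> >
> > In the end I designed a new algorithm: [GapTree_Sort (gap tree sorting)](https://github.com/hiroi-sora/GapTree_Sort_Algorithm). It looks for the gaps between text blocks, cuts the page into vertical regions, and builds a layout tree. A pre-order traversal of the layout tree then yields a text order that matches human reading habits.
> >
> > Of course, besides sorting text blocks, the layout tree can also be used to analyze more layout information. (However, it was not designed for PDF and does not take into account tags or other information attached to the block objects themselves.)
> >
> > pdf2docx's current rule matching supports at most 2 columns, and the column widths cannot differ too much.
> >
> > Whereas GapTree_Sort...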
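
The gap idea in the quoted comment can be illustrated with a much simpler sketch. This is not the GapTree_Sort implementation: it builds no layout tree and assumes every block belongs to exactly one column (no full-width headings). The names `Box`, `split_columns`, `reading_order`, and the `min_gap` threshold are hypothetical, chosen only for this example. Blocks are grouped into columns wherever an empty vertical strip separates them, then read column by column from top to bottom.

```python
from typing import List, Tuple

# Hypothetical block type: (x0, y0, x1, y1), with y growing downwards.
Box = Tuple[float, float, float, float]


def split_columns(boxes: List[Box], min_gap: float = 5.0) -> List[List[Box]]:
    """Group boxes into columns separated by empty vertical strips."""
    columns: List[List[Box]] = []
    right_edge = None  # rightmost x1 seen in the current column
    for box in sorted(boxes, key=lambda b: b[0]):
        if right_edge is None or box[0] - right_edge >= min_gap:
            # A wide enough horizontal gap starts a new column.
            columns.append([box])
            right_edge = box[2]
        else:
            columns[-1].append(box)
            right_edge = max(right_edge, box[2])
    return columns


def reading_order(boxes: List[Box], min_gap: float = 5.0) -> List[Box]:
    """Return boxes column by column (left to right), top to bottom."""
    ordered: List[Box] = []
    for column in split_columns(boxes, min_gap):
        ordered.extend(sorted(column, key=lambda b: b[1]))
    return ordered


if __name__ == "__main__":
    blocks = [
        (300, 10, 500, 30),  # right column, first line
        (10, 50, 200, 70),   # left column, second line
        (10, 10, 200, 30),   # left column, first line
        (300, 50, 500, 70),  # right column, second line
    ]
    for block in reading_order(blocks):
        print(block)
```

A real page usually mixes full-width and multi-column regions, which is the case the layout tree described in the comment is meant to handle and this flat sketch is not.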

An image that cannot be processed crashes the whole program

> ![Screenshot 2023-11-17 11.47.09 AM](https://private-user-images.githubusercontent.com/44791928/283665911-be0a35b6-dee1-46fe-ae2c-2b5e99f81b96.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTEiLCJleHAiOjE3MDE2NzQwMDAsIm5iZiI6MTcwMTY3MzcwMCwicGF0aCI6Ii80NDc5MTkyOC8yODM2NjU5MTEtYmUwYTM1YjYtZGVlMS00NmZlLWFlMmMtMmI1ZTk5ZjgxYjk2LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFJV05KWUFYNENTVkVINTNBJTJGMjAyMzEyMDQlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjMxMjA0VDA3MDgyMFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWE0NmQyMGU4MWQ4OWRkZjViMGRlN2VkNTVmYjVkOTFlZjI1ODMyMmRjYTI1MDFiZTg1ZmE2NzEyMjJhYmE0MjYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.6d-R02jmpj-d4cLDphiUCYAbrvAe6SGRkiJ1H5eGNeo)

As a temporary workaround, you can catch the exception for now. I found that `pil_tobytes` can be used, but I am not sure whether it is a complete fix. ![image](https://github.com/dothinking/pdf2docx/assets/37769332/ed8b3911-ee5d-4de1-ac3e-8e4fb2509958)

Hello author, after converting the PDF to Word the image is still flipped

#298 Try the code from this PR and manually patch the corresponding library code.

win32 version have some bug?

> Sorry for being late, you can simply ignore those messages as they're just warnings in the x64 version, it's working for me tho :/

The warning is another matter. This test...

win32 version have some bug?

> Uh I tried with the [latest version](https://github.com/sudo-nautilus/FFmpeg-Builds-Win32/releases/download/latest/ffmpeg-master-latest-win32-gpl.zip),
>
> 2022-07-18.11-05-40.mp4

I don't know what happened. I tried the new [version](https://github.com/sudo-nautilus/FFmpeg-Builds-Win32/releases/download/latest/ffmpeg-master-latest-win32-gpl.zip). The virtual machine test I tried to use win11...

win32 version have some bug?

> Uhh I too don't have any idea what's happening, anyway how did you manage to install a 32-bit win11 vm

I used a 64-bit system for testing. Maybe it's...

win32 version have some bug?

> > Uhh I too don't have any idea what's happening, anyway how did you manage to install a 32-bit win11 vm
>
> fwiw, there is no 32-bit Windows...

win32 version have some bug?

> > > Uhh I too don't have any idea what's happening, anyway how did you manage to install a 32-bit win11 vm
> >
> > fwiw,...

scrcpy not working in Huawei and Xiaomi devices

Maybe the MTK CPU cannot encode frames at the phone's full screen size. Try this: `scrcpy -s 0e16808b0501 -V debug -p 21300:22300 -m 1920`

Floating images are not handled properly when generating tables

Test sample: [test.pdf](https://github.com/user-attachments/files/15926382/test.pdf)