
Issue with PDFs containing Arabic script/RTL script

Open florisre opened this issue 3 years ago • 0 comments

Current behavior:

The selection does not follow the position of the text in the document. Clicking and dragging to select results in the following selection:

[Screenshot: Current behavior]

Right-clicking the selection and copying it to the clipboard produces the following output:

د ه د ا ب ش ب ه ا ی ش ا ع ر ا ن و ن و ی س ن د گ ا ن د ر ا ن ج م ن ف ر ه س گ ی ا ب ر ا ن ۹ آ ل م ا
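The symptom in this output (every letter separated by a space, with the original line breaks gone) can be detected mechanically. The following is a minimal sketch, not part of pdf-reader: the helper names `looks_letter_spaced` and `collapse_letter_spacing` and the 0.9 threshold are assumptions made for illustration.

```python
def looks_letter_spaced(text: str, threshold: float = 0.9) -> bool:
    """Heuristic: True if almost every whitespace-separated token is one character.

    `threshold` is an arbitrary illustrative cut-off, not a tuned value.
    """
    tokens = text.split()
    if len(tokens) < 2:
        return False
    single = sum(1 for token in tokens if len(token) == 1)
    return single / len(tokens) >= threshold


def collapse_letter_spacing(text: str) -> str:
    """Remove all whitespace. This also removes real word boundaries,
    so it only helps for comparing against the expected letters."""
    return "".join(text.split())


# The first three letters of the reported output, written as escapes
# so the example does not depend on how an editor renders RTL text.
sample = "\u062f \u0647 \u062f"
print(looks_letter_spaced(sample))
print(collapse_letter_spacing(sample) == "\u062f\u0647\u062f")
```

Collapsing the whitespace is only useful for comparison with the expected text; it cannot restore word or line boundaries, which would have to come from the reader itself.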

Correct behavior:

Chromium's PDFium (I hope that is actually what displays PDFs in Chromium), and thus every Chromium-based browser I have tried, handles this correctly:

[Screenshot: Chromium's behavior]

The selected text copies correctly as:


ده د
شبهای شاعران ونویسندگان اب
درانجمن فرهسگی ابران ۹آلمان 

Bigger scope

This is a widespread issue, related to how RTL documents are handled in the PDF standard. See also this contribution in the Adobe community and this discussion of the issue in the Tesseract project.
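As a rough illustration of what "RTL" means at the character level, Unicode assigns each character a bidirectional class, and Python's standard `unicodedata.bidirectional` exposes it. The sketch below is an assumption about how a tool could flag predominantly RTL text, not a description of how pdf-reader or PDFium works; the function name `rtl_ratio` is hypothetical.

```python
import unicodedata

# Strong right-to-left classes: "R" (e.g. Hebrew) and "AL" (Arabic letters).
RTL_CLASSES = {"R", "AL"}
STRONG_CLASSES = RTL_CLASSES | {"L"}


def rtl_ratio(text: str) -> float:
    """Share of strongly directional characters that are right-to-left.

    Digits, spaces and punctuation have weak or neutral classes and are ignored.
    Returns 0.0 when the text has no strongly directional characters.
    """
    classes = [unicodedata.bidirectional(ch) for ch in text]
    strong = [c for c in classes if c in STRONG_CLASSES]
    if not strong:
        return 0.0
    return sum(c in RTL_CLASSES for c in strong) / len(strong)


print(rtl_ratio("abc"))            # Latin letters only
print(rtl_ratio("\u0634\u0628"))   # two Arabic letters
```

A check like this only identifies the script direction; deciding the reading order of the selected text is a separate and harder step.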

For further evaluation, I have attached the first page of the document shown in the screenshots here: https://bwsyncandshare.kit.edu/s/ZwQ7zyWXmKLHpdH

florisre, Apr 24 '23 16:04