PyMuPDF issues

Method get_pixmap() makes the program exit without any exceptions or messages

9

### Description of the bug When using the page.get_pixmap() method, the program simply exits without any prompts (on both Windows and Ubuntu), and the exception cannot be caught. ### How to reproduce the...

Jianxinzhu-pact

upstream bug

fix developed

extend `Document.__getitem__` type annotation to reflect that the method also accepts slices

**Problem** The type annotation of `Document.__getitem__` is wrong:

```python
def __getitem__(self, i: int =0):
    if isinstance(i, slice):
```

The type annotation of `i` requires it to be an `int`, but...
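One way such an annotation could be widened is sketched below. This is only an illustration on a hypothetical stand-in class (`PageList` is not part of PyMuPDF), assuming the method should accept either an `int` or a `slice`; the actual fix in the library may look different.

```python
from typing import List, Union


class PageList:
    """Hypothetical document-like container, used only for illustration."""

    def __init__(self, pages: List[str]) -> None:
        self._pages = pages

    def __getitem__(self, i: Union[int, slice] = 0) -> Union[str, List[str]]:
        # A slice yields a list of pages, an int yields a single page.
        if isinstance(i, slice):
            return self._pages[i]
        return self._pages[i]


doc = PageList(["page0", "page1", "page2"])
first = doc[0]    # single item
rest = doc[1:]    # list of items
```

With the `Union[int, slice]` annotation, a type checker accepts both `doc[0]` and `doc[1:]`.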

Sorontik

enhancement

Fixed in next release

PyMuPDF inserting newline characters mid-word

3

### Description of the bug In some cases PyMuPDF is adding newline characters in the middle of words which do not exist if you simply copy/paste the text from the...

brandenkmurray

enhancement

Document.select() behaves weirdly in some particular kind of pdf files

7

### Description of the bug Document.select() does not work on a particular kind of PDF file. I want to extract text from PDF files. If a PDF has >30 pages then...

urvisism

upstream bug

fix developed

Update Document to check the /XYZ len

3

The issue arises because some PDFs return `/XYZ` coordinates in the format `/XYZ x y` instead of `/XYZ x y z`. This discrepancy causes the code to fail when attempting...
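A length check of this kind might look like the following minimal sketch. The helper name `parse_xyz`, the string-based input, and the choice to pad missing or `null` values with `0.0` are all assumptions made for illustration; this is not the code from the pull request.

```python
from typing import List, Tuple


def parse_xyz(dest: str) -> Tuple[float, float, float]:
    """Hypothetical helper: parse '/XYZ x y' or '/XYZ x y z' into (x, y, z)."""
    parts = dest.split()
    if not parts or parts[0] != "/XYZ":
        raise ValueError(f"not an /XYZ destination: {dest!r}")

    values: List[float] = []
    for token in parts[1:4]:
        # Assumption: treat a 'null' entry like a missing value.
        values.append(0.0 if token == "null" else float(token))

    # Check the length and pad instead of indexing past the end.
    while len(values) < 3:
        values.append(0.0)
    return (values[0], values[1], values[2])


two = parse_xyz("/XYZ 10 20")
three = parse_xyz("/XYZ 10 20 1.5")
```

Both the two-value and the three-value form then produce a three-element tuple, so later code can unpack it without checking the length again.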

tamdao

Pixmap program crash

3

### Description of the bug What is happening is that when I read from the PDF, I use the rectangle information to collect color data. Recently, however, I encountered an...

JordanGarske

bug

fix developed

src/__init__.py tests/: avoid segv from fz_samples_get() with empty pixmap.

Pixmap.color_count(): don't raise exception if JM_color_count() returns empty dict. _read_samples(): return empty list if pixmap has no samples - avoids segv from fz_samples_get(). Addresses #3848.

julian-smith-artifex-com

PyMuPDF Pro cannot extract Chinese content from DOC and DOCX files

### Description of the bug The module can only extract numeric or English content and does not support Chinese. ### How to reproduce the bug Code Sample ``` import pymupdf.pro...

maxyou2090

font.valid_codepoints() - malfunction

2

### Description of the bug [font.valid_codepoints()](https://pymupdf.readthedocs.io/en/latest/font.html#Font.valid_codepoints) has stopped working correctly on the latest version. ### How to reproduce the bug #### code + sample pdf [font_valid_codepoints.zip](https://github.com/user-attachments/files/17328819/font_valid_codepoints.zip) #### latest version -...

pranctco

fix developed

Update Documentation

2

_delXmlMetadata no longer exists and appears to have been replaced by del_xml_metadata.

SamPetherbridge
