page.get_pixmap() fails due to `fitz.mupdf.FzErrorLimit: code=5: too many nested graphics states`

Open Luux opened this issue 1 year ago • 5 comments

Description of the bug

Trying to get the pixmap of certain pdf documents fails:

  File "/home/.../test.py", line 11, in <module>
    _ = page.get_pixmap()
        ^^^^^^^^^^^^^^^^^
  File "/home/.../miniconda3/lib/python3.12/site-packages/fitz/utils.py", line 888, in get_pixmap
    dl = page.get_displaylist(annots=annots)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/.../miniconda3/lib/python3.12/site-packages/fitz/__init__.py", line 8768, in get_displaylist
    dl = mupdf.fz_new_display_list_from_page(self.this)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/.../miniconda3/lib/python3.12/site-packages/fitz/mupdf.py", line 42912, in fz_new_display_list_from_page
    return _mupdf.fz_new_display_list_from_page(page)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
fitz.mupdf.FzErrorLimit: code=5: too many nested graphics states
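
Until the cause is known, a batch job can at least skip the affected pages instead of aborting. The following is a minimal sketch, not part of PyMuPDF: `render_or_none` is a hypothetical helper, and it assumes the failure can be recognised by the message text shown in the traceback above (and that the exception derives from `Exception`).

```python
def render_or_none(page):
    """Return page.get_pixmap(), or None if the nesting-limit error is raised.

    Assumption: the failure is identified by the message text from the
    traceback above rather than by a specific exception class.
    """
    try:
        return page.get_pixmap()
    except Exception as exc:
        if "too many nested graphics states" in str(exc):
            return None  # affected page: let the caller decide what to do
        raise  # any other error is not ours to hide
```

Any object with a `get_pixmap()` method works here, so the helper can be tried out without one of the affected files.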

How to reproduce the bug

import fitz  # PyMuPDF

# pdf_file: path to one of the affected PDFs (not attached)
with fitz.open(pdf_file) as doc:
    page, *_ = doc.pages()
    _ = page.get_pixmap()

This happens with the latest PyMuPDF version. The same setup runs fine with at least pymupdf==1.22.5. Unfortunately, I cannot provide the affected files, but it seems to hit some hardcoded recursion limit. Maybe a parameter to configure this limit would be enough?
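
One way to probe that guess without sharing the file is to measure how deeply the page's content stream nests `q` (save graphics state) and `Q` (restore graphics state). The sketch below is only a rough diagnostic under the assumption that the error refers to this kind of nesting; `max_q_depth` is a hypothetical helper, not a PyMuPDF function.

```python
def max_q_depth(content: bytes) -> int:
    """Return the deepest q/Q nesting found in one PDF content stream.

    Naive on purpose: the stream is split on whitespace, so a 'q' inside a
    string literal or inline image data would be miscounted, and nesting
    added by form XObjects is not followed.
    """
    depth = 0
    deepest = 0
    for token in content.split():
        if token == b"q":        # save graphics state
            depth += 1
            deepest = max(deepest, depth)
        elif token == b"Q":      # restore graphics state
            depth = max(depth - 1, 0)  # tolerate unbalanced streams
    return deepest
```

For a real page, the input could come from `page.read_contents()`. A very large result would support the recursion-limit guess; a small one would suggest the nesting comes from elsewhere, for example nested form XObjects.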

PyMuPDF version

1.24.5

Operating system

Linux

Python version

3.12

Luux avatar Jun 24 '24 12:06 Luux

Sorry, but we cannot accept bug reports without any material required to reproduce it! You can use my e-mail address to avoid publishing sensitive information here.

JorjMcKie avatar Jun 24 '24 12:06 JorjMcKie

The same file worked just fine with earlier versions of PyMuPDF (~although I don't know exactly the version, we are looking for it ...~), e.g. 1.22.1. Also, we are fairly sure that the changed behavior lies upstream, because mupdf-gl now refuses to render the page as well, although it worked fine earlier. In addition, muraster produces a completely white output image that is several MB in size. Would it help if we narrowed down the version that introduced the changed behavior? Shall we open an issue with upstream MuPDF?

griai avatar Jun 24 '24 12:06 griai

Well, in that case you may be better off communicating directly with the MuPDF team. Their issue tracker is located here: https://bugs.ghostscript.com/enter_bug.cgi

JorjMcKie avatar Jun 24 '24 13:06 JorjMcKie

I created a ticket for mupdf: https://bugs.ghostscript.com/show_bug.cgi?id=707842

Luux avatar Jun 25 '24 10:06 Luux

Thank you for the information!

JorjMcKie avatar Jun 25 '24 10:06 JorjMcKie

The MuPDF team has determined that this problem cannot be resolved; see the comment in https://bugs.ghostscript.com/show_bug.cgi?id=707842.

JorjMcKie avatar Aug 16 '24 07:08 JorjMcKie

Just adding a note here to clarify that the reason the MuPDF people cannot fix this is that the input file has not been provided. It's not possible to investigate the problem without this input file.

As @JorjMcKie mentioned above, we can (and often do) use files emailed directly to us. Such files are never made public. So please do this if you can.

We hit the same issue 😞. I checked all versions: it works with pymupdf=1.23.26 and starts crashing in 1.24.0.

felixxm avatar Aug 23 '24 06:08 felixxm
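
For anyone who wants to repeat this kind of version check on their own file, a binary search needs far fewer runs than trying every release. This is a sketch under two assumptions: the version list is sorted from oldest to newest, and once the failure appears it persists in all later versions. `first_failing` and the `fails` callback are hypothetical names; in practice `fails` would install the given version into a fresh environment and run the reproduction snippet.

```python
def first_failing(versions, fails):
    """Binary search for the first version for which fails(version) is True.

    Assumes versions are sorted oldest to newest and that the failure,
    once introduced, is present in every later version.
    Returns None if no version fails.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if fails(versions[mid]):
            hi = mid          # first failure is at mid or earlier
        else:
            lo = mid + 1      # first failure is after mid
    return versions[lo] if lo < len(versions) else None


# Stand-in predicate built from the versions reported in this thread.
known_bad = {"1.24.0", "1.24.5"}
print(first_failing(["1.22.5", "1.23.26", "1.24.0", "1.24.5"],
                    lambda v: v in known_bad))  # prints 1.24.0
```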