Incorrect handling of the link zoom parameter in link insertions

Open ikseek opened this issue 1 year ago • 3 comments

Description of the bug

When I merge two PDF files with the reproducer code below, some links in the resulting merged.pdf no longer work. To reproduce the bug, download basic-link-1.pdf and attachment-sample-1.pdf, run the script below in the directory that contains them, and open the resulting merged.pdf.

Expected result:

  • merged pdf has all links working

Observed result:

  • links "Linking to an ID" and "Linking to a page number (page 2) and setting the display ratio (200%)" do not work in merged.pdf.

Checked in Mac OS Preview and Chromium PDF viewers.

basic-link-1.pdf attachment-sample-1.pdf merged.pdf

How to reproduce the bug

import fitz

out = fitz.open()
for file in "basic-link-1.pdf", "attachment-sample-1.pdf":
    out.insert_pdf(fitz.open(file))
out.save(filename="merged.pdf")

PyMuPDF version

1.24.1

Operating system

MacOS

Python version

3.12

ikseek avatar Apr 05 '24 23:04 ikseek

File 'basic-link-1.pdf' contains a names dictionary (structure in the PDF catalog). Document-wide information like the names dictionary is not copied to the target PDF in method .insert_pdf() because this is a page-based method. Named links in the source dictionary can thus not be copied - there is no internal link-kind-conversion like LINK_NAMED ==> LINK_GOTO. So "Linking to an ID" is bound to fail.

So the remaining issue is the incorrect handling of the zoom value. I am therefore taking the liberty of changing the issue title accordingly.

JorjMcKie avatar Apr 06 '24 10:04 JorjMcKie

Thanks @JorjMcKie! Is there a way to convert these links manually via PyMuPDF?

ikseek avatar Apr 07 '24 02:04 ikseek

You can walk through the named links of a page. Their dictionary items should contain all the information you need to turn them into LINK_GOTO items. One such named item is

link0= {'kind': 4,
  'xref': 24,
  'from': Rect(56.69292068481445, 215.346435546875, 123.62651062011719, 225.346435546875),
  'page': 1,
  'to': Point(0.0, 813.54336),
  'zoom': 0.0,
  'nameddest': 'Link-01',
  'id': ''}

So you could define

link1 = {'kind': fitz.LINK_GOTO, 'from': link0["from"], 'page': link0["page"], 'to': link0["to"]}
page.delete_link(link0)
page.insert_link(link1)

JorjMcKie avatar Apr 07 '24 13:04 JorjMcKie
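
The per-link recipe above could be applied to every page of the source PDF before merging. The following is only a minimal sketch, not a method confirmed in this thread: the helper names `named_to_goto` and `convert_named_links` are hypothetical, the constants mirror `fitz.LINK_GOTO` and `fitz.LINK_NAMED` (the latter matches the `'kind': 4` shown above), and it assumes that every named link dictionary already carries resolved `page` and `to` entries, as `link0` does.

```python
# Mirror fitz.LINK_GOTO and fitz.LINK_NAMED so the pure helper needs no import.
LINK_GOTO = 1
LINK_NAMED = 4


def named_to_goto(link):
    """Build a LINK_GOTO dictionary from a resolved named-link dictionary.

    Hypothetical helper: only 'from', 'page' and 'to' are carried over.
    """
    return {
        "kind": LINK_GOTO,
        "from": link["from"],
        "page": link["page"],
        "to": link["to"],
    }


def convert_named_links(doc):
    """Replace named links with LINK_GOTO links on every page of `doc`.

    `doc` is expected to be an open fitz.Document. Returns the number of
    links converted. Links without a resolved target are left untouched.
    """
    converted = 0
    for page in doc:
        # get_links() returns a list, so deleting while looping is safe.
        for link in page.get_links():
            if link["kind"] != LINK_NAMED:
                continue
            if link.get("page", -1) < 0 or "to" not in link:
                continue  # assumption: unresolved targets cannot be converted
            page.delete_link(link)
            page.insert_link(named_to_goto(link))
            converted += 1
    return converted
```

One possible use would be to call `convert_named_links(src)` on each source document right after `src = fitz.open(file)` and before `out.insert_pdf(src)`, on the assumption that `insert_pdf()` then carries the converted page-internal links over to the merged file.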