ComboBox choice_values full of empty strings despite PDF having valid choices.
Description of the bug
I am using the 940b: https://www.irs.gov/pub/irs-pdf/f940b.pdf
The PDF file has identical pages, and each page has the same "Mail to:" dropdown.
For the dropdown on the first page, choice_values contains only empty strings.
import pymupdf
pdf = pymupdf.open('f940b.pdf')
for page in pdf:
    for widget in page.widgets():
        if widget.field_type_string == 'ComboBox':
            print(widget.choice_values)
            widget.update()
pdf.save('f940b-output.pdf')
Expected output:
[' - Select One - ', ' ', 'Cincinnati, OH 45999', 'Memphis, TN 37501', 'Ogden, UT 84201', 'Philadelphia, PA 19255']
[' - Select One - ', ' ', 'Cincinnati, OH 45999', 'Memphis, TN 37501', 'Ogden, UT 84201', 'Philadelphia, PA 19255']
Actual output:
['', '', '', '', '', '']
[' - Select One - ', ' ', 'Cincinnati, OH 45999', 'Memphis, TN 37501', 'Ogden, UT 84201', 'Philadelphia, PA 19255']
This also affects the resulting f940b-output.pdf, where the first combo box is suddenly completely empty with no choices available.
How to reproduce the bug
See above
PyMuPDF version
1.24.13
Operating system
Linux
Python version
3.12
<<
/Rect [ 213.206 341.196 391.491 361.361 ]
/Subtype /Widget
/Parent 86 0 R
/F 4
/P 505 0 R
/StructParent 42
/Type /Annot
/MK <<
/BG [ 1 ]
>>
/AP <<
/N 533 0 R
>>
>>
<<
/Rect [ 213.206 341.196 391.491 361.361 ]
/Subtype /Widget
/TU (Mail to:)
/Parent 86 0 R
/F 4
/I 47 0 R
/P 1 0 R
/StructParent 62
/V 36 0 R
/DA (/Helv 12 Tf 0 g)
/DV 51 0 R
/Opt 52 0 R
/Type /Annot
/Ff 4325376
/MK <<
/BG [ 1 ]
>>
/AP <<
/N 45 0 R
>>
>>
It seems that in this form, the dropdown on the first page has no /Opt key, only the one on the second page. Yet, in all PDF viewers, the options are shown in both dropdowns as expected. What other key is being used to link to these choices?
Both widgets have a /Parent entry pointing to xref 86, which contains:
<<
/TU (Mail to:)
/I 87 0 R
/T (p1-t14)
/V 88 0 R
/DA (/Helv 12 Tf 0 g)
/DV 89 0 R
/Opt 90 0 R
/FT /Ch
/Ff 4325376
/Kids [ 532 0 R 46 0 R ]
>>
And /Opt 90 links to the correct list of options:
[ ( - Select One - ) ( ) (Cincinnati, OH 45999) (Memphis, TN 37501)
(Ogden, UT 84201) (Philadelphia, PA 19255) ]
So somewhere in pymupdf you need to account for the fact that the /Opt key might live in the /Parent object.
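To make the idea concrete, here is a rough sketch of such a lookup: check the widget object for a key first, then walk up its /Parent chain. This is not how pymupdf resolves choices internally. The helper names inherited_key and print_combo_options are made up for illustration. It assumes Document.xref_get_key() returns ('null', 'null') for a missing key and ('xref', 'N 0 R') for an indirect reference.

```python
def inherited_key(doc, xref, key, max_depth=32):
    """Return (type, value) for `key` on object `xref` or its nearest /Parent.

    `doc` only needs an xref_get_key(xref, key) method, as pymupdf.Document has.
    Hypothetical helper, not part of pymupdf.
    """
    seen = set()
    while xref not in seen and len(seen) < max_depth:
        seen.add(xref)  # guard against circular /Parent references
        kind, value = doc.xref_get_key(xref, key)
        if kind != "null":
            return kind, value
        # Key not found on this object: try the parent field, if any.
        kind, value = doc.xref_get_key(xref, "Parent")
        if kind != "xref":
            break
        xref = int(value.split()[0])  # "86 0 R" -> 86
    return "null", "null"


def print_combo_options(path):
    """Print where each combo box on each page gets its /Opt entry from."""
    import pymupdf  # imported lazily so inherited_key() works without it

    doc = pymupdf.open(path)
    for page in doc:
        for widget in page.widgets():
            if widget.field_type_string == "ComboBox":
                print(widget.xref, inherited_key(doc, widget.xref, "Opt"))
```

Calling print_combo_options('f940b.pdf') should then report an /Opt reference for both dropdowns if the analysis above is right. The referenced array can be dumped with doc.xref_object().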
Also, setting widget.field_value = widget.choice_values[foo] does not work either: it leaves the field in the output PDF completely blank.