Fill form using Japanese font

Open ka2n opened this issue 2 years ago • 7 comments

I have set up a text field with a Japanese font. When I try to insert text here, the font is recognized, but I get the error pdfcpu: corrupt fontDic.

After some investigation, I found that the code below causes this error.

https://github.com/pdfcpu/pdfcpu/blob/b89d7b1ab108d54a983c9653d47df07bbad5336e/pkg/pdfcpu/font/fontDict.go#L117-L120

I'm not familiar with the PDF specification, but it seems the problem is that object 7 (the Type0 font) is accessed instead of object 8 (its descendant CIDFont). Maybe there is a problem if the subsetted font for the embedded text and the font set on the form are the same?

162
    7:   offset=    1034 generation=0 types.Dict type=Font subType=Type0
<<
        <BaseFont, BAAAAA+IPAmjMincho>
        <DescendantFonts, [(8 0 R)]>
        <Encoding, Identity-H>
        <Subtype, Type0>
        <ToUnicode, (9 0 R)>
        <Type, Font>
>>
    8:   offset=    1179 generation=0 types.Dict type=Font subType=CIDFontType2
<<
        <BaseFont, BAAAAA+IPAmjMincho>
        <CIDSystemInfo, <<
                <Ordering, (Identity)>
                <Registry, (Adobe)>
                <Supplement, 0>
        >>>
        <CIDToGIDMap, (14 0 R)>
        <FontDescriptor, (11 0 R)>
        <Subtype, CIDFontType2>
        <Type, Font>
        <W, [0 [1000 1000 1000 1000 1000 1000 1000 1000 1000 290]]>
>>
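
For illustration: in the dump above, object 7 (the Type0 font) has no FontDescriptor entry; that entry sits on its descendant CIDFont, object 8. Below is a minimal Go sketch of following that indirection. The `Ref` and `Dict` types and the `fontDescriptorRef` function are hypothetical stand-ins invented for this example, not pdfcpu's actual types or code.

```go
package main

import (
	"errors"
	"fmt"
)

// Ref is a hypothetical stand-in for an indirect reference such as "8 0 R".
type Ref int

// Dict is a hypothetical stand-in for a PDF dictionary.
type Dict map[string]any

// fontDescriptorRef returns the FontDescriptor reference for the font stored
// under object number nr. For a Type0 font it first follows DescendantFonts
// to the CIDFont, because the Type0 dictionary carries no FontDescriptor.
func fontDescriptorRef(objs map[Ref]Dict, nr Ref) (Ref, error) {
	d, ok := objs[nr]
	if !ok {
		return 0, fmt.Errorf("object %d not found", nr)
	}
	if d["Subtype"] == "Type0" {
		desc, ok := d["DescendantFonts"].([]Ref)
		if !ok || len(desc) != 1 {
			return 0, errors.New("expected exactly one descendant font")
		}
		if d, ok = objs[desc[0]]; !ok {
			return 0, fmt.Errorf("descendant font %d not found", desc[0])
		}
	}
	fd, ok := d["FontDescriptor"].(Ref)
	if !ok {
		return 0, errors.New("font dictionary has no FontDescriptor")
	}
	return fd, nil
}

func main() {
	// Only the entries relevant to the lookup are reproduced from the dump.
	objs := map[Ref]Dict{
		7: {"Subtype": "Type0", "DescendantFonts": []Ref{8}},
		8: {"Subtype": "CIDFontType2", "FontDescriptor": Ref(11)},
	}
	fmt.Println(fontDescriptorRef(objs, 7))
}
```

Looking up FontDescriptor directly on object 7 would fail in this model, which matches the symptom described above.
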
$ pdfcpu font list
Corefonts:
  Courier
  Courier-Bold
  Courier-BoldOblique
  Courier-Oblique
  Helvetica
  Helvetica-Bold
  Helvetica-BoldOblique
  Helvetica-Oblique
  Symbol
  Times-Bold
  Times-BoldItalic
  Times-Italic
  Times-Roman
  ZapfDingbats

Userfonts(/home/k2/.config/pdfcpu/fonts):
  IPAmjMincho (61360 glyphs)
  • State your OS and OS version
Linux pc 6.7.0-zen3-1-zen #1 ZEN SMP PREEMPT_DYNAMIC Sat, 13 Jan 2024 14:36:54 +0000 x86_64 GNU/Linux

By the way, I produced the PDF form with ONLYOFFICE Desktop Editors. The file contains a DA entry with a value like 0.000000 0.000000 0.000000 rg \057F2 20.000000 Tf, so I had to patch the parsing code to convert \057 (the octal escape for a slash) to /.
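
As a rough illustration of that conversion, here is a minimal Go sketch that decodes octal escapes in a DA string. The function name `unescapeOctal` is hypothetical; this is neither the attached patch nor pdfcpu's actual parsing code, and it assumes only octal escapes need decoding.

```go
package main

import (
	"fmt"
	"strings"
)

// unescapeOctal replaces octal escapes (a backslash followed by one to
// three octal digits) with the byte they encode. Any other backslash
// sequence is copied unchanged together with the byte that follows it.
func unescapeOctal(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		if s[i] != '\\' {
			b.WriteByte(s[i])
			continue
		}
		// Collect up to three octal digits after the backslash.
		v, n := 0, 0
		for n < 3 && i+1+n < len(s) && s[i+1+n] >= '0' && s[i+1+n] <= '7' {
			v = v*8 + int(s[i+1+n]-'0')
			n++
		}
		if n == 0 {
			// Not an octal escape: keep the backslash and the next byte,
			// so an escaped backslash is never misread as an escape start.
			b.WriteByte(s[i])
			if i+1 < len(s) {
				i++
				b.WriteByte(s[i])
			}
			continue
		}
		b.WriteByte(byte(v)) // values above 255 wrap around
		i += n
	}
	return b.String()
}

func main() {
	da := `0.000000 0.000000 0.000000 rg \057F2 20.000000 Tf`
	fmt.Println(unescapeOctal(da))
}
```

With the DA value quoted above, `\057` (octal 57, decimal 47) decodes to `/`, so the font name reads `/F2`.
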

Jan 18 '24 07:01 ka2n

Update: After changing the body text font and the form font, this error no longer occurs, but the filled form text is garbled.

Jan 18 '24 08:01 ka2n

Hello!

Thanks for reporting this.

Do you think you can share a small sample PDF file and the font you are using?

This would really help.

Thank you for using pdfcpu 💚

Jan 19 '24 23:01 hhrutter

Thank you!

  • PDF: ipaexmincho_onlyoffice.pdf
  • Font: https://moji.or.jp/ipafont/ipaex00401/ (please download ipaexm00401.zip, 5.3 MB)
    • Please note that I changed the embedded font because of GitHub's file size limit (IPAmjMincho -> IPAexMincho).

and a patch for my generator issue.

Jan 20 '24 15:01 ka2n

Thanks!!

Jan 22 '24 00:01 hhrutter

Please describe your use case.

Also, do you think you can share a concrete sample of what you are doing?

If not, what would really help is a short Japanese text for testing with this font.

Feb 06 '24 14:02 hhrutter

The latest commit takes care of the DA string issue.

Feb 06 '24 16:02 hhrutter