Error rendering some UTF8 characters on PDF export

Open junlicn opened this issue 7 years ago • 18 comments

For Feature Requests

Desired Feature:

For Bug Reports

  • BookStack Version (Found in settings, Please don't put 'latest'):
  • PHP Version:
  • MySQL Version: Ubuntu 16.04 Installation Script 2018/2/27 ubuntu default.
Expected Behavior

image

Current Behavior

image

Steps to Reproduce

After installation, log in as admin, write some Chinese text, then export the page to PDF from the web interface.

junlicn avatar Feb 27 '18 06:02 junlicn

@junlicn Thanks for reporting. Issue confirmed and marked to be fixed for next release.

ssddanbrown avatar Mar 05 '18 20:03 ssddanbrown

Updating the title of this so multiple issues with the same cause can be merged into this one.

ssddanbrown avatar Mar 17 '18 17:03 ssddanbrown

So I've had a deeper dive into this. Unfortunately, supporting all languages looks like it's going to be tricky, and I don't really want to include many MBs of font files in the code.

As mentioned in #730 and #746, using WKHTML instead of the default DOMPDF to render PDFs allows much easier font support in addition to better rendering overall. Details on switching to WKHTML can be found in the docs here.

I have made a small tweak in b42b07179fb95b0ffdc370a0e8fa1136b97be364 which should increase default support but I know it doesn't cover Japanese/Chinese characters.

ssddanbrown avatar Mar 17 '18 17:03 ssddanbrown

Does not work. I have installed wkhtmltopdf and added it via the .env file with the correct path, and all I get is this: image

image

nekromoff avatar Mar 26 '18 12:03 nekromoff

Also, this bug applies to the extended ASCII set, e.g. Central European characters such as ľščťžýáíéúô etc.

nekromoff avatar Mar 26 '18 12:03 nekromoff

Based on this, it is just an issue of a selected font: https://stackoverflow.com/questions/16384517/dompdf-character-encoding-utf-8

nekromoff avatar Mar 26 '18 12:03 nekromoff

Basically, this fixes the bug for extended ASCII (European characters): set all fonts in /vendor/dompdf/dompdf/lib/fonts/dompdf_font_family_cache.dist.php to:

DejaVuSans
DejaVuSans-Bold

etc.

nekromoff avatar Mar 26 '18 12:03 nekromoff

@nekromoff Yeah, WKHTML can be tricky to set up correctly.

In the latest update, released yesterday, there have been some changes to DOMPDF exports to increase support, but there's a limit to how much we can support without including a large number of fonts.

ssddanbrown avatar Mar 26 '18 13:03 ssddanbrown

I think this is a font configuration problem: the default Ubuntu font, "DejaVuSans", does not include Chinese glyphs.

jasoncheng7115 avatar Mar 31 '18 01:03 jasoncheng7115

For those who've ended up here looking for how to fix non-English characters missing from exported PDF files, try these steps. (I was able to export my Korean page properly using wkhtmltopdf.)

  1. Make sure BookStack uses a font that supports your language by changing the Custom HTML Head Content setting.
  • For example, I used IBM Plex Sans KR.
    <link href='https://fonts.googleapis.com/css?family=IBM+Plex+Sans+KR' rel='stylesheet' type='text/css'>
    <style>
      body, button, input, select, label, textarea {
        font-family: 'IBM Plex Sans KR', sans-serif;
      }
    </style>

  2. Get a font file for the font you've set as font-family, and save it to the system's default font directory.
  • To check that it's installed properly, run fc-list and check that the font is on the list.

To save time, I recommend using Google Web Fonts Helper to download font files.

lkaybob avatar Sep 09 '21 09:09 lkaybob

This worked for me, Thank you!!!! and don't forget fc-cache -fv
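
For reference, the font installation step together with the cache refresh might look like the sketch below. The `install_font` helper, the example file name, and the `/usr/local/share/fonts` target are assumptions for illustration, not something taken from the BookStack docs; adjust them for your distribution.

```shell
#!/bin/sh
# Hypothetical helper: copy a font file into a font directory and
# refresh the fontconfig cache so renderers can find it.
install_font() {
    font_file="$1"
    # Assumed default; your distribution may use a different directory.
    font_dir="${2:-/usr/local/share/fonts}"

    mkdir -p "$font_dir" || return 1
    cp "$font_file" "$font_dir/" || return 1

    # Rebuild the cache for that directory if fontconfig is installed.
    if command -v fc-cache >/dev/null 2>&1; then
        fc-cache -f "$font_dir"
    fi
}

# Example (file name is a placeholder):
#   install_font ./IBMPlexSansKR-Regular.ttf
#   fc-list | grep -i "IBM Plex"
```

If the font does not show up in the `fc-list` output afterwards, the file probably landed in a directory that fontconfig does not scan.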

akahan989 avatar Mar 07 '23 12:03 akahan989

I have the same issue. I tried to export my page to PDF. My bookstack works in a docker container with standard volumes. All Russian text in h2 and larger headings turns into "?" characters, while regular text in the same PDF is displayed perfectly fine.

Image

I tried something like this

<link href='https://fonts.googleapis.com/css?family=DejaVu+Sans&subset=cyrillic,cyrillic-ext' rel='stylesheet' type='text/css'>
<style>
  body, button, input, select, label, textarea {
    font-family: 'DejaVu Sans', sans-serif;
  }
</style>

with a Cyrillic font, but it didn't work. I also tried copying my fonts into the container and serving them via nginx with font-face in "Custom headers", and that didn't work either. My font shows up in fc-list, but BookStack starts turning even the regular text into "?" when I try anything in "Custom headers".

Akiseev avatar Jun 09 '25 13:06 Akiseev

My bookstack works in a docker container with standard volumes.

Does the Docker container have access to Russian fonts? The PDF rendering engine that BookStack invokes when generating PDFs must be able to use multiple fonts, and the fonts must exist on the machine that the engine runs on.

  • https://www.bookstackapp.com/docs/admin/pdf-rendering/

And, in my case, only the Wkhtmltopdf engine was capable of outputting Korean fonts.
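
One way to check this, assuming fontconfig is installed where the engine runs: `fc-list` accepts a pattern such as `:lang=ru`, which lists only the fonts that declare coverage for that language. The `fonts_for_lang` helper name and the `bookstack` container name in the comment are placeholders, not names from this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the font families that fontconfig reports
# as covering a given language code (e.g. ru, ko, zh-cn).
fonts_for_lang() {
    if ! command -v fc-list >/dev/null 2>&1; then
        echo "fc-list not found; is fontconfig installed?" >&2
        return 1
    fi
    fc-list ":lang=$1" family | sort -u
}

# Equivalent one-off check inside a container (name is a placeholder):
#   docker exec bookstack fc-list :lang=ru family
```

An empty list suggests that no installed font covers the language, which would be consistent with "?" characters in the output.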

fofwisdom avatar Jun 10 '25 00:06 fofwisdom

only the Wkhtmltopdf engine was capable of outputting Korean fonts

I think that's where my issue is. I use the linuxserver image, which uses Alpine as its OS, and it seems that wkhtmltopdf is just not supported on it. What can I do? I could build a whole new image on a different OS myself, but that seems really excessive just to fix a PDF export error.

Akiseev avatar Jun 10 '25 07:06 Akiseev

@Akiseev The wkhtmltopdf option is considered somewhat deprecated now. You could try using the PDF Export Command option instead. I added this as a new, flexible replacement for the wkhtmltopdf option.

https://www.bookstackapp.com/docs/admin/pdf-rendering/#pdf-export-command

ssddanbrown avatar Jun 10 '25 22:06 ssddanbrown

Can I just add weasyprint to my docker-compose and use it to export PDFs from the bookstack_app container?

Akiseev avatar Jun 11 '25 11:06 Akiseev

@Akiseev Potentially. You will need to configure BookStack as per the docs, and might need to consider the networking if using the Docker setup shown there and merging it into an existing compose file. The networking in the example has been carefully set up to prevent outward network access from the weasyprint container. If you don't care about security though (e.g. trusted users only, low-risk scenario), then you could just call the weasyprint container directly and avoid a lot of the networking and the extra proxy container.

ssddanbrown avatar Jun 13 '25 14:06 ssddanbrown

Thanks for the answers! In my case, I created my own Docker image FROM linuxserver/bookstack; apk lets you install weasyprint as a package pretty easily.

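FROM lscr.io/linuxserver/bookstack

# Install WeasyPrint from the Alpine package repository;
# "&&" stops the build if any step fails.
RUN apk update && \
    apk upgrade && \
    apk add weasyprint && \
    rm -rf /var/cache/apk/*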

I then added EXPORT_PDF_COMMAND="weasyprint {input_html_path} {output_pdf_path}", as in the docs, and it worked.
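
To check the renderer on its own before going through BookStack, a quick test along these lines may help. It is only a sketch: the `write_test_page` helper and the `/tmp` paths are made up for illustration, and it calls `weasyprint` with the same input/output argument order as the command above.

```shell
#!/bin/sh
# Hypothetical helper: write a small UTF-8 test page with a heading
# and body text in Russian to the given path.
write_test_page() {
    cat > "$1" <<'EOF'
<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>PDF export test</title></head>
<body>
<h2>Заголовок</h2>
<p>Обычный текст</p>
</body>
</html>
EOF
}

# Render the page only if weasyprint is available in this environment.
if command -v weasyprint >/dev/null 2>&1; then
    write_test_page /tmp/export-test.html
    weasyprint /tmp/export-test.html /tmp/export-test.pdf
fi
```

If both the heading and the paragraph are readable in /tmp/export-test.pdf, the fonts available to WeasyPrint should cover Cyrillic, and any remaining problem is more likely in the BookStack configuration.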

Image

Akiseev avatar Jun 16 '25 08:06 Akiseev