pdf2htmlEX
Random characters are replaced with the character U+E61F
When converting a PDF, some characters are replaced with the U+E61F character. You won't see this visually, because the generated font contains a glyph for that character. I attached an example zip with the PDF and HTML: biological-psychology_compress-pages-2.zip
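Since the substitution is invisible when rendered, one way to confirm it is to scan the text of the generated HTML for code points in the Private Use Area (U+E000 to U+F8FF), which is where U+E61F lives. The following is only a minimal sketch; the function name `find_private_use` and the sample string are made up for illustration and are not part of pdf2htmlEX.

```python
def find_private_use(text):
    """Return (index, 'U+XXXX') for every BMP Private Use Area character."""
    hits = []
    for i, ch in enumerate(text):
        # The BMP Private Use Area spans U+E000..U+F8FF.
        if 0xE000 <= ord(ch) <= 0xF8FF:
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits


# Hypothetical text as it might appear in the converted HTML.
sample = "biolo\ue61fical"
print(find_private_use(sample))  # [(5, 'U+E61F')]
```

Running this over the text content of the HTML before and after changing the conversion options gives a quick way to check whether the replacements are gone.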
Try using --decompose-ligature 1
This will not always work, but I suspect it will resolve the problem in your case. Be careful, however: because a ligature such as ﬂ (one character) is decomposed into "fl" (two characters), the corresponding glyphs might be missing from the font. There is a workaround for that, but you will probably be fine. Ask here if it doesn't work.
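To illustrate what decomposing a ligature means at the text level, here is a small sketch using Python's standard `unicodedata` module. It is not how pdf2htmlEX implements the option; it only shows one code point turning into two. The helper name `decompose_ligatures` is made up for this example.

```python
import unicodedata


def decompose_ligatures(text):
    """Expand compatibility ligatures into their component letters."""
    # NFKC applies compatibility decomposition, so U+FB02 becomes "f" + "l".
    return unicodedata.normalize("NFKC", text)


ligature = "\ufb02"  # LATIN SMALL LIGATURE FL, a single code point
decomposed = decompose_ligatures(ligature)
print(len(ligature), len(decomposed), decomposed)  # 1 2 fl
```

After decomposition the text needs separate "f" and "l" glyphs, which is why a font that only embedded the ligature glyph can end up missing them.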