Change handling of text
Now that the libHaru version of the code has been released, we would welcome assistance to change the handling of ASCII characters read from the input file.
The idea is to replace the printcharx() function in the code, so that the captured ASCII characters are written directly to the PDF file. This has the following advantages:
- We can use the built-in PDF fonts (or freely downloadable fonts from the internet) to represent the original Epson font set more accurately - although we still need to be able to cope with user-defined characters, so those would still need to be drawn using a routine similar to the existing printcharx()
- The final PDF will contain much higher-quality printouts, as it will not rely on bitmap fonts
- Hopefully, the processing time will be reduced, as the PNG image (containing any bitmap graphics and the user-defined characters) will be substantially less complicated where a page contains a mixture of text and graphics.
Thankfully, it looks fairly simple - see https://github.com/libharu/libharu/wiki/API:-Graphics#HPDF_Page_TextOut
If we use the HPDF_Page_TextRect() function, we can also support printer justification of the text if required (albeit based on a single line of text).
The main issues around this are:
a) Handling of the various Epson character sets and fonts, which contain many non-standard characters.
b) How to handle double-height (but single-width) text and double-width (but single-height) text.
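For issue (b), one possibility - a sketch only, with names of my own invention - is to keep the base font size fixed and feed independent horizontal and vertical scale factors into libHaru's HPDF_Page_SetTextMatrix(), whose first and fourth matrix entries scale the text horizontally and vertically:

```c
/* Hypothetical helper: map Epson size flags to the (a, d) entries of
 * the PDF text matrix [a b c d x y]. Double-width scales a,
 * double-height scales d; b and c stay 0 for unrotated text. The
 * results would be passed as
 *     HPDF_Page_SetTextMatrix(page, a, 0, 0, d, x, y);
 * before the text is written. */
void epson_text_scale(int double_width, int double_height,
                      float *a, float *d)
{
    *a = double_width  ? 2.0f : 1.0f;  /* horizontal scale */
    *d = double_height ? 2.0f : 1.0f;  /* vertical scale   */
}
```

For double-width on its own, HPDF_Page_SetHorizontalScaling(page, 200) is another option. Note that vertical scaling is applied about the text baseline, so the baseline position may need adjusting for double-height lines.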
Another benefit you didn't list: searchable documents... which would lead to web-crawler-indexed documents... which would lead to internet search within the content of old documents.