Render PDF from multiple HTML files
Hi all, is it possible to render a PDF from multiple input files/strings, like in this example from flyingsaucer?
I had a problem with flyingsaucer: it throws something like "Page 21 was requested but document has only 20 pages"... 😞
With flyingsaucer I could set a base URL for every page via the setDocumentFromString method.
With openhtmltopdf I should concatenate all the HTMLs into one, right? But the base folder is not the same for all of them... 😄 In some of them it is "../styles/main.css" but in some other "../../styles/main.css" (deeper folder)
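If concatenation is the route taken, one way around the differing base folders might be to make every relative link absolute before merging, resolving each file's links against that file's own location. The following is only a minimal sketch using the standard java.net.URI class; the class name RelativeLinkResolver, the method absolutize, and the example.org URIs are hypothetical, and the regular expression only handles double-quoted href/src attributes (url(...) references inside CSS are not touched).

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of openhtmltopdf or flyingsaucer.
public class RelativeLinkResolver {
    // Matches href="..." and src="..." with double-quoted values only.
    private static final Pattern LINK = Pattern.compile("(href|src)=\"([^\"]*)\"");

    /**
     * Rewrites every href/src value in html to an absolute URI,
     * resolved against the URI of the file the HTML came from.
     * Throws IllegalArgumentException if a link is not a valid URI.
     */
    public static String absolutize(String html, URI baseUri) {
        Matcher m = LINK.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String resolved = baseUri.resolve(m.group(2)).toString();
            String attribute = m.group(1) + "=\"" + resolved + "\"";
            // quoteReplacement keeps '$' and '\' in URIs from being treated as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(attribute));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With this, "../styles/main.css" in a file one folder deep and "../../styles/main.css" in a file two folders deep end up pointing at the same absolute stylesheet URI, so the merged document no longer depends on a single base folder.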
Rendering each file into a shared PDDocument, as in the code below, will work, sort of. The only problem is that counter(page) and counter(pages) will be wrong on subsequent documents. I'll try to get these working again in the fast renderer that I'm working on as part of #180.
//214
public static void main(String... args) throws Exception {
    String[] uris = new String[] {
        "file:///Users/me/Documents/pdf-issues/issue-206.htm",
        "file:///Users/me/Documents/pdf-issues/issue-180-p.htm"
    };
    PDDocument doc = new PDDocument();
    for (String uri : uris) {
        PdfRendererBuilder builder = new PdfRendererBuilder();
        builder.withUri(uri);
        builder.usePDDocument(doc);
        PdfBoxRenderer renderer = builder.buildPdfRenderer();
        renderer.createPDFWithoutClosing();
        renderer.close();
    }
    OutputStream os = new FileOutputStream("/Users/me/Documents/pdf-issues/output/mytest-214.pdf");
    doc.save(os);
    os.close();
}
Thanks @danfickle, it works!!! 😄
However, I have some issues with characters like č, ć etc. They get turned into the # character.
Also, code highlighting with http://prismjs.com/ doesn't work the same way as in the browser.
Seems like it "sees" just the <code> markup, not the prismjs goodies that get inserted after parsing the HTML.
Hi @sake92
For the characters, you will still have to embed a valid font for most languages other than English. See the template author's guide on the readme for tips.
Regarding prismjs, this project doesn't run JavaScript, so you would probably need to find a Java syntax highlighter (see link below) or somehow get prismjs running in a JavaScript engine available from Java (Nashorn or Rhino).
https://stackoverflow.com/questions/1853419/syntax-highlighter-for-java
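To illustrate the first option, here is a toy sketch of highlighting done on the Java side before the HTML reaches the renderer, so that the highlight markup is already in the input. Everything in it is hypothetical (the class ToyHighlighter, the kw CSS class, the short keyword list), and it is deliberately naive: it also marks keywords inside strings and comments, so a real highlighter library would be the better choice.

```java
import java.util.regex.Pattern;

// Toy example only; not a real syntax highlighter.
public class ToyHighlighter {
    // A small, incomplete list of Java keywords, matched as whole words.
    private static final Pattern KEYWORD =
            Pattern.compile("\\b(class|public|static|void|return|new|if|else|for)\\b");

    /** Escapes HTML special characters, then wraps keywords in <span class="kw">. */
    public static String highlight(String code) {
        String escaped = code
                .replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
        return KEYWORD.matcher(escaped).replaceAll("<span class=\"kw\">$1</span>");
    }
}
```

The result would go inside the existing <code> element, with a matching .kw rule in the stylesheet.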
@danfickle as far as I'm concerned, you can close this issue. Proposed solution works! 😌
If someone is interested, here is my implementation, from my static site generator: https://github.com/sake92/hepek/blob/master/src/main/scala/ba/sake/hepek/pdf/PdfGenerator.scala
I used Selenium ChromeDriver to wait for JS to load etc.
Example of PDF with some math: https://blog.sake.ba/pdfs/Matematika.pdf
Dear @danfickle, how can I generate one PDF file with 2 pages from 2 HTML templates?
I strongly recommend combining the templates if possible. Other than that, the sample above in this thread should work. What are you having trouble with?
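The thread does not spell out how to combine templates, so the following is only one possible reading: take the body of each processed template and join them into a single document, optionally forcing a page break between them with the CSS property page-break-before. The class TemplateCombiner and its methods are hypothetical names, the body extraction is a simple regular expression rather than a real parser, and anything in the original head elements (styles, fonts) is dropped and would have to be merged by hand.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of openhtmltopdf.
public class TemplateCombiner {
    private static final Pattern BODY =
            Pattern.compile("<body[^>]*>(.*?)</body>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    /** Returns the content between <body> and </body>, or the input unchanged if there is no body. */
    static String bodyOf(String html) {
        Matcher m = BODY.matcher(html);
        return m.find() ? m.group(1) : html;
    }

    /**
     * Wraps each document's body in a div inside one shared document.
     * When pageBreakBetween is true, every div after the first starts on a new page;
     * when false, content simply continues where the previous document ended.
     */
    public static String combine(List<String> documents, boolean pageBreakBetween) {
        StringBuilder sb = new StringBuilder("<html><head></head><body>");
        for (int i = 0; i < documents.size(); i++) {
            boolean needsBreak = pageBreakBetween && i > 0;
            sb.append(needsBreak ? "<div style=\"page-break-before: always\">" : "<div>")
              .append(bodyOf(documents.get(i)))
              .append("</div>");
        }
        return sb.append("</body></html>").toString();
    }
}
```

The combined string could then be rendered once (for example via withHtmlContent), which should also keep page counters consistent because there is only a single document.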
Thank you for the quick reply. Yes, I've used the PDDocument example from earlier in this thread, with some modifications:
try (OutputStream os = new FileOutputStream(filePath)) {
    PDDocument doc = new PDDocument();
    for (String html : htmlPagesWithValues) {
        PdfRendererBuilder builder = new PdfRendererBuilder();
        builder.defaultTextDirection(BaseRendererBuilder.TextDirection.LTR);
        builder.useDefaultPageSize(210, 297, BaseRendererBuilder.PageSizeUnits.MM);
        builder.useProtocolsStreamImplementation(new InternalFSStreamFactory(), "localProtocol");
        builder.withHtmlContent(html, "");
        builder.useSVGDrawer(new BatikSVGDrawer());
        builder.usePDDocument(doc);
        PdfBoxRenderer renderer = builder.buildPdfRenderer();
        renderer.createPDFWithoutClosing();
    }
    doc.save(os);
} catch (Exception ex) {
}
It works, but only without the renderer.close(); line. With that line, I get the following error when opening the PDF file: "Format error: Not a PDF or corrupted", and the file size is 0 KB.
"I strongly recommend combining the templates" - do you mean concatenating the two HTML templates?
It works for me too. I implemented merging multiple processed templates (Thymeleaf) into one PDF file.
In Kotlin:
// resourcesBaseUri is assumed to be defined elsewhere (for example, as a class property).
// The parameter type List<String> is an assumption: each element is passed to withHtmlContent as HTML text.
fun convertHtmlToPdf(processedTemplateFiles: List<String>, outputStream: OutputStream) {
    val doc = PDDocument()
    val builder = PdfRendererBuilder()
    for (processedTemplateContent in processedTemplateFiles) {
        builder.useFastMode()
        builder.withHtmlContent(processedTemplateContent, resourcesBaseUri)
        builder.usePDDocument(doc)
        val buildPdfRenderer = builder.buildPdfRenderer()
        buildPdfRenderer.layout()
        buildPdfRenderer.createPDFWithoutClosing()
        buildPdfRenderer.close()
    }
    doc.save(outputStream)
}
But if you need logic that spans all of these files, it will be hard to implement, for example, continuous page numbering across the whole PDF.
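For the page-numbering case, part of the bookkeeping is plain arithmetic: if the page count of each rendered file is known, the number of the first page of every file in the merged PDF follows from a running sum. The sketch below shows only that step; PageOffsets and its methods are hypothetical names, and how the counts are obtained (for instance by reading PDDocument's getNumberOfPages() after each render) and how the numbers get back into the templates is left open, since that would likely need a second rendering pass.

```java
// Hypothetical helper; it knows nothing about PDFs, only page counts.
public class PageOffsets {
    /** Returns the 1-based page number on which each document starts in the merged output. */
    public static int[] firstPageNumbers(int[] pageCounts) {
        int[] first = new int[pageCounts.length];
        int next = 1;
        for (int i = 0; i < pageCounts.length; i++) {
            first[i] = next;
            next += pageCounts[i]; // the following document starts right after this one
        }
        return first;
    }

    /** Returns the total number of pages across all documents. */
    public static int totalPages(int[] pageCounts) {
        int total = 0;
        for (int count : pageCounts) {
            total += count;
        }
        return total;
    }
}
```

For page counts of 3, 2 and 4, the documents start on pages 1, 4 and 6, and the merged PDF has 9 pages.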
public static void main(String... args) throws Exception {
    String[] uris = new String[] {
        "file:///Users/me/Documents/pdf-issues/issue-206.htm",
        "file:///Users/me/Documents/pdf-issues/issue-180-p.htm"
    };
    PDDocument doc = new PDDocument();
    for (String uri : uris) {
        PdfRendererBuilder builder = new PdfRendererBuilder();
        builder.withUri(uri);
        builder.usePDDocument(doc);
        PdfBoxRenderer renderer = builder.buildPdfRenderer();
        renderer.createPDFWithoutClosing();
        renderer.close();
    }
    OutputStream os = new FileOutputStream("/Users/me/Documents/pdf-issues/output/mytest-214.pdf");
    doc.save(os);
    os.close();
}
Does doc (the PDDocument instance) need to be closed by calling its close method?
Hi @danfickle, the code you provided works for me to generate a PDF from multiple HTML files, but it starts each HTML file on a new page instead of continuing where the previous one ended. Can you please guide me?