Converted HTML returns blank html

Open Ekansh19 opened this issue 3 years ago • 1 comments

Hi,

After converting PDFs to HTML, I am getting the same HTML code for all of the (different) files, with an almost blank body. Sample file: jpg2pdf.pdf

Result HTML: [screenshot: Screenshot 2022-05-27 at 3 19 38 PM]

May 27 '22 09:05 Ekansh19

This library cannot extract images (such as figures or graphs) from a PDF, but you can create an image (thumbnail) of a whole page:

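const pdf2html = require('pdf2html')

// Render page 1 of the PDF as a 160x226 PNG thumbnail
const options = { page: 1, imageType: 'png', width: 160, height: 226 }
pdf2html.thumbnail('sample.pdf', options, (err, thumbnailPath) => {
    if (err) {
        console.error('Conversion error: ' + err)
    } else {
        // Path of the generated thumbnail image
        console.log(thumbnailPath)
    }
})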
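
The attachment name (jpg2pdf.pdf) suggests a PDF built from a JPG. If that is the case, the pages have no text layer and the converted HTML body would come out nearly empty. A minimal sketch for detecting that case before deciding to fall back to the thumbnail call above; `hasVisibleText` is a hypothetical helper written for this illustration, not part of pdf2html, and the tag stripping is a rough regex heuristic rather than a real HTML parser:

```javascript
// Hypothetical helper (not part of pdf2html): reports whether an HTML
// string contains any visible text inside its <body>.
function hasVisibleText(html) {
    // Use the body contents if a <body> element is present, else the whole string.
    const bodyMatch = /<body[^>]*>([\s\S]*?)<\/body>/i.exec(html)
    const body = bodyMatch ? bodyMatch[1] : html

    const text = body
        .replace(/<(script|style)[^>]*>[\s\S]*?<\/\1>/gi, '') // drop script and style blocks
        .replace(/<[^>]+>/g, '')                              // drop remaining tags
        .replace(/&nbsp;/gi, ' ')                             // treat &nbsp; as whitespace

    return text.trim().length > 0
}
```

If this returns false for the converted HTML of a file, the page is probably image-only, and rendering it with the thumbnail call is likely the most this library can offer for it.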

For more advanced manipulation, use node-poppler or Mozilla's pdf.js.

Jul 04 '22 09:07 reregaga

As @reregaga mentioned, this library doesn't extract images. Please feel free to open a PR if you would like to add this.

Jan 22 '23 03:01 shebinleo