PDF downloaded through request unreadable. From file it is readable.

Open radboudp opened this issue 7 years ago • 3 comments

I intend to use pdf2json to test my PDF generator service within cucumberjs. When I read the expected PDF from a file, I can parse it without problems. When I obtain the generated PDF from the service, parsing fails. After some investigation I found the cause. The Buffer returned for the file is backed by an ArrayBuffer with exactly as many bytes allocated as the PDF contains. The Buffer created by the request lib when downloading the PDF from the service is backed by an ArrayBuffer that is larger than the number of bytes put into it. This seems to be a problem for pdf2json or the underlying PDF parser:

    { parserError: 'An error occurred while parsing the PDF: bad XRef entry' }).

For file pdf:

    pdfBuffer.buffer:  ArrayBuffer { byteLength: 1004 }
    pdfBuffer.length:  1004

For downloaded pdf:

    pdfBuffer.buffer:  ArrayBuffer { byteLength: 8192 }
    pdfBuffer.length:  1004

I work around this problem by creating a new buffer of the correct length and copying the data into it. Then it works.

    let bufferNew = Buffer.alloc(pdfBuffer.length);
    pdfBuffer.copy(bufferNew);

It seems to me that the buffer is parsed too far...
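
For anyone trying to reproduce the size mismatch without a PDF service, here is a minimal Node.js sketch. It assumes only the built-in `Buffer` API; the 8192-byte backing store and the offset of 16 are made-up values chosen to mirror the numbers above, and `toExactBuffer` is a hypothetical helper name, not part of pdf2json or request. It shows that a Buffer can be a small window onto a larger ArrayBuffer, and that copying into a freshly allocated Buffer gives a backing store of exactly the right size. Whether pdf2json actually reads past the window is still only my guess.

```javascript
// A Buffer can be a view onto a larger ArrayBuffer (for example a pooled allocation).
const backing = new ArrayBuffer(8192);        // stand-in for the oversized backing store
const view = Buffer.from(backing, 16, 1004);  // 1004-byte window starting at byte 16
view.fill(0x25);                              // dummy payload so the copy can be compared

// Hypothetical helper: copy into a Buffer whose ArrayBuffer holds only its own bytes.
function toExactBuffer(buf) {
  const exact = Buffer.alloc(buf.length);     // Buffer.alloc gets its own backing store
  buf.copy(exact);
  return exact;
}

const exact = toExactBuffer(view);

// Compare the logical length with the size of the underlying ArrayBuffer.
console.log(view.length, view.byteOffset, view.buffer.byteLength);
console.log(exact.length, exact.byteOffset, exact.buffer.byteLength);
```

Any code that reads `buf.buffer` directly, without honouring `buf.byteOffset` and `buf.length`, would see the extra bytes in the first case but not in the second.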

radboudp avatar Jul 16 '18 09:07 radboudp

I ran into the exact same issue. @radboudp thanks for the workaround.

nettad avatar Feb 13 '19 13:02 nettad

Got exactly the same bug: it works from a physical file but doesn't from a stream. @radboudp thanks for the workaround.

jbdemonte avatar Apr 11 '19 15:04 jbdemonte

+1 thanks @radboudp

jonaskello avatar Aug 27 '21 18:08 jonaskello