
PDF parsed as invalid

Open theprogramsam opened this issue 1 year ago • 0 comments

What were you trying to do?

Fetch a PDF file, add more content to its pages, and then write the result to a file.

How did you attempt to do it?

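const headers = new Headers();
// Note: Access-Control-Allow-Origin is a response header set by the server;
// sending it on the request does not grant cross-origin access.
headers.append("Access-Control-Allow-Origin", "https://my-aws-link.s3.us-west-2.amazonaws.com");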

const pdfBytes = await fetch(pdf["presigned_link"], { headers: headers }).then(res => res.arrayBuffer());
const pdfDoc = await PDFLib.PDFDocument.load(pdfBytes);
const pdfDataUriOriginal = await pdfDoc.saveAsBase64({ dataUri: true });

I then added the data URI to an iframe, but the PDF does not display. To compare the bytes, I decoded the output in Ruby:

# Original file
3.2.1 :017 > f = File.read(Rails.root.join("app", "public", "SOC2.pdf"))
 => "%PDF-1.7\r%\xE2\xE3\xCF\xD3\r\n1182 0 obj\r<</Linearized 1/L 2860508/O 1184/E 594904/N 4/T 2859044/H [ 510 260]>>\rendobj\r..."
# File containing the Base64 output saved from pdfDataUriOriginal
3.2.1 :012 > f = File.read(Rails.root.join("app", "public", "base64_pdf"))
 => "JVBERi0xLjcKJYGBgYEKCjIgMCBvYmoKPDwKL0xlbmd0aCA0Ngo+PgpzdHJlYW0KL0RldmljZVJHQiBDUwovRGV2aWNlUkdCIGNzCnEKL0UxIGdzCi9YMSBEbwp..."

3.2.1 :013 > de = Base64.decode64(f)
 => "%PDF-1.7\n%\x81\x81\x81\x81\n\n2 0 obj\n<<\n/Length 46\n>>\nstream\n/DeviceRGB CS\n/DeviceRGB cs\nq\n/E1 gs\n/X1 Do\nQ\n\ne..."

3.2.1 :014 > File.write(Rails.root.join("app", "public", "base64_pdf.pdf"), de)
(irb):14:in `write': "\x81" from ASCII-8BIT to UTF-8 (Encoding::UndefinedConversionError)

If I force the encoding, the file saves, but the resulting PDF is malformed.

File.write(Rails.root.join("app", "controllers", "e_signature", "base64_pdf.pdf"), de.force_encoding('ISO-8859-1').encode('UTF-8'))
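
On the Ruby side, the `Encoding::UndefinedConversionError` above most likely comes from writing a binary (ASCII-8BIT) string through a text-mode write that tries to transcode it to UTF-8. Forcing ISO-8859-1 and then encoding to UTF-8 avoids the error but rewrites every byte above 0x7F as two bytes, which would corrupt any PDF. Below is a minimal sketch of a byte-preserving write, assuming the Base64 text carries no `data:` prefix. `write_pdf_bytes` and `SAMPLE_BYTES` are hypothetical names used only for illustration, and the sample is a short stand-in, not the attached PDF. This covers only the Ruby save step and says nothing about whether the pdf-lib output itself is valid.

```ruby
require "base64"
require "tmpdir"

# Hypothetical helper: decode Base64 text and write the raw bytes unchanged.
# File.binwrite opens the file in binary mode, so no encoding conversion runs.
def write_pdf_bytes(path, base64_text)
  bytes = Base64.decode64(base64_text) # returns an ASCII-8BIT (binary) string
  File.binwrite(path, bytes)
  bytes
end

# Stand-in for PDF data: a header followed by bytes that are not valid UTF-8.
SAMPLE_BYTES = "%PDF-1.7\n%\x81\x81\x81\x81\n".b

Dir.mktmpdir do |dir|
  path = File.join(dir, "base64_pdf.pdf")
  write_pdf_bytes(path, Base64.strict_encode64(SAMPLE_BYTES))

  # Read back in binary mode as well, so the comparison is byte for byte.
  puts "binary round trip identical: #{File.binread(path) == SAMPLE_BYTES}"
end

# The workaround from the report changes the data: each 0x81 byte is
# reinterpreted as U+0081 and re-encoded as two UTF-8 bytes.
transcoded = SAMPLE_BYTES.dup.force_encoding("ISO-8859-1").encode("UTF-8")
puts "original size:   #{SAMPLE_BYTES.bytesize}"
puts "transcoded size: #{transcoded.bytesize}"
```

Reading the original with `File.binread` instead of `File.read` keeps the comparison on equal footing, since both strings are then treated as raw bytes.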

[Screenshot 2025-01-08 at 11 48 06 PM]

I have attached the file.

[Screenshot 2025-01-08 at 11 53 53 PM]

SOC2.pdf

What actually happened?

The iframe showing the PDF goes blank. The bytes of the original file differ from the decoded pdf-lib Base64 output.

What did you expect to happen?

Loading the file through the library and saving it without any modifications should produce a PDF that displays correctly in the browser and has the same bytes as the original.

How can we reproduce the issue?

<script src="https://unpkg.com/pdf-lib"></script>
<script src="https://unpkg.com/@pdf-lib/fontkit/dist/fontkit.umd.min.js"></script>

Version

Whichever version is currently hosted on https://unpkg.com/pdf-lib

What environment are you running pdf-lib in?

Browser

Checklist

  • [X] My report includes a Short, Self Contained, Correct (Compilable) Example.
  • [X] I have attached all PDFs, images, and other files needed to run my SSCCE.

Additional Notes

No response

theprogramsam, Jan 09 '25 07:01