
Use lazy loading for object-streams and their objects

Open packdat opened this issue 1 year ago • 0 comments

This PR attempts to resolve the issues described in #73 and #46 in a more generic way. It also supersedes #53 by removing the need to handle objects stored in object-streams in a special way.

The "lazy loading" aspect is handled by the new class PdfReferenceToCompressedObject, a subclass of PdfReference. While the document's xref-streams are processed, references to objects stored in object-streams are collected as instances of PdfReferenceToCompressedObject. When the Value of such a reference is accessed (which may happen while parsing another object that references the compressed object), the object-stream is loaded and decrypted, if that has not already been done, and the actual object is read from it.
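To illustrate the lazy-loading idea described above, here is a minimal, language-neutral sketch in Python. It is not the PR's C# code: the classes ObjectStream, Reference and ReferenceToCompressedObject, the load_count counter and the string "objects" are all hypothetical stand-ins, and decryption is only simulated by the loader callback. The sketch assumes that each reference knows its object-stream and its index within that stream, and that the stream is decoded at most once.

```python
from typing import Callable, Dict, Optional


class ObjectStream:
    """Hypothetical stand-in for an object-stream; decoded on first use only."""

    def __init__(self, loader: Callable[[], Dict[int, str]]) -> None:
        self._loader = loader  # simulates loading, decrypting and parsing
        self._objects: Optional[Dict[int, str]] = None
        self.load_count = 0    # only here to make the laziness observable

    def get(self, index: int) -> str:
        if self._objects is None:
            self._objects = self._loader()
            self.load_count += 1
        return self._objects[index]


class Reference:
    """Hypothetical stand-in for an ordinary reference with a known value."""

    def __init__(self, object_number: int, value: Optional[str] = None) -> None:
        self.object_number = object_number
        self._value = value

    @property
    def value(self) -> Optional[str]:
        return self._value


class ReferenceToCompressedObject(Reference):
    """Reference whose value is read from the object-stream on first access."""

    def __init__(self, object_number: int, stream: ObjectStream, index: int) -> None:
        super().__init__(object_number)
        self._stream = stream
        self._index = index
        self._resolved = False

    @property
    def value(self) -> Optional[str]:
        if not self._resolved:
            self._value = self._stream.get(self._index)
            self._resolved = True
        return self._value


# Collecting references (as done while reading the xref-streams) loads nothing.
stream = ObjectStream(lambda: {0: "<< /Type /Page >>", 1: "<< /Type /Font >>"})
page_ref = ReferenceToCompressedObject(10, stream, 0)
font_ref = ReferenceToCompressedObject(11, stream, 1)
loads_before_access = stream.load_count

# The first access decodes the stream; later accesses reuse the decoded objects.
page = page_ref.value
font = font_ref.value
loads_after_access = stream.load_count
```

Because the subclass exposes the same value property as its base class, calling code in this sketch does not need to know whether an object came from an object-stream, which mirrors the stated goal of not handling such objects in a special way.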

I have not found any issues so far when running automated tests with these changes against ~1000 PDF files (testing page import).

Note: The PR also includes some minor tweaks not directly related to object loading that I think are helpful, such as reporting the position within a document where an unexpected token was encountered during parsing.

packdat, Feb 01 '24 21:02