PyPDF4
Merge PDF/A files into a combined PDF/A
When PdfFileMerger merges PDF/A files, the output loses the PDF/A identification metadata and its PDF version is reset to 1.3.
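One way to observe the version reset is to read the header of the merged file. This is a minimal standard-library sketch, not part of PyPDF4; the function name `pdf_header_version` is made up for illustration. From PDF 1.4 on, a `/Version` entry in the document catalog can override the header, and this sketch does not inspect that.

```python
import re


def pdf_header_version(data: bytes):
    """Return the version from a '%PDF-x.y' header, or None if absent.

    Only the first 1024 bytes are searched, since the header is expected
    at (or very near) the start of the file.
    """
    match = re.search(rb"%PDF-(\d\.\d)", data[:1024])
    return match.group(1).decode("ascii") if match else None


# Hypothetical usage (file name is an assumption):
#   with open("merged.pdf", "rb") as fh:
#       print(pdf_header_version(fh.read(1024)))
```

Comparing the result for each input file with the result for the merged file shows whether the version was changed.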
Example PDF/A information (XMP metadata):
<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
<pdfaid:part>1</pdfaid:part>
<pdfaid:conformance>A</pdfaid:conformance>
</rdf:Description>
<rdf:Description rdf:about=""
xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
<pdf:Producer>LibreOffice 6.1</pdf:Producer>
</rdf:Description>
<rdf:Description rdf:about=""
xmlns:xmp="http://ns.adobe.com/xap/1.0/">
<xmp:CreatorTool>Draw</xmp:CreatorTool>
<xmp:CreateDate>2019-04-03T06:18:04Z</xmp:CreateDate>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
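To check what a given XMP packet claims, the `pdfaid:part` and `pdfaid:conformance` values can be read with the standard library. The sketch below is illustrative only: `pdfa_claim` and `SAMPLE` are names invented here, and extracting the XMP packet from the PDF itself is assumed to have happened already.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID = "http://www.aiim.org/pdfa/ns/id/"

# Shortened copy of the example packet, used only for demonstration.
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
<pdfaid:part>1</pdfaid:part>
<pdfaid:conformance>A</pdfaid:conformance>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>"""


def pdfa_claim(xmp_xml: str):
    """Return (part, conformance) declared in an XMP packet, or None."""
    root = ET.fromstring(xmp_xml)
    for desc in root.iter(f"{{{RDF}}}Description"):
        # XMP allows simple properties as child elements or as attributes.
        part = desc.findtext(f"{{{PDFAID}}}part") or desc.get(f"{{{PDFAID}}}part")
        if part is None:
            continue
        conf = (desc.findtext(f"{{{PDFAID}}}conformance")
                or desc.get(f"{{{PDFAID}}}conformance"))
        return part.strip(), conf.strip() if conf else None
    return None


print(pdfa_claim(SAMPLE))
```

This only reports what the metadata declares; it says nothing about whether the file actually conforms.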
PDF/A is a standard for long-term preservation of documents in digital archives. The more general case would be support for XMP metadata (xmpmeta):
https://github.com/mstamy2/PyPDF2/issues/492
That way, people could at least add the PDF/A metadata themselves if they are confident that their output conforms to PDF/A.
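As a rough illustration of what "adding PDF/A metadata" would involve, the sketch below builds a minimal XMP packet carrying only the PDF/A identification. It uses the standard library alone; `build_pdfa_xmp` and `ALLOWED` are hypothetical names, PDF/A-4 is left out, and attaching the packet to the PDF catalog as its `/Metadata` stream is not shown because that depends on API support the library would first need to provide.

```python
# Conformance levels accepted per PDF/A part in this sketch (PDF/A-4 omitted).
ALLOWED = {1: "AB", 2: "ABU", 3: "ABU"}


def build_pdfa_xmp(part: int, conformance: str) -> bytes:
    """Return a UTF-8 XMP packet declaring the given PDF/A part and level."""
    level = conformance.upper()
    if part not in ALLOWED or len(level) != 1 or level not in ALLOWED[part]:
        raise ValueError(f"unsupported PDF/A level: {part}{conformance}")
    packet = (
        '<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?>\n'
        '<x:xmpmeta xmlns:x="adobe:ns:meta/">\n'
        '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">\n'
        '<rdf:Description rdf:about=""\n'
        ' xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">\n'
        f'<pdfaid:part>{part}</pdfaid:part>\n'
        f'<pdfaid:conformance>{level}</pdfaid:conformance>\n'
        '</rdf:Description>\n'
        '</rdf:RDF>\n'
        '</x:xmpmeta>\n'
        '<?xpacket end="w"?>'
    )
    return packet.encode("utf-8")
```

Declaring a level this way does not make a file conformant; it only records the claim.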
PDF/A forbids some features and requires others (depending on the PDF/A profile), and validation tools such as veraPDF can check whether a PDF conforms to PDF/A. A long-term goal might be to support creating proper PDF/A, that is, manipulating a PDF to make it conformant.