Question: How to write PDF trailer / XML metadata to a document?
I am trying to load a PDF, copy its pages to a new PDF, and also copy the XML metadata stream to the new document, but I cannot figure out how to do the last part. The FAQ shows how to modify an existing XML metadata stream: https://pymupdf.readthedocs.io/en/latest/faq.html#how-to-access-xml-metadata
(By the way, the code sample there needs updating: updatefStream should become updateStream.)
There is also a function to delete the stream: _delXmlMetadata.
But how does one add a stream to a document? I have tried the following:
import fitz
import sys

doc = fitz.open(sys.argv[1])
outpdf = fitz.open()
outfile = sys.argv[2]

for page in doc:
    outpdf.insertPDF(doc, from_page=page.number, to_page=page.number)

outpdf.setMetadata(doc.metadata)
print('Trailer:', doc._getTrailerString(False))

metaxref = doc._getXmlMetadataXref()
if metaxref > 0:
    print('Has XML metadata:', metaxref)
    obj = doc.xrefObject(metaxref)
    print('object:', obj)
    xmlmetadata = doc.xrefStream(metaxref)
    test = outpdf._getNewXref()
    outpdf.updateObject(test, obj)
    print('Writing:', xmlmetadata)
    outpdf.updateStream(test, xmlmetadata, new=True)

outpdf.save(outfile)
This does not work: the saved file has no XML metadata. (It was a long shot anyway to create a new xref and hope that writing the object and stream to it would somehow be enough.)
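One possible explanation, offered as a guess rather than a confirmed diagnosis: as far as I understand the PDF specification, a reader only finds the XMP metadata stream through the /Metadata entry of the document catalog, so a new xref that nothing points to would be an orphan. The sketch below is plain Python and does not call PyMuPDF at all; it only illustrates the two object definitions that I assume would have to exist in the output file. The helper names (metadata_stream_dict, link_metadata) and the sample catalog string are hypothetical, made up for illustration.

```python
import re


def metadata_stream_dict(length: int) -> str:
    """Dictionary source for an XMP metadata stream object.

    /Type /Metadata and /Subtype /XML are the entries the PDF
    specification lists for metadata streams.
    """
    return f"<</Type/Metadata/Subtype/XML/Length {length}>>"


def link_metadata(catalog: str, xref: int) -> str:
    """Return the catalog dictionary source with /Metadata pointing at xref.

    Hypothetical helper: it edits the dictionary as text and assumes
    'catalog' is the source of a single, well-formed PDF dictionary.
    """
    ref = f"/Metadata {xref} 0 R"
    # Replace an existing /Metadata reference if there is one ...
    updated, count = re.subn(r"/Metadata\s+\d+\s+\d+\s+R", ref, catalog)
    if count:
        return updated
    # ... otherwise insert the reference before the closing '>>'.
    end = catalog.rstrip().rfind(">>")
    if end < 0:
        raise ValueError("not a PDF dictionary")
    return catalog[:end] + ref + catalog[end:]


# Made-up catalog of a freshly created document and a placeholder XMP packet.
catalog = "<</Type/Catalog/Pages 1 0 R>>"
xml = b"<x:xmpmeta xmlns:x='adobe:ns:meta/'></x:xmpmeta>"

print(metadata_stream_dict(len(xml)))
print(link_metadata(catalog, 7))
```

If this reading is right, the attempt above only performs the first half (creating the stream object), and what is missing is a way to write the /Metadata reference into the catalog of the new document.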
Is this a missing feature, or did I miss something in the documentation?