Question: How to write PDF trailer / XML metadata to a document?

I am trying to load a PDF, copy its pages to another PDF, and also copy the XML metadata stream to the new PDF document. So far I have not been able to figure out how to do this. I see that there is a way to modify an existing XML metadata stream: https://pymupdf.readthedocs.io/en/latest/faq.html#how-to-access-xml-metadata

(By the way, the code sample there needs updating: updatefStream should become updateStream.)

I also see that there is a function to delete the stream: _delXmlMetadata.

But how does one add a stream to a document? I have tried this, but it doesn't work:

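import sys

import fitz  # PyMuPDF

doc = fitz.open(sys.argv[1])  # source PDF
outpdf = fitz.open()          # new, empty PDF
outfile = sys.argv[2]

# Copy the pages one at a time into the new document.
for page in doc:
    outpdf.insertPDF(doc, from_page=page.number, to_page=page.number)

# Copy the standard (Info dictionary) metadata.
outpdf.setMetadata(doc.metadata)

print('Trailer:', doc._getTrailerString(False))
metaxref = doc._getXmlMetadataXref()

if metaxref > 0:
    print('Has XML metadata:', metaxref)
    obj = doc.xrefObject(metaxref)          # object definition of the metadata stream
    print('object:', obj)
    xmlmetadata = doc.xrefStream(metaxref)  # decompressed stream content

    # Attempt: create a new object in the output PDF and write
    # the copied definition and stream content to it.
    test = outpdf._getNewXref()
    outpdf.updateObject(test, obj)
    print('Writing:', xmlmetadata)
    outpdf.updateStream(test, xmlmetadata, new=True)

outpdf.save(outfile)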

That doesn't seem to work. It was a long shot anyway: I created a new xref and hoped that writing the object definition and the stream to it would somehow be enough.
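My best guess at what is missing: in a PDF, the XML metadata stream is found through a /Metadata entry in the document catalog, and nothing in my code makes the output catalog point at the new object. As a rough sketch of that missing step, the helper below (with_metadata_ref is a name I made up, not part of PyMuPDF) only does the string editing on a catalog definition. It assumes the catalog source is a plain `<< ... >>` dictionary like the ones xrefObject prints.

```python
import re


def with_metadata_ref(catalog_src: str, meta_xref: int) -> str:
    """Return the catalog dictionary source with /Metadata pointing at meta_xref.

    Hypothetical helper: pure string manipulation, no PyMuPDF calls.
    """
    ref = f"/Metadata {meta_xref} 0 R"
    # If the catalog already has a /Metadata reference, replace it.
    pattern = re.compile(r"/Metadata\s+\d+\s+\d+\s+R")
    if pattern.search(catalog_src):
        return pattern.sub(ref, catalog_src, count=1)
    # Otherwise insert the key just before the final closing ">>".
    end = catalog_src.rstrip().rfind(">>")
    if end < 0:
        raise ValueError("catalog source is not a PDF dictionary")
    return catalog_src[:end] + ref + catalog_src[end:]


if __name__ == "__main__":
    print(with_metadata_ref("<</Type/Catalog/Pages 1 0 R>>", 7))
```

If this guess is right, the edited definition would still have to be written back to the catalog object of the output PDF, presumably with updateObject. I have not found a documented way to get the catalog's xref number, so that part is untested.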

Is this a missing feature, or did I miss something in the documentation?