Memory growth when inserting image pages following initial insert_pdf

This is quite possibly a dupe of #1351, but I foolishly did most of my testing before checking here, so I'll file it with my MWE. If that is not helpful, feel free to close as a dupe of #1351.

I have checked on 1.18.17 and 1.19.2. I have not yet checked older versions.

I have a script that uses insert_pdf to put a PDF cover page into a new document. Then various PNG files are inserted, one per page. The more PNGs I add per PDF, or the more PDFs I generate in total, the larger the memory footprint grows, and the memory never seems to be garbage collected. This seems consistent with #1351.

Here's a MWE:

import fitz


def assemble(outname, coverfile, img_list):
    """Assemble a pdf from a coverfile and images."""

    ## MEMORY GROWTH
    doc = fitz.open()
    cover = fitz.open(coverfile)
    doc.insert_pdf(cover)
    # (these don't help)
    #cover.close()
    #cover = None

    ## INSTEAD, then no growth
    #doc = fitz.open(coverfile)

    for img_filename in img_list:
        page = doc.new_page()
        rect = page.bound()
        rect += [10, 10, -10, -10]
        page.insert_image(rect, filename=img_filename)
    doc.save(outname, deflate=True)
    # this explicit close DOES HELP!
    #doc.close()
    #doc = None


def main():
    img_list = ["map.png"]
    for n in range(0, 50):
        print(n)
        assemble(f"out{n}.pdf", "cover.pdf", img_list)


if __name__ == "__main__":
    main()

The MWE also needs a "cover.pdf" and a "map.png". I'll attach both, but I don't think it matters much what they are.
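To put numbers on the growth rather than eyeballing a process monitor, something like the following could wrap the MWE. It is only a rough sketch: the helper name `peak_rss_after_each_run` is made up for illustration, it relies on the standard-library `resource` module (so Unix-like systems only), and `ru_maxrss` is a high-water mark reported in kilobytes on Linux but bytes on macOS.

```python
import resource


def peak_rss_after_each_run(func, runs):
    """Call func(n) for n in range(runs) and record peak RSS after each call.

    Hypothetical helper, not part of PyMuPDF. The values are the process
    high-water mark, so they can only stay flat or increase.
    """
    peaks = []
    for n in range(runs):
        func(n)
        # ru_maxrss: kilobytes on Linux, bytes on macOS
        peaks.append(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
    return peaks
```

With the MWE loaded, `peak_rss_after_each_run(lambda n: assemble(f"out{n}.pdf", "cover.pdf", ["map.png"]), 50)` should return a list that keeps climbing if the growth is real and levels off quickly if it is not.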

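Two possible workarounds (both marked in the comments of the MWE):

1. Open the cover directly with doc = fitz.open(coverfile) instead of calling insert_pdf on an empty document.
2. Explicitly call doc.close() after doc.save().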
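For the explicit-close variant, a sketch along these lines might make the close harder to forget. It assumes only that the document objects expose `close()`, which `contextlib.closing` calls on exit. The function name `assemble_with_close` and the `open_doc` parameter are hypothetical, added so the sketch can be exercised without PyMuPDF installed. I have only observed that closing `doc` helps; closing `cover` as well is just tidiness here.

```python
from contextlib import closing


def assemble_with_close(outname, coverfile, img_list, open_doc=None):
    """Variant of assemble() that closes both documents when done."""
    if open_doc is None:
        import fitz  # PyMuPDF, imported lazily so the sketch loads without it
        open_doc = fitz.open
    # closing() calls .close() on each object, even if an exception is raised
    with closing(open_doc()) as doc, closing(open_doc(coverfile)) as cover:
        doc.insert_pdf(cover)
        for img_filename in img_list:
            page = doc.new_page()
            rect = page.bound()
            rect += [10, 10, -10, -10]
            page.insert_image(rect, filename=img_filename)
        doc.save(outname, deflate=True)
```

Calling `assemble_with_close(f"out{n}.pdf", "cover.pdf", img_list)` in place of `assemble(...)` in `main()` should behave like the MWE with the `doc.close()` line uncommented.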

Other comments:

map.png
cover.pdf