Memory growth when inserting image pages following initial insert_pdf
This is quite possibly a dupe of #1351, but I foolishly did most of my testing before checking here, so I'll file it with my MWE. If that is not helpful, feel free to close as a dupe of #1351.
I have checked on 1.18.17 and 1.19.2. I have not yet checked older versions.
I have a script that uses insert_pdf to put a PDF coverpage into a new document. Then various png files are inserted, one per page. The more pngs I add per PDF or the more PDFs total, the larger the memory footprint and it never seems to be garbage collected. This seems consistent with #1351.
Here's an MWE:
```python
import fitz


def assemble(outname, coverfile, img_list):
    """Assemble a pdf from a coverfile and images."""

    ## MEMORY GROWTH
    doc = fitz.open()
    cover = fitz.open(coverfile)
    doc.insert_pdf(cover)
    # (these don't help)
    #cover.close()
    #cover = None

    ## INSTEAD, then no growth
    #doc = fitz.open(coverfile)

    for img_filename in img_list:
        page = doc.new_page()
        rect = page.bound()
        rect += [10, 10, -10, -10]
        page.insert_image(rect, filename=img_filename)

    doc.save(outname, deflate=True)

    # this explicit close DOES HELP!
    #doc.close()
    #doc = None


def main():
    img_list = ["map.png"]
    for n in range(0, 50):
        print(n)
        assemble(f"out{n}.pdf", "cover.pdf", img_list)


if __name__ == "__main__":
    main()
```
The script also needs a "cover.pdf" and a "map.png". I'll attach them, but I don't think it matters much what they are.
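To put numbers on the growth, something like the following could be wrapped around the loop in `main()`. This is only a sketch: `peak_rss` and `track_growth` are names I made up for illustration, and it assumes a Unix-like system where the standard `resource` module is available (the unit of `ru_maxrss` differs between platforms, so only the trend is meaningful).

```python
import gc
import resource


def peak_rss():
    """Return the process's peak resident set size as reported by the OS."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def track_growth(task, repeats):
    """Call task(n) for n in range(repeats); record peak RSS after each call."""
    samples = []
    for n in range(repeats):
        task(n)
        # Force a collection so that merely uncollected Python garbage
        # is not mistaken for a leak.
        gc.collect()
        samples.append(peak_rss())
    return samples
```

With the MWE above, `track_growth(lambda n: assemble(f"out{n}.pdf", "cover.pdf", ["map.png"]), 50)` would give one sample per generated PDF. A series that keeps climbing matches the behaviour I'm describing, while a series that flattens out after the first few calls would not.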
Two possible workarounds:
- explicit `doc.close()` (although notably not closing `cover.pdf`).
- `doc = fitz.open(coverfile)` instead of starting a blank document.
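For reference, the first workaround could be folded into the function roughly as below. This is a sketch rather than a proposed fix: `assemble_and_close` is a hypothetical name, the `fitz` import sits inside the function only so that the snippet can be defined without PyMuPDF installed, and the `try`/`finally` is my own addition so the close also happens if saving fails.

```python
def assemble_and_close(outname, coverfile, img_list):
    """Assemble a PDF from a cover and images, then always close the documents."""
    import fitz  # PyMuPDF, imported lazily for the purposes of this sketch

    cover = fitz.open(coverfile)
    doc = fitz.open()
    try:
        doc.insert_pdf(cover)
        for img_filename in img_list:
            page = doc.new_page()
            rect = page.bound()
            rect += [10, 10, -10, -10]  # leave a 10pt margin on each side
            page.insert_image(rect, filename=img_filename)
        doc.save(outname, deflate=True)
    finally:
        # Closing the output document is the step that stopped the growth
        # in my tests.
        doc.close()
        # Closing the cover on its own did not help, but it is tidy to do.
        cover.close()
```

Calling this in place of `assemble` in the MWE's loop should behave like uncommenting the `doc.close()` line there.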
Other comments:
- In my MWE, I use a single png file. If I use `img_list = ["map.png"]*10` (10 copies of the same png), one per page, I get the same memory use. But 10 different pngs would give significantly higher memory use.