Memory leaking in rewrite_images?

Description of the bug

I noticed memory building up in my application and traced it to the rewrite_images call: if I remove that call, everything is fine.
Am I missing some cleanup step? I tried pymupdf.TOOLS.store_shrink(100), but that didn't help.

How to reproduce the bug

import gc
from pathlib import Path

import pymupdf


def main() -> None:
    dpi_threshold: int = 150
    dpi_target: int = 100
    quality: int = 50

    input_pdf: Path = Path("input.pdf")
    output_pdf: Path = Path("output.pdf")

    for i in range(10):
        with pymupdf.open(input_pdf) as doc:
            doc.rewrite_images(
                dpi_threshold=dpi_threshold,
                dpi_target=dpi_target,
                quality=quality,
            )

            save_opts = {
                "garbage": 4,  # Maximum garbage collection
                "deflate": True,  # Use deflate compression
                "clean": True,  # Clean up redundant objects
                "pretty": False,  # Don't pretty-print (saves space)
                "ascii": False,  # Don't use ASCII encoding (saves space)
                "expand": 0,  # Don't expand content streams
                "linear": False,  # Don't linearize (can increase size)
                "deflate_images": True,  # Compress images
                "deflate_fonts": True,  # Compress fonts
                "use_objstms": True,  # Use object streams for better compression
                "compression_effort": True,  # Use maximum compression effort
            }

            doc.save(
                output_pdf,
                **save_opts,
            )
            print(f"Done {i}")

    gc.collect()
    pymupdf.TOOLS.store_shrink(1000)
    print("Cleaned")


if __name__ == "__main__":
    main()

My test PDF has 6 pages with full screen images.
I created a memray graph: memray-flamegraph-mem_test.py.99494.html
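
As a rough cross-check that does not depend on memray, a sketch like the following could log the process's peak RSS after each loop iteration. It uses only the standard library. `peak_rss_mib` and `track_peak_rss` are hypothetical helper names, not PyMuPDF APIs, the workload shown is a stand-in rather than the real rewrite_images call, and the `resource` module is only available on Unix-like systems.

```python
import resource
import sys
from typing import Callable


def peak_rss_mib() -> float:
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def track_peak_rss(step: Callable[[int], None], iterations: int) -> list[float]:
    """Call step(i) for each iteration and record peak RSS after every call."""
    samples: list[float] = []
    for i in range(iterations):
        step(i)
        samples.append(peak_rss_mib())
    return samples


if __name__ == "__main__":
    # Stand-in workload (assumption): it deliberately retains 8 MiB per call.
    # For the real check, replace it with the open/rewrite_images/save body.
    retained: list[bytes] = []
    samples = track_peak_rss(
        lambda i: retained.append(b"x" * (8 * 1024 * 1024)), 5
    )
    for i, mib in enumerate(samples):
        print(f"iteration {i}: peak RSS {mib:.1f} MiB")
```

Because this reports a high-water mark, it can only show growth, never a release. A peak that keeps climbing across iterations with rewrite_images in the loop, but levels off without it, would be consistent with the behaviour described above.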

PyMuPDF version

1.26.4

Operating system

macOS

Python version

3.13