pymupdf.mupdf.FzErrorFormat: code=7: cannot find object in xref error encountered after version 1.25.3

Description of the bug

Hi Team. I have written a script that, based on some criteria, removes text, overlapping images, and vector graphics from a PDF.
Two days ago, we upgraded PyMuPDF on our server from 1.25.3 to 1.25.4. Today, the following exception was raised while processing a PDF file:

[ERROR] 2025-04-01 14:58:43 Example.pdf - sample Traceback (most recent call last):
  File "/Users/user/Documents/sample.py", line 45, in sample_func
    doc.ez_save(dst_pdf)
  File "/Users/user/miniconda3/envs/test4_env/lib/python3.10/site-packages/pymupdf/__init__.py", line 4223, in ez_save
    return self.save(
  File "/Users/user/miniconda3/envs/test4_env/lib/python3.10/site-packages/pymupdf/__init__.py", line 5584, in save
    mupdf.pdf_write_document(pdf, out, opts)
  File "/Users/user/miniconda3/envs/test4_env/lib/python3.10/site-packages/pymupdf/mupdf.py", line 53942, in pdf_write_document
    return _mupdf.pdf_write_document(doc, out, opts)
pymupdf.mupdf.FzErrorFormat: code=7: cannot find object in xref (21 0 R)

I then saw that a newer version of PyMuPDF, 1.25.5, had been released. I upgraded the server to that version, but the error persisted. I also tried different save parameters and values (an example below), and the same error occurred.

# doc.ez_save(dst_pdf)
doc.save(dst_pdf, garbage=4, clean=True, deflate=True, use_objstms=1)
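
To compare save parameters more systematically, something like the following sketch could help. It is not part of my original script: `try_save_variants` and `SAVE_VARIANTS` are hypothetical names, the option sets are only illustrative, and it assumes the MuPDF error derives from `Exception`.

```python
from io import BytesIO

# Illustrative option sets. The keys are Document.save() parameters;
# the variant names and groupings are hypothetical.
SAVE_VARIANTS = {
    "defaults": {},
    "garbage_only": {"garbage": 4},
    "issue_example": {"garbage": 4, "clean": True, "deflate": True, "use_objstms": 1},
}


def try_save_variants(save, variants=SAVE_VARIANTS):
    """Call save(buffer, **opts) once per variant and record the outcome."""
    results = {}
    for name, opts in variants.items():
        try:
            # Each attempt writes into a fresh in-memory buffer.
            save(BytesIO(), **opts)
        except Exception as exc:  # assumption: the MuPDF error is an Exception subclass
            results[name] = f"{type(exc).__name__}: {exc}"
        else:
            results[name] = "ok"
    return results
```

It would be called as `try_save_variants(doc.save)` after the redactions have been applied, and the returned dict shows which combinations raise.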

I then downgraded to 1.25.3. With that version, the script printed the following error message, but no exception was raised and the file was saved successfully:

MuPDF error: format error: cannot find object in xref (21 0 R)
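
As a stopgap, the version-dependent behaviour could be detected with a small guard like the sketch below. The helper names are hypothetical, and the threshold reflects only what I observed in this report (1.25.3 saves, 1.25.4 and 1.25.5 raise); I have not tested other versions.

```python
def parse_version(version):
    """Turn '1.25.4' into (1, 25, 4). Non-numeric suffixes are not handled."""
    return tuple(int(part) for part in version.split("."))


# Last version observed in this report to save the file without raising.
LAST_NON_RAISING = (1, 25, 3)


def save_may_raise(version):
    """Return True for versions newer than the last one observed to save cleanly."""
    return parse_version(version) > LAST_NON_RAISING
```

The installed version can be read with `importlib.metadata.version("PyMuPDF")` and passed to `save_may_raise`.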

How to reproduce the bug

Below is an example script for reference. I have modified it from the original while keeping the overall logic the same; it uses all the PyMuPDF methods that the original script uses.

import logging
from io import BytesIO
import fitz

logger = logging.getLogger(__name__)

TARGET_TEXT = "xyz"

def sample_func(src_pdf):
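    if isinstance(src_pdf, BytesIO):
        # src_pdf is an in-memory PDF (BytesIO)
        src_pdf.seek(0)
        doc = fitz.open(stream=src_pdf, filetype="pdf")
    elif isinstance(src_pdf, str):
        # src_pdf is a file path
        doc = fitz.open(src_pdf)
    else:
        # Fail early instead of hitting an unbound `doc` below
        raise TypeError(f"unsupported src_pdf type: {type(src_pdf).__name__}")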

    for page_num in range(len(doc)):
        # Load the page
        page = doc.load_page(page_num)
        logger.info(f"page_num: {page_num + 1}")

        text_blocks = page.get_text("dict")["blocks"]
        for block in text_blocks:
            if block["type"] == 0:  # text block
                for line in block["lines"]:
                    for span in line["spans"]:
                        text_rect = fitz.Rect(span["bbox"])
                        logger.debug(f"span: {span}")

                        # Extract text within the specified rectangle
                        text = page.get_text("text", clip=text_rect).strip()

                        if text == TARGET_TEXT:
                            # Create redaction annotation
                            redact_annot = page.add_redact_annot(text_rect)

                            # images=2 blanks out overlapping pixels
                            # graphics=2 removes any overlapping vector graphics
                            # text=0 removes all characters whose bounding box overlaps any redaction rectangle
                            page.apply_redactions(images=2, graphics=2, text=0)


    # Save the modified document
    dst_pdf = BytesIO()
    doc.ez_save(dst_pdf)

    doc.close()

    dst_pdf.seek(0)
    return dst_pdf.read()

Running the above script raises the following exception:

[ERROR] 2025-04-01 14:58:43 Example.pdf - sample Traceback (most recent call last):
  File "/Users/user/Documents/sample.py", line 45, in sample_func
    doc.ez_save(dst_pdf)
  File "/Users/user/miniconda3/envs/test4_env/lib/python3.10/site-packages/pymupdf/__init__.py", line 4223, in ez_save
    return self.save(
  File "/Users/user/miniconda3/envs/test4_env/lib/python3.10/site-packages/pymupdf/__init__.py", line 5584, in save
    mupdf.pdf_write_document(pdf, out, opts)
  File "/Users/user/miniconda3/envs/test4_env/lib/python3.10/site-packages/pymupdf/mupdf.py", line 53942, in pdf_write_document
    return _mupdf.pdf_write_document(doc, out, opts)
pymupdf.mupdf.FzErrorFormat: code=7: cannot find object in xref (21 0 R)
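
To narrow down which object still points at the missing `21 0 R` before saving, a scan along these lines might be useful. This is only a diagnostic sketch: `find_referrers` is a hypothetical helper, it relies on `Document.xref_length()` and `Document.xref_object()`, and it assumes references appear in the object source in the usual `21 0 R` form.

```python
import re


def find_referrers(doc, target_xref, generation=0):
    """Return (referrers, unreadable): xrefs whose source mentions the target
    reference, and xrefs whose source could not be read at all."""
    # The lookbehind keeps e.g. "121 0 R" from matching a search for "21 0 R".
    pattern = re.compile(rf"(?<!\d){target_xref} {generation} R\b")
    referrers = []
    unreadable = []
    # Valid xref numbers run from 1 to xref_length() - 1.
    for xref in range(1, doc.xref_length()):
        try:
            source = doc.xref_object(xref)
        except Exception:
            # Reading the object may itself fail on a damaged xref entry.
            unreadable.append(xref)
            continue
        if pattern.search(source):
            referrers.append(xref)
    return referrers, unreadable
```

Calling `find_referrers(doc, 21)` just before `doc.ez_save(dst_pdf)` lists the objects that still reference xref 21 after the redactions were applied.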

PyMuPDF version

1.25.5

Operating system

Linux

Python version

3.10