TypeError: in method 'delete_Pixmap', argument 1 of type 'struct Pixmap *'

Describe the bug

PyMuPDF fails with an exception which cannot be caught or traced back to its origin:

    Exception ignored in: <object repr() failed>
    Traceback (most recent call last):
      File "/home/user/lib/python3.6/site-packages/fitz/fitz.py", line 6977, in __del__
        self.__swig_destroy__(self)
    TypeError: in method 'delete_Pixmap', argument 1 of type 'struct Pixmap *'

This seems to be related to a broken(?) stencil mask:

    Traceback (most recent call last):
      File "/home/user/tmp/test.py", line 47, in test_file
        pixmap2 = fitz.Pixmap(document, stencil_mask)
      File "/home/user/lib/python3.6/site-packages/fitz/fitz.py", line 6546, in __init__
        this = _fitz.new_Pixmap(*args)
    RuntimeError: not an image
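
One possible reading of the two tracebacks, which I have not verified against the PyMuPDF sources, is that the constructor raises before the wrapper holds a valid native pointer, and the destructor then fails on the half-built object. The sketch below reproduces that pattern in pure Python. `FakePixmap` and `demo` are hypothetical names and do not use PyMuPDF at all; `sys.unraisablehook` requires Python 3.8 or newer.

```python
import gc
import sys


class FakePixmap:
    """Hypothetical stand-in for a SWIG-wrapped object; not PyMuPDF code."""

    def __init__(self, valid):
        if not valid:
            # Fail before the native handle is stored, like "not an image".
            raise RuntimeError("not an image")
        self.handle = object()

    def __del__(self):
        # Python still finalizes an instance whose __init__ raised.
        if not hasattr(self, "handle"):
            raise TypeError("no valid native handle to destroy")


def demo():
    """Return the exception type names seen by `except` and by the hook."""
    caught, unraisable = [], []
    previous_hook = sys.unraisablehook  # requires Python 3.8 or newer
    sys.unraisablehook = lambda info: unraisable.append(info.exc_type.__name__)
    try:
        try:
            FakePixmap(valid=False)
        except RuntimeError as error:
            caught.append(type(error).__name__)
        gc.collect()  # make sure the half-built instance is finalized
    finally:
        sys.unraisablehook = previous_hook
    return caught, unraisable


print(demo())
```

In this sketch the `RuntimeError` from the constructor is caught normally, while the `TypeError` from the destructor only reaches the unraisable hook (by default it is printed as "Exception ignored in"). That would match the observation that the first error cannot be caught with `try`/`except`.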

To Reproduce

In my case, I am trying to extract the images embedded inside a PDF file from within a Python unittest.TestCase:

    import fitz

    file_path = '/home/user/tmp/test_dataset/file.pdf'
    document = fitz.Document(file_path)
    for page_number in range(document.page_count):
        for image_index, image in enumerate(document.get_page_images(page_number)):
            xref = image[0]
            image_dict = document.extract_image(xref)
            if not image_dict:
                continue

            # The second entry of each image tuple is the xref of its mask (0 if none).
            stencil_mask = image[1]
            print(xref)

            pixmap = fitz.Pixmap(document, xref)
            if stencil_mask > 0:
                pixmap1 = fitz.Pixmap(pixmap)
                try:
                    # Move the following line out of the exception handling
                    # to get the second traceback in the previous section.
                    pixmap2 = fitz.Pixmap(document, stencil_mask)
                    pixmap1.setAlpha(pixmap2.samples)
                except RuntimeError as error:
                    print(error)
                else:
                    pixmap = pixmap1

            if pixmap.colorspace:
                pixmap = fitz.Pixmap(fitz.csRGB, pixmap)
            pixmap.save(f'/home/user/tmp/output_{xref}.png')
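
To narrow down whether the mask xref really points to an image object, a check like the following could be run before `fitz.Pixmap(document, stencil_mask)`. This is only a sketch: `is_image_xref` is a hypothetical helper, and it assumes a PyMuPDF version that provides `Document.xref_get_key` returning a `(type, value)` tuple such as `('name', '/Image')`. It does not address the destructor error itself.

```python
def is_image_xref(document, xref):
    """Return True if the object at `xref` declares /Subtype /Image.

    `document` is assumed to offer PyMuPDF's Document.xref_get_key(xref, key).
    """
    if xref <= 0:
        # 0 means "no mask" in the image tuples used above.
        return False
    try:
        kind, value = document.xref_get_key(xref, "Subtype")
    except (ValueError, RuntimeError):
        # Invalid xref numbers or unreadable objects are treated as "not an image".
        return False
    return kind == "name" and value == "/Image"
```

In the loop above, the mask handling could then be skipped whenever `is_image_xref(document, stencil_mask)` is false, so that the failing constructor call is never reached.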

Plain MuPDF does not seem to fail, although image-0032.png is mirrored horizontally:

    user@host:~/tmp/test_dataset> ~/tmp/mupdf-1.19.0-source/build/release/mutool info file.pdf
    file.pdf:

    PDF-1.4
    Info object (37 0 R):
    <</Creator(TextMaker)/CreationDate(D:20200531120726)/Title<FEFF0041006E0072006500640065>/Author<FEFF004300610072006F006C0061>/Producer(TextMaker)>>
    Pages: 1

    Retrieving info from pages 1-1...
    Mediaboxes (1):
        1	(34 0 R):	[ 0 0 595.3 841.9 ]

    Fonts (4):
        1	(34 0 R):	Type0 'ComicSansMS' Identity-H (15 0 R)
        1	(34 0 R):	TrueType 'ComicSansMS' WinAnsiEncoding (19 0 R)
        1	(34 0 R):	Type0 'ComicSansMS-Bold' Identity-H (25 0 R)
        1	(34 0 R):	TrueType 'ComicSansMS-Bold' WinAnsiEncoding (29 0 R)

    Images (1):
        1	(34 0 R):	[ Flate ] 731x262 8bpc DevRGB (32 0 R)

    user@host:~/tmp/test_dataset> ~/tmp/mupdf-1.19.0-source/build/release/mutool extract file.pdf
    extracting font-0012.ttf
    extracting font-0013.ttf
    extracting font-0022.ttf
    extracting font-0023.ttf
    extracting image-0032.png

The PDF file has been generated by another person. As it contains confidential information, I cannot share it here.

Expected behavior

PyMuPDF should behave like MuPDF and not raise an error for this file. If an error does occur, it should surface as a regular exception that can be caught and handled by the calling code.

Your configuration