TypeError: in method 'delete_Pixmap', argument 1 of type 'struct Pixmap *'
Describe the bug
PyMuPDF fails with an exception which cannot be caught or traced back to its origin:
Exception ignored in: <object repr() failed>
Traceback (most recent call last):
  File "/home/user/lib/python3.6/site-packages/fitz/fitz.py", line 6977, in __del__
    self.__swig_destroy__(self)
TypeError: in method 'delete_Pixmap', argument 1 of type 'struct Pixmap *'
This seems to be related to a broken(?) stencil mask:
Traceback (most recent call last):
  File "/home/user/tmp/test.py", line 47, in test_file
    pixmap2 = fitz.Pixmap(document, stencil_mask)
  File "/home/user/lib/python3.6/site-packages/fitz/fitz.py", line 6546, in __init__
    this = _fitz.new_Pixmap(*args)
RuntimeError: not an image
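A likely mechanism for the uncatchable message, sketched here with a hypothetical `FragileHandle` class rather than PyMuPDF's real wrapper: when `__init__` raises, the half-built instance still exists and Python later calls its `__del__`. An exception raised there has no caller to propagate to, so the interpreter only reports it as ignored. The sketch assumes CPython 3.8 or newer for `sys.unraisablehook`, which is newer than the Python 3.6 used in this report. Whether this is what actually happens inside PyMuPDF is an assumption.

```python
import sys


class FragileHandle:
    """Hypothetical stand-in for a native wrapper; not PyMuPDF code."""

    def __init__(self, valid):
        if not valid:
            # Fails before the native handle is stored on the instance.
            raise RuntimeError("not an image")
        self.handle = object()

    def __del__(self):
        # Python calls this even for instances whose __init__ raised.
        if not hasattr(self, "handle"):
            raise TypeError("destructor called without a valid handle")


def run_demo():
    """Return the caught message and the names of ignored exception types."""
    ignored = []
    previous_hook = sys.unraisablehook
    # Record exceptions that Python would otherwise only print to stderr.
    sys.unraisablehook = lambda info: ignored.append(info.exc_type.__name__)
    try:
        try:
            FragileHandle(valid=False)
        except RuntimeError as error:
            caught = str(error)  # the constructor error is catchable
        # Leaving the except block releases the traceback and, with it, the
        # half-built instance; its __del__ then raises with no caller left
        # to catch the exception.
    finally:
        sys.unraisablehook = previous_hook
    return caught, ignored
```

Calling `run_demo()` returns the caught `RuntimeError` message together with the type name of the exception raised in `__del__`, which never reaches the `except` clause.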
To Reproduce
In my case, I am trying to extract the images embedded inside a PDF file from within a Python unittest.TestCase:
import fitz

file_path = '/home/user/tmp/test_dataset/file.pdf'
document = fitz.Document(file_path)
for page_number in range(document.page_count):
    for image_index, image in enumerate(document.get_page_images(page_number)):
        xref = image[0]
        image_dict = document.extract_image(xref)
        if not image_dict:
            continue
        xref = image[0]
        stencil_mask = image[1]
        print(xref)
        pixmap = fitz.Pixmap(document, xref)
        if stencil_mask > 0:
            pixmap1 = fitz.Pixmap(pixmap)
            try:
                # Move the following line out of the exception handling
                # to get the second traceback in the previous section.
                pixmap2 = fitz.Pixmap(document, stencil_mask)
                pixmap1.setAlpha(pixmap2.samples)
            except RuntimeError as error:
                print(error)
            else:
                pixmap = pixmap1
        if pixmap.colorspace:
            pixmap = fitz.Pixmap(fitz.csRGB, pixmap)
        pixmap.save(f'/home/user/tmp/output_{xref}.png')

Plain MuPDF does not seem to fail, although image-0032.png is mirrored horizontally:
user@host:~/tmp/test_dataset> ~/tmp/mupdf-1.19.0-source/build/release/mutool info file.pdf
file.pdf:
PDF-1.4
Info object (37 0 R):
<</Creator(TextMaker)/CreationDate(D:20200531120726)/Title<FEFF0041006E0072006500640065>/Author<FEFF004300610072006F006C0061>/Producer(TextMaker)>>
Pages: 1
Retrieving info from pages 1-1...
Mediaboxes (1):
1 (34 0 R): [ 0 0 595.3 841.9 ]
Fonts (4):
1 (34 0 R): Type0 'ComicSansMS' Identity-H (15 0 R)
1 (34 0 R): TrueType 'ComicSansMS' WinAnsiEncoding (19 0 R)
1 (34 0 R): Type0 'ComicSansMS-Bold' Identity-H (25 0 R)
1 (34 0 R): TrueType 'ComicSansMS-Bold' WinAnsiEncoding (29 0 R)
Images (1):
1 (34 0 R): [ Flate ] 731x262 8bpc DevRGB (32 0 R)
user@host:~/tmp/test_dataset> ~/tmp/mupdf-1.19.0-source/build/release/mutool extract file.pdf
extracting font-0012.ttf
extracting font-0013.ttf
extracting font-0022.ttf
extracting font-0023.ttf
extracting image-0032.png
The PDF file has been generated by another person. As it contains confidential information, I cannot share it here.
Expected behavior
PyMuPDF should behave like MuPDF and not raise an error here. If an error does occur, it should surface as a regular exception that can be caught and handled as usual.
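One way to get this behavior, sketched with a hypothetical `SafeHandle` class and not with PyMuPDF's generated bindings, is to set a sentinel before any step that can fail and let `__del__` skip cleanup while the sentinel is still unset. Whether the real wrapper can be changed this way is an assumption. The `released` counter exists only to make the effect observable.

```python
class SafeHandle:
    """Hypothetical wrapper; none of these names come from PyMuPDF."""

    released = 0  # counts simulated native releases, for illustration only

    def __init__(self, valid):
        self._handle = None  # sentinel set before anything can fail
        if not valid:
            raise RuntimeError("not an image")
        self._handle = object()  # stands in for a native resource

    def __del__(self):
        # getattr also covers a failure before the sentinel was assigned.
        if getattr(self, "_handle", None) is None:
            return  # nothing was acquired, so there is nothing to release
        type(self).released += 1
        self._handle = None


def try_open(valid):
    """Return the error message, or None when construction succeeds."""
    try:
        SafeHandle(valid)
    except RuntimeError as error:
        return str(error)
    return None
```

With this guard, a failed construction produces only the catchable `RuntimeError`, and the destructor stays silent for the half-built instance.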
Your configuration
- OpenSUSE Leap 15.3, x86_64
- Python 3.6, 64 bit
- PyMuPDF 1.19.4, built from source and statically linked with MuPDF 1.19.0-rc2.