Installation - PyMuPDF documentation

Requirements

All the examples below assume that you are running inside a Python virtual environment. See: https://docs.python.org/3/library/venv.html for details. We also assume that pip is up to date.

For example:
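Both assumptions can also be checked from within Python. The following is a minimal sketch, not part of PyMuPDF: the helper names `in_virtual_environment` and `installed_pip_version` are hypothetical, and the check relies only on the standard library.

```python
import sys
from importlib import metadata


def in_virtual_environment() -> bool:
    """Return True if the interpreter runs inside a venv-style environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def installed_pip_version():
    """Return the installed pip version string, or None if pip is not found."""
    try:
        return metadata.version("pip")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("virtual environment:", in_virtual_environment())
    print("pip version:", installed_pip_version())
```

This only reports the installed pip version; it does not tell you whether a newer pip release exists.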

Installation

PyMuPDF should be installed using pip with:

pip install --upgrade pymupdf

This will install from a Python wheel if one is available for your platform.
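After installing, a quick way to confirm that the package is visible to the current interpreter is to look up its import name without importing it. This is a minimal sketch with a hypothetical helper, `find_pymupdf_module`. It assumes the import name is `pymupdf`, with `fitz` as the legacy name used by older releases. An unrelated package that also provides a `fitz` module would give a false positive.

```python
import importlib.util


def find_pymupdf_module():
    """Return the name of an importable PyMuPDF module, or None."""
    # Assumption: "pymupdf" is the current import name and "fitz" is the
    # legacy one. find_spec() locates a top-level module without importing it.
    for name in ("pymupdf", "fitz"):
        if importlib.util.find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    print("PyMuPDF module found:", find_pymupdf_module())
```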

Installation when a suitable wheel is not available

If a suitable Python wheel is not available, pip will automatically build from source using a Python sdist.

This requires C/C++ development tools to be installed:

The build will automatically download and build MuPDF.

Problems after installation

Notes

Build and install from a local PyMuPDF source tree

Initial setup:

Then one can build PyMuPDF in two ways:

Also, one can build for different Python versions in the same PyMuPDF tree:

Running tests

Having a PyMuPDF tree available allows one to run PyMuPDF’s pytest test suite:

pip install pytest fontTools
pytest PyMuPDF/tests

Notes about using a non-default MuPDF

Using a non-default build of MuPDF by setting the environment variable PYMUPDF_SETUP_MUPDF_BUILD can cause various things to go wrong and so is not generally supported:

Official PyMuPDF Linux wheels may not install on older Linux systems

Official wheels for PyMuPDF releases can be incompatible with older Linux systems.

For example, as of 2025-09-03, pip install pymupdf does not work on some AWS Lambda systems; see https://github.com/pymupdf/PyMuPDF/discussions/4631.

This is because official PyMuPDF Linux wheels are built with a version of glibc determined by the current Python manylinux environment. These wheels are incompatible with Linux systems that have an older glibc.

The official Python manylinux environment is updated periodically to use newer glibc versions, so new releases of PyMuPDF become increasingly incompatible with older Linux systems.

There is nothing that can be done about this, other than updating older Linux systems, or building PyMuPDF locally from source.
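To see which glibc a given system provides, one can query the standard library. The sketch below is an illustration only: `glibc_version` and `meets_minimum` are hypothetical helpers, and the minimum version `(2, 28)` is an arbitrary example value, not the requirement of any particular PyMuPDF release.

```python
import platform


def glibc_version():
    """Return the glibc version as a tuple of ints, or None if not on glibc."""
    # platform.libc_ver() inspects the running Python executable and
    # returns a (library, version) pair, e.g. ("glibc", "2.35").
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        return None
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def meets_minimum(required, found):
    """Return True if the found glibc version is at least the required one."""
    return found is not None and found >= required


if __name__ == "__main__":
    found = glibc_version()
    # (2, 28) is an illustrative minimum only (assumption, see lead-in).
    print("glibc:", found, "meets (2, 28):", meets_minimum((2, 28), found))
```

On systems that do not use glibc (for example macOS, Windows, or musl-based Linux), `glibc_version()` is expected to return `None`.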

For more details, please see: Python Packaging Authority.

Packaging

See Packaging for Linux distributions.

Using with Pyodide

See Pyodide.

Enabling Integrated OCR Support

PyMuPDF already contains all the logic needed for its OCR functions, but it additionally needs Tesseract’s language support data.

If not specified explicitly, PyMuPDF will attempt to find the installed Tesseract’s tessdata, but this should probably not be relied upon.

Otherwise, PyMuPDF requires that Tesseract’s language support folder be specified explicitly, either in the tessdata argument of PyMuPDF OCR functions or in os.environ["TESSDATA_PREFIX"].

For working OCR functionality, make sure to complete this checklist:

  1. Locate Tesseract’s language support folder. Typically you will find it here:
    • Windows: C:/Program Files/Tesseract-OCR/tessdata
    • Unix systems: /usr/share/tesseract-ocr/4.00/tessdata
  2. Specify the language support folder when calling PyMuPDF OCR functions:
    • Set the tessdata argument.
    • Or set os.environ["TESSDATA_PREFIX"] from within Python.
    • Or set environment variable TESSDATA_PREFIX before running Python, for example:
      * Windows: setx TESSDATA_PREFIX "C:/Program Files/Tesseract-OCR/tessdata"
      * Unix systems: declare -x TESSDATA_PREFIX=/usr/share/tesseract-ocr/4.00/tessdata
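The checklist above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `configure_tessdata` is a hypothetical helper, not a PyMuPDF function, and the candidate folders are simply the typical locations listed above, which may not match your installation.

```python
import os
from pathlib import Path

# Typical locations from the checklist; adjust for your system (assumption).
CANDIDATE_TESSDATA_DIRS = (
    "C:/Program Files/Tesseract-OCR/tessdata",
    "/usr/share/tesseract-ocr/4.00/tessdata",
)


def configure_tessdata(candidates=CANDIDATE_TESSDATA_DIRS):
    """Set TESSDATA_PREFIX to the first existing candidate folder.

    Returns the folder in effect as a string, or None if none was found.
    A TESSDATA_PREFIX that is already set is left untouched.
    """
    existing = os.environ.get("TESSDATA_PREFIX")
    if existing:
        return existing
    for candidate in candidates:
        if Path(candidate).is_dir():
            # Step 2 of the checklist: set the variable from within Python.
            os.environ["TESSDATA_PREFIX"] = str(candidate)
            return str(candidate)
    return None


if __name__ == "__main__":
    print("tessdata folder:", configure_tessdata())
```

The returned folder can alternatively be passed as the tessdata argument when calling PyMuPDF OCR functions, which avoids depending on the process environment.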

Note

English language support is included by default in a Tesseract installation.

Tesseract Language Packs for other languages must be installed separately, and the tessdata folder must be specified to PyMuPDF as described above, for OCR to work with those languages.