New 'rebased' implementation of PyMuPDF · pymupdf/PyMuPDF · Discussion #2680
New 'rebased' implementation of PyMuPDF
Overview
We are migrating PyMuPDF to a new 'rebased' implementation which uses the MuPDF Python and C++ APIs, instead of the MuPDF C API used by the original 'classic' implementation of PyMuPDF.
The rebased implementation will behave identically to the classic implementation, and will not require any changes to user code.
Advantages of the rebased implementation compared to classic
- User access to the underlying MuPDF Python API.
  The MuPDF Python API will be available as `fitz.mupdf`; this is not possible with classic PyMuPDF, and can give useful flexibility to the user.
- Simplified implementation.
  The underlying MuPDF C++/Python APIs use automated reference counting, automatic contexts, and native C++ and Python exceptions, and this makes the rebased implementation simpler than the classic implementation.
  This also helps development of new PyMuPDF functionality.
- Optional tracing of MuPDF C function calls using environment variables.
  This is a feature of the MuPDF C++ and Python APIs, and can be very useful during development and when reporting bugs.
- Possible future support for multithreaded use.
  The classic implementation of PyMuPDF is explicitly single-threaded, but the MuPDF C++/Python APIs have full support for threads with automated per-thread contexts.
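Since the MuPDF Python API is exposed as `fitz.mupdf` only by the rebased implementation, checking for that attribute is one way to tell at runtime which implementation a script has picked up. The following is a minimal sketch under that assumption; the helper names `mupdf_api` and `describe` are hypothetical and not part of PyMuPDF, and the stand-in objects exist only so the example runs without PyMuPDF installed.

```python
from types import SimpleNamespace


def mupdf_api(pymupdf_module):
    """Return the module's `mupdf` attribute, or None if it has none.

    Assumption taken from the text above: only the rebased
    implementation exposes the MuPDF Python API as `<module>.mupdf`.
    """
    return getattr(pymupdf_module, "mupdf", None)


def describe(pymupdf_module):
    """Label a PyMuPDF-like module as 'rebased' or 'classic'."""
    return "rebased" if mupdf_api(pymupdf_module) is not None else "classic"


# Stand-ins for the real modules (hypothetical, for illustration only).
fake_classic = SimpleNamespace()
fake_rebased = SimpleNamespace(mupdf=SimpleNamespace())

print(describe(fake_classic))
print(describe(fake_rebased))
```

With PyMuPDF installed, the same check would be `describe(fitz)` after whichever import you are using.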
Migration to the rebased implementation
We will migrate to the new rebased implementation in the following stages:
- Stage 1: one or more releases containing two modules, `fitz` and `fitz_new`.
  - Default from `import fitz` is the classic implementation.
  - You can try the rebased implementation with `import fitz_new as fitz` (no other changes to your code are needed).
  - PyMuPDF-1.23.3 is the first release of stage 1.
- Stage 2: one or more releases containing two modules, `fitz` and `fitz_old`.
  - Default from `import fitz` is the rebased implementation.
  - Force use of the classic implementation with `import fitz_old as fitz`.
- Stage 3: subsequent releases will have module `fitz` only.
  - Default from `import fitz` is the rebased implementation.
  - The classic implementation is not available.
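Code that has to run across several of these stages could select a module by name instead of hard-coding one import. The sketch below is one possible approach, not an official PyMuPDF facility: `import_first_available` and `load_pymupdf` are hypothetical helpers, and the candidate module names are taken from the stage descriptions above.

```python
import importlib


def import_first_available(*module_names):
    """Import and return the first module in module_names that exists."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # Not installed in this release; try the next name.
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(module_names)
    )


def load_pymupdf(prefer="rebased"):
    """Pick a PyMuPDF module according to the staged migration.

    prefer="rebased": fitz_new (stage 1), else fitz (stages 2 and 3).
    prefer="classic": fitz_old (stage 2), else fitz (stage 1).
    In stage 3 the classic implementation is not available, so
    prefer="classic" would return the rebased fitz module.
    """
    if prefer == "rebased":
        return import_first_available("fitz_new", "fitz")
    if prefer == "classic":
        return import_first_available("fitz_old", "fitz")
    raise ValueError("prefer must be 'rebased' or 'classic'")
```

A script would then start with `fitz = load_pymupdf("rebased")` in place of a plain `import fitz`, and leave the rest of its code unchanged.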
During stage 1 we would like you to try out the rebased implementation by using `import fitz_new as fitz`, and report any issues you come across.
When the rebased implementation is thought to work as well as the classic implementation, we will move to stage 2, where users will get the rebased implementation by default. If users come across problems with the rebased implementation in stage 2, they can revert to the classic implementation by using `import fitz_old as fitz`. It is important that users report any such problems so we can fix the rebased implementation.
Finally, when we are fully confident that the rebased implementation is working for all users, we will move to stage 3, where only the rebased implementation will be available.
Impact on users
Users of PyMuPDF will be able to carry on using PyMuPDF throughout the migration without making any changes to their code.
The rebased implementation passes PyMuPDF's test suite, but of course this doesn't check everything, so it is possible that some users will come across issues during the migration, especially at stage 2 where the rebased implementation becomes the default.
The best way to protect against this happening is to try out the rebased implementation in stage 1 by using `import fitz_new as fitz`, and report any problems so they can be fixed.
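One way to try out the rebased implementation is to run the same small check against both modules and compare the outcomes. The following is only a sketch of that idea: `compare_implementations` is a hypothetical helper, not part of PyMuPDF, and the stand-in objects replace the real modules so the example runs without PyMuPDF installed.

```python
from types import SimpleNamespace


def compare_implementations(check, classic, rebased):
    """Run check(module) on both modules and compare the outcomes.

    Returns (same, outcomes). outcomes maps "classic" and "rebased"
    to ("ok", value) or ("error", description), so a difference in
    raised exceptions is reported as well as a difference in values.
    """
    outcomes = {}
    for label, module in (("classic", classic), ("rebased", rebased)):
        try:
            outcomes[label] = ("ok", check(module))
        except Exception as exc:
            outcomes[label] = ("error", f"{type(exc).__name__}: {exc}")
    return outcomes["classic"] == outcomes["rebased"], outcomes


# Stand-ins for the two implementations (hypothetical, illustration only).
fake_classic = SimpleNamespace(page_count=lambda path: 3)
fake_rebased = SimpleNamespace(page_count=lambda path: 3)
fake_rebased_differs = SimpleNamespace(page_count=lambda path: 4)


def check(module):
    return module.page_count("input.pdf")


same, outcomes = compare_implementations(check, fake_classic, fake_rebased)
differs_same, differs_outcomes = compare_implementations(
    check, fake_classic, fake_rebased_differs
)
print(same, outcomes)
print(differs_same, differs_outcomes)
```

In stage 1, and assuming both modules can be imported in the same process, the real call might look like `compare_implementations(lambda m: m.open("input.pdf").page_count, fitz, fitz_new)`, where `input.pdf` is a placeholder path. Any mismatch is a good candidate for a bug report.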