Python Zip Imports: Distribute Modules and Packages Quickly

Python allows you to import code from ZIP files directly through Zip imports. This interesting built-in feature enables you to zip Python code for distribution purposes. Zip imports also help if you often work with Python code that comes in ZIP files. In either case, learning to create importable ZIP files and to import code from them will be a valuable skill.

Even if your day-to-day workflow doesn’t involve ZIP files containing Python code, you’ll still learn some fun and interesting new skills by exploring Zip imports through this tutorial.

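In this tutorial, you’ll learn what Zip imports are, how to create importable ZIP files with zipfile, and how to make those files available to Python’s import system.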

You’ll also learn how to use the zipimport module to dynamically import code from ZIP files without adding them to Python’s module search path. To do this, you’ll code a minimal plugin system that loads Python code from ZIP files.

To get the most out of this tutorial, you should have previous knowledge of how Python’s import system works. You should also know the basics of manipulating ZIP files with zipfile, working with files, and using the with statement.

Get to Know Python Zip Imports

Since Python 2.3, you can import modules and packages from inside ZIP files. This feature is known as Zip imports and is quite helpful when you need to distribute a complete package as a single file, which is its most common use case.

PEP 273 introduced Zip imports as a built-in feature. The Python community widely accepted the feature as a must-have because distributing several separate .py, .pyc, and .pyo files isn’t always appropriate or efficient.

Zip imports can simplify the process of sharing and distributing your code so that your colleagues and end users don’t have to fumble around trying to extract the files into the right location to get the code working.

PEP 302 added a series of import hooks that provides built-in support for Zip imports. If you want to import modules and packages from a ZIP file, then you just need the file to appear in Python’s module search path.

The module search path is a list of directories and ZIP files. It lives in sys.path. Python automatically searches through items in this list when you run an import statement in your code.

In the following sections, you’ll learn how to create ready-to-import ZIP files using different Python tools and techniques. You’ll also learn about a few ways to add those files to your current Python’s module search path. Finally, you’ll dig into zipimport, the module that supports the Zip import feature behind the scenes.

Create Your Own Importable ZIP Files

Zip imports allow you to quickly distribute code that’s organized across several modules and packages as a single file. Python has you covered when it comes to creating importable ZIP files. The zipfile module from the standard library includes a class called ZipFile for manipulating ZIP files. It also includes a more specialized class called PyZipFile, which facilitates the creation of importable ZIP files.

PyZipFile lets you bundle Python code into ZIP files quickly and efficiently. The class inherits from ZipFile, so it shares the same base interface. However, there are two main differences between these classes:

  1. The initializer of PyZipFile takes an optional argument called optimize, which allows you to optimize the Python code by compiling it to bytecode before archiving it.
  2. The PyZipFile class provides a method called .writepy(), which accepts a Python module or package as an argument and adds it to a target ZIP file.

If optimize is -1, its default value, then the input .py files are automatically compiled to .pyc files and then added to the target archive. Why does this happen? Packaging .pyc files rather than the original .py files makes the importing process way more efficient by skipping the compilation step. You’ll learn more about this topic in upcoming sections.

In the following two sections, you’ll get your hands dirty and start creating your own importable ZIP files containing modules and packages.

Bundle Python Modules Into ZIP Files

In this section, you’ll use PyZipFile.writepy() to compile a .py file down to bytecode and add the resulting .pyc file to a ZIP archive. To try .writepy() out, say that you have a hello.py module:
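A minimal version of such a module might look like the sketch below. The exact wording of the greeting is an assumption made for illustration. Only the greet() function and its name argument come from the description in this section:

```python
# hello.py

"""A tiny module used to experiment with Zip imports."""


def greet(name):
    """Print a friendly greeting for the given name."""
    # The exact wording of the message is an assumption for this sketch.
    print(f"Hello, {name}! Welcome to Zip imports!")
```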

This module defines a function called greet() that takes name as an argument and prints a friendly greeting message to the screen.

Now say that you want to package this module into a ZIP file that you can import later. To do this, you can run the following code:
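The packaging step can be sketched roughly as follows. To stay self-contained, this version writes an assumed hello.py into a temporary directory instead of your current working directory, so it leaves nothing behind. The zip_module name matches the one used in the explanation, while HELLO_SOURCE and the other variable names are placeholders:

```python
import tempfile
import zipfile
from pathlib import Path

# Assumed contents of hello.py, written out so the sketch is self-contained.
HELLO_SOURCE = 'def greet(name):\n    print(f"Hello, {name}!")\n'

with tempfile.TemporaryDirectory() as tmp_dir:
    module_path = Path(tmp_dir) / "hello.py"
    module_path.write_text(HELLO_SOURCE, encoding="utf-8")
    archive_path = Path(tmp_dir) / "hello.zip"

    # writepy() compiles hello.py and stores the bytecode in the archive.
    with zipfile.PyZipFile(str(archive_path), mode="w") as zip_module:
        zip_module.writepy(str(module_path))
        zip_module.printdir()  # Lists the archive's members on screen
        archived_names = zip_module.namelist()

print(archived_names)  # ['hello.pyc']
```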

After running this code, you’ll have a hello.zip file in your current working directory. The call to .writepy() on zip_module automatically compiles hello.py to hello.pyc and stores it in the underlying ZIP file, hello.zip. That’s why .printdir() displays hello.pyc instead of your original hello.py file. This automatic compilation ensures an efficient import process.

You can also manually package .py and .pyc files into a ZIP file by using any regular file archiver. If the resulting archive contains .py files without the corresponding .pyc files, then Python will compile them the first time you import from that specific ZIP file.

Python won’t modify the underlying ZIP file to add the newly compiled .pyc files. So the next time you run the import, Python will compile the code again. This behavior will make the import process slower.

You can also pass a directory as the first argument to .writepy(). If the input directory isn’t a Python package, then the method scans it for .py files, compiles them to .pyc files, and adds those .pyc files at the top level of the target ZIP file. The scanning step isn’t recursive, which means that subdirectories aren’t scanned for source files.

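You can tweak the compilation process further by setting the optimize argument of PyZipFile to one of the following values: -1 (the default, which follows the interpreter’s current optimization level), 0 (no optimization), 1 (removes assert statements), or 2 (removes assert statements and docstrings).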

With these values, you can fine-tune the level of optimization you want to use when .writepy() compiles your .py files to .pyc files before archiving them.
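To see the effect of the optimization level, you can compare the same module archived at two levels. The following sketch uses a hypothetical module with a docstring and a helper called docstring_after_zipping(), both invented for this illustration. It checks whether the docstring survives in the archived bytecode:

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path

# Hypothetical module with a docstring, used only for this demonstration.
SOURCE = (
    "def greet(name):\n"
    '    """Return a greeting."""\n'
    '    return f"Hello, {name}!"\n'
)


def docstring_after_zipping(optimize):
    """Archive the demo module at the given level and return greet.__doc__."""
    module_name = f"zip_demo_opt{optimize}"  # Unique module name per level
    with tempfile.TemporaryDirectory() as tmp_dir:
        source_path = Path(tmp_dir) / f"{module_name}.py"
        source_path.write_text(SOURCE, encoding="utf-8")
        archive_path = str(Path(tmp_dir) / f"{module_name}.zip")

        with zipfile.PyZipFile(archive_path, mode="w", optimize=optimize) as archive:
            archive.writepy(str(source_path))

        # Import the bytecode straight from the archive.
        sys.path.insert(0, archive_path)
        try:
            module = importlib.import_module(module_name)
        finally:
            sys.path.remove(archive_path)
        return module.greet.__doc__


doc_level_0 = docstring_after_zipping(0)
doc_level_2 = docstring_after_zipping(2)
print(repr(doc_level_0), repr(doc_level_2))  # 'Return a greeting.' None
```

Level 2 is worth a second thought if your users rely on help() or on tools that read docstrings, because that information is gone from the archived bytecode.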

So far, you’ve learned how to bundle one or more modules into a ZIP file. In your day-to-day coding, you might also need to zip a complete Python package. You’ll learn how to do that in the following section.

Bundle Python Packages Into ZIP Files

You can also bundle Python packages into ZIP files by using PyZipFile and its .writepy() method. As you already learned, if you pass a regular directory as the first argument to .writepy(), then the method scans the directory for .py files, compiles them, and adds the corresponding .pyc files to the resulting ZIP file.

On the other hand, if the input directory is a Python package, then .writepy() compiles all the .py files and adds them to the ZIP file, keeping the package’s internal structure.

To try .writepy() with a Python package, create a new hello/ directory and copy your hello.py file into it. Then add an empty __init__.py module to turn the directory into a package. You should end up with the following structure:

hello/
│
├── __init__.py
└── hello.py

Now suppose that you want to bundle this package into a ZIP file for distribution purposes. If that’s the case, then you can run the following code:
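A sketch of that packaging step could look like this. It builds the hello/ package in a temporary directory so that it runs on its own. The archive name hello_pkg.zip follows the name used later in this tutorial, and the module content is an assumption:

```python
import tempfile
import zipfile
from pathlib import Path

# Assumed contents of hello/hello.py for this self-contained sketch.
HELLO_SOURCE = 'def greet(name):\n    print(f"Hello, {name}!")\n'

with tempfile.TemporaryDirectory() as tmp_dir:
    package_dir = Path(tmp_dir) / "hello"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("", encoding="utf-8")
    (package_dir / "hello.py").write_text(HELLO_SOURCE, encoding="utf-8")
    archive_path = Path(tmp_dir) / "hello_pkg.zip"

    # Passing a package directory keeps the package structure in the archive.
    with zipfile.PyZipFile(str(archive_path), mode="w") as zip_pkg:
        zip_pkg.writepy(str(package_dir))
        package_names = zip_pkg.namelist()

print(package_names)  # ['hello/__init__.pyc', 'hello/hello.pyc']
```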

The call to .writepy() takes the hello package as an argument, searches for .py files inside it, compiles them to .pyc files, and finally adds them to the target ZIP file, keeping the same package structure.

Understand the Limitations of Zip Imports

When you use ZIP files to distribute Python code, you need to consider a few limitations of Zip imports:

You can include any type of file in your ZIP archives. However, when your users import code from these archives, only .py, .pyw, .pyc, and .pyo files are read. Importing code from dynamic files, such as .pyd, .dll, and .so, isn’t possible if they live in a ZIP file. For example, you can’t load shared libraries and extension modules written in C from ZIP archives.

You can work around this limitation by extracting dynamic modules from your ZIP files, writing them to the file system, and then loading their code. However, that means you need to create temporary files and deal with possible errors and security risks, which can complicate things.

Zip imports can also imply a performance compromise, as you learned earlier in this tutorial. If your archive contains .py modules, then Python will compile them to satisfy the imports. However, it won’t save the corresponding .pyc files. This behavior may reduce the performance of import operations.

Finally, if you need to import code from a compressed ZIP file, then zlib must be available in your working environment for decompression purposes. Importing code from compressed archives fails with a missing zlib message if this library isn’t available. Additionally, the decompression step adds extra performance overhead to the import process. For these reasons, you’ll use uncompressed ZIP files in this tutorial.

Import Python Code From ZIP Files

Up to this point, you’ve learned how to create your own importable ZIP files for distribution purposes. Now say that you’re at the other end, and you’re getting ZIP files with Python modules and packages. How can you import code from them? In this section, you’ll get answers to this question and learn how to make ZIP files available for importing their content.

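For Python to import code from a ZIP file, that file must be available in Python’s module search path, which is stored in sys.path. This module-level variable holds a list of strings specifying the search path for modules. The content of sys.path includes the directory containing the script that you’re running (or an empty string in an interactive session), the entries listed in the PYTHONPATH environment variable, and installation-dependent defaults such as the standard library and site-packages directories.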

The following table points out a few ways to add your ZIP files to sys.path:

| Option | Target Code or Interpreter |
| --- | --- |
| The list.insert(), list.append(), and list.extend() methods | The Python code that you’re writing and running |
| The PYTHONPATH environment variable | Every Python interpreter that you run on your system |
| A Python path configuration file, or .pth file | The Python interpreter that contains the .pth file |

In the following sections, you’ll explore these three ways to add items to sys.path so that you can make your ZIP files available for importing their content.

Use sys.path Dynamically for Zip Imports

Because sys.path is a list object, you can manipulate it from your Python code by using regular list methods. In general, to add new items to a list object, you can use .insert(), .append(), or .extend().

Typically, you’ll use .insert(0, item) to add new items to sys.path from your Python code. Calling .insert() this way inserts item at the beginning of the list, ensuring that your newly added item has precedence over the existing ones. Having item at the beginning enables you to shadow existing modules and packages when name collisions are possible.

Now say that you need to add the hello.zip file containing your hello.py module to your current Python’s sys.path. In this case, you can run the code in the example below. Note that for this example to work on your machine, you need to provide the correct path to hello.zip:
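As a rough, self-contained illustration, the sketch below creates a throwaway hello.zip in a temporary directory, so the path is always correct, and then inserts it at the front of sys.path. The module content and variable names are assumptions. With a real archive, you’d only need the sys.path.insert() call and the import:

```python
import sys
import tempfile
import zipfile
from pathlib import Path

# Assumed contents of hello.py for this self-contained sketch.
HELLO_SOURCE = 'def greet(name):\n    print(f"Hello, {name}!")\n'

with tempfile.TemporaryDirectory() as tmp_dir:
    archive_path = str(Path(tmp_dir) / "hello.zip")
    with zipfile.ZipFile(archive_path, mode="w") as archive:
        archive.writestr("hello.py", HELLO_SOURCE)

    sys.modules.pop("hello", None)  # Start clean if hello was imported before
    sys.path.insert(0, archive_path)  # The archive now has top priority
    try:
        import hello

        hello.greet("Pythonista")
        hello_origin = hello.__file__  # Points inside hello.zip
    finally:
        sys.path.remove(archive_path)

print(hello_origin)
```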

Once you’ve added the path to hello.zip to your sys.path, then you can import objects from hello.py as you would with any regular module.

If, like hello_pkg.zip, your ZIP file contains a Python package, then you can add it to sys.path too. In this case, the imports should be package-relative:
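Here’s a minimal sketch of the package case. It builds a stand-in hello_pkg.zip with the same hello/ layout in a temporary directory. The module content is assumed for illustration:

```python
import sys
import tempfile
import zipfile
from pathlib import Path

# Assumed contents of hello/hello.py for this self-contained sketch.
HELLO_SOURCE = 'def greet(name):\n    print(f"Hello, {name}!")\n'

with tempfile.TemporaryDirectory() as tmp_dir:
    archive_path = str(Path(tmp_dir) / "hello_pkg.zip")
    with zipfile.ZipFile(archive_path, mode="w") as archive:
        archive.writestr("hello/__init__.py", "")
        archive.writestr("hello/hello.py", HELLO_SOURCE)

    # Forget earlier imports of these names so the archive is actually used.
    sys.modules.pop("hello", None)
    sys.modules.pop("hello.hello", None)
    sys.path.insert(0, archive_path)
    try:
        from hello import hello  # The hello module inside the hello package

        hello.greet("Pythonista")
        submodule_name = hello.__name__
        submodule_file = hello.__file__
    finally:
        sys.path.remove(archive_path)

print(submodule_name)  # hello.hello
```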

Because your code is in a package now, you need to import the hello module from the hello package. Then you can access the greet() function as usual.

Another option for adding items to sys.path is to use .append(). This method takes a single object as an argument and adds it to the end of the underlying list. Restart your Python interactive session and run the code that provides the path to hello.zip:

This technique works similarly to using .insert(). However, the path to your ZIP file is now at the end of sys.path. If any preceding item in the list contains a module called hello.py, then Python will import from that module instead of from your newly added hello.py module.

You can also use .append() in a loop to add several files to sys.path, or you can just use .extend(). This method takes an iterable of items and adds its content to the end of the underlying list. As with .append(), keep in mind that .extend() will add your files to the end of sys.path, so existing names can shadow modules and packages in your ZIP files.
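The following sketch shows .extend() adding two archives in one call. The module names zip_demo_alpha and zip_demo_beta are hypothetical and were picked only to avoid clashing with real modules:

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path

# Hypothetical modules, one per archive, each exposing a NAME constant.
MODULES = {
    "zip_demo_alpha": "NAME = 'alpha'\n",
    "zip_demo_beta": "NAME = 'beta'\n",
}

with tempfile.TemporaryDirectory() as tmp_dir:
    archive_paths = []
    for module_name, source in MODULES.items():
        archive_path = str(Path(tmp_dir) / f"{module_name}.zip")
        with zipfile.ZipFile(archive_path, mode="w") as archive:
            archive.writestr(f"{module_name}.py", source)
        archive_paths.append(archive_path)

    # extend() appends every archive to the end of sys.path in one call.
    sys.path.extend(archive_paths)
    try:
        loaded_names = [importlib.import_module(name).NAME for name in MODULES]
    finally:
        for path in archive_paths:
            sys.path.remove(path)

print(loaded_names)  # ['alpha', 'beta']
```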

Use PYTHONPATH for System-Wide Zip Imports

In some situations, you may need a given ZIP file to be available for importing its content from any script or program that you run on your computer. In these situations, you can use the PYTHONPATH environment variable to make Python automatically load your archive into sys.path whenever you run the interpreter.

PYTHONPATH uses the same format as the PATH environment variable: a list of directory paths separated by os.pathsep. On Unix systems, such as Linux and macOS, this constant is a colon (:), while on Windows, it’s a semicolon (;).

For example, if you’re on Linux or macOS, then you can add your hello.zip file to PYTHONPATH by running the following command:

This command adds /path/to/hello.zip to your current PYTHONPATH and exports it so that it’s available in the current terminal session.

Now you can issue the python command to run the interpreter. Once you’re there, check the content of sys.path as usual:

Cool! Your hello.zip file is in the list. From this point on, you’ll be able to import objects from hello.py as you did in the above section. Go ahead and give it a try!

An important point to note in the above output is that your hello.zip file isn’t at the beginning of sys.path. This means that a same-named module that appears earlier in the list will take precedence over your hello module, according to how Python handles its module search path.

To add an item to PYTHONPATH on a Windows system, you can execute a command in your cmd.exe window:

This command adds C:\path\to\hello.zip to the current content of the PYTHONPATH variable on your Windows machine. To check it out, run the Python interpreter in the same command prompt session and look at the content of sys.path, as you did before.

Adding directories and ZIP files to the PYTHONPATH environment variable makes those entries available for whatever Python interpreter you run under the terminal session at hand. Finally, it’s important to note that Python will silently ignore nonexistent directories and ZIP files listed in PYTHONPATH, so keep an eye on that.
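If you’d like to experiment with PYTHONPATH without touching your shell configuration, one option is to set the variable only for a child interpreter. The sketch below is one way to do that on any platform, assuming a throwaway hello.zip with an invented greet() function. It joins the entries with os.pathsep and runs a short command in a subprocess:

```python
import os
import subprocess
import sys
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    archive_path = str(Path(tmp_dir) / "hello.zip")
    with zipfile.ZipFile(archive_path, mode="w") as archive:
        # Assumed module content for this self-contained sketch.
        archive.writestr("hello.py", 'def greet(name):\n    print(f"Hello, {name}!")\n')

    # Append the archive to any existing PYTHONPATH with the platform separator.
    env = dict(os.environ)
    entries = [env.get("PYTHONPATH"), archive_path]
    env["PYTHONPATH"] = os.pathsep.join(entry for entry in entries if entry)

    # Only the child interpreter sees the modified PYTHONPATH.
    completed = subprocess.run(
        [sys.executable, "-c", "import hello; hello.greet('Pythonista')"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )

child_output = completed.stdout.strip()
print(child_output)  # Hello, Pythonista!
```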

Use a .pth File for Interpreter-Wide Zip Imports

Sometimes you may want to import code from a given ZIP file only when you’re running a specific Python interpreter. This is useful when you have a project that uses code from that ZIP file, and you don’t want the code to be available for the rest of your projects.

Python’s path configuration files allow you to extend the sys.path of a given interpreter with your custom directories and ZIP files.

A path configuration file uses the .pth file extension and can hold a list of paths to directories and ZIP files, one per line. This list of paths is added to sys.path every time you run the Python interpreter that provides the .pth file.

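Python’s .pth files have a straightforward format: each line holds a single path, blank lines and lines starting with a number sign (#) are skipped, and lines starting with import are executed.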

Once you have a suitable .pth file, you need to copy it to one of the site directories so that Python can find it and load its content. To get the site directories of your current Python environment, you can call getsitepackages() from the site module. If you don’t have admin privileges on your current machine, then you can use the user site directory at site.USER_SITE, which getusersitepackages() also returns.
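A quick way to look up both kinds of site directories is sketched below. The paths you get depend entirely on your installation, so no particular output is shown:

```python
import site

# Global site directories of the running interpreter (a list of paths).
global_site_dirs = site.getsitepackages()

# Per-user site directory, usable without admin privileges.
user_site_dir = site.getusersitepackages()

for directory in global_site_dirs:
    print("global:", directory)
print("user:", user_site_dir)
```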

For example, the following command creates a hello.pth path configuration file for the system-wide Python 3 interpreter on Ubuntu:

This command creates hello.pth, using the GNU nano text editor as root. Once there, type in the path to your hello.zip file. Save the file by pressing Ctrl+X, then Y, and finally Enter. Now this ZIP file will be available in sys.path when you launch the system Python interpreter again:

That’s it! From this point on, you can import objects from hello.py as long as you use the system-wide Python interpreter.

Again, nonexistent directories and ZIP files won’t be added to sys.path when Python reads and loads the content of a given .pth file. Finally, repeated entries in a .pth file are added only once to sys.path.
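You can observe these rules without modifying a real site directory. The sketch below, which is only an experiment and not how you’d normally install a .pth file, creates a stand-in directory called fake_site (a made-up name), puts a hello.pth file in it, and asks site.addsitedir() to process that directory the way Python processes real site directories:

```python
import site
import sys
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    archive_path = str(Path(tmp_dir) / "hello.zip")
    with zipfile.ZipFile(archive_path, mode="w") as archive:
        archive.writestr("hello.py", "GREETING = 'Hello!'\n")
    missing_path = str(Path(tmp_dir) / "missing.zip")  # Never created

    # Stand-in for a real site directory, holding a single .pth file.
    site_dir = Path(tmp_dir) / "fake_site"
    site_dir.mkdir()
    pth_lines = [
        "# Lines starting with # are skipped",
        archive_path,
        archive_path,  # Repeated entry
        missing_path,  # Nonexistent entry
    ]
    (site_dir / "hello.pth").write_text("\n".join(pth_lines) + "\n", encoding="utf-8")

    # addsitedir() adds the directory to sys.path and processes its .pth files.
    site.addsitedir(str(site_dir))
    archive_count = sys.path.count(archive_path)
    missing_added = missing_path in sys.path

    # Clean up the entries that this experiment added.
    for entry in (archive_path, str(site_dir)):
        while entry in sys.path:
            sys.path.remove(entry)

print(archive_count, missing_added)  # 1 False
```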

Explore Python’s zipimport

You’ve already used the zipimport module from the standard library without even knowing it. Behind the scenes, Python’s built-in import mechanism uses this module automatically when a sys.path item holds the path to a ZIP file. In this section, you’ll learn how zipimport works and how to use it explicitly in your code with a practical example.

Understand the Basics of zipimport

The main component of zipimport is zipimporter. This class takes the path to a ZIP file as an argument and creates an importer instance. Here’s an example of how to use zipimporter and some of its attributes and methods:

In this example, you first import zipimporter from zipimport. Then you create a zipimporter instance with the path to your hello.zip file.

The zipimporter class provides several useful attributes and methods. For example, .is_package() returns True if the input name is a package and False otherwise. The .get_filename() method returns the path (.__file__) to a given module inside the archive.

If you want to bring the module’s name into your current namespace, then you can use .load_module(), which returns a reference to the input module. With that reference, you can access any code object from the module as usual. Note that .load_module() has been deprecated since Python 3.10 in favor of .exec_module().

Build a Plugin System With zipimport

As you learned above, Python internally uses zipimport to load code from ZIP files. You also learned that this module provides tools that you can use in some real-life coding situations. For example, say that you want to implement a custom plugin system in which each plugin lives in its own ZIP file. Your code should search for ZIP files in a given folder and automatically import the plugin’s functionality.

To experience this example in action, you’ll implement two toy plugins that take a message and a title and show them in both your default web browser and a Tkinter message box. Each plugin should live in its own directory, in a module called plugin.py. This module should implement the plugin’s functionality and provide a main() function as the plugin’s entry point.

Go ahead and create a folder called web_message/ with a plugin.py file in it. Open the file in your favorite code editor or IDE and type in the following code for the web browser plugin:
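A possible implementation of this plugin is sketched below. The HTML markup, the "Alert" default title, and the use of html.escape() and Path.as_uri() are assumptions of this sketch. Only the main() entry point, NamedTemporaryFile, and webbrowser.open() come from the description in this section:

```python
"""A plugin that displays a message in the default web browser."""

import tempfile
import webbrowser
from html import escape
from pathlib import Path


def main(text, title="Alert"):
    """Write a minimal HTML page with the message and open it."""
    document = f"""<!DOCTYPE html>
<html>
  <head><title>{escape(title)}</title></head>
  <body><h1>{escape(text)}</h1></body>
</html>
"""
    # delete=False keeps the file on disk after the with block ends,
    # so the browser can still read it.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".html", delete=False, encoding="utf-8"
    ) as page:
        page.write(document)
    webbrowser.open(Path(page.name).as_uri())
```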

The main() function in this code takes a text message and a window title. Then it creates a NamedTemporaryFile in a with statement. The file will contain a minimal HTML document showing title and text on the page. To open this file in your default web browser, you use webbrowser.open().

The next plugin provides similar functionality but uses the Tkinter toolkit. The code for this plugin should also live in a module called plugin.py. You can place the module in a directory called tk_message/ in your file system:
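One way to write this plugin is sketched below. Placing the imports inside main() is a choice made for this sketch, so that loading the plugin module doesn’t fail on systems without Tk. Putting them at the top of the module works just as well when Tk is available:

```python
"""A plugin that displays a message in a Tkinter message box."""


def main(text, title="Alert"):
    """Show text in a message box with the given title."""
    # Imported here (an assumption of this sketch) so that the module
    # can be loaded even where Tk isn't installed.
    import tkinter
    from tkinter import messagebox

    root = tkinter.Tk()
    root.withdraw()  # Hide the empty root window
    messagebox.showinfo(title, text)
    root.destroy()  # Release the Tk resources once the dialog is closed
```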

Following the same pattern as the web browser plugin, main() takes text and title. In this case, the function creates a Tk instance to hold the plugin’s top-level window. However, you don’t need to show that window, only a message box. So, you use .withdraw() to hide the root window and then call .showinfo() on messagebox to show a dialog with the input text and title.

Now you need to pack each plugin into its own ZIP file. To do so, start a Python interactive session in the directory containing the web_message/ and tk_message/ folders and run the following code:
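The packing step might look like the sketch below. To run on its own, it creates both plugin folders with a placeholder plugin.py in a temporary directory. With your real plugins in place, only the loop over PyZipFile and .writepy() would be needed:

```python
import tempfile
import zipfile
from pathlib import Path

# Placeholder plugin source so the sketch runs without the real plugins.
PLACEHOLDER = 'def main(text, title="Alert"):\n    print(title, text)\n'
plugins = ("web_message", "tk_message")
archive_contents = {}

with tempfile.TemporaryDirectory() as tmp_dir:
    for plugin in plugins:
        plugin_dir = Path(tmp_dir) / plugin
        plugin_dir.mkdir()
        (plugin_dir / "plugin.py").write_text(PLACEHOLDER, encoding="utf-8")

        # The folder isn't a package, so plugin.pyc lands at the top level.
        zip_path = str(Path(tmp_dir) / f"{plugin}.zip")
        with zipfile.PyZipFile(zip_path, mode="w") as zip_plugin:
            zip_plugin.writepy(str(plugin_dir))
            archive_contents[plugin] = zip_plugin.namelist()

print(archive_contents)
```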

The next step is to create a root folder for your plugin system. This folder must contain a plugins/ directory with the newly created ZIP files in it. Here’s how your directory should look:

rp_plugins/
│
├── plugins/
│   │
│   ├── tk_message.zip
│   └── web_message.zip
│
└── main.py

In main.py, you’ll place the client code for your plugin system. Go ahead and populate main.py with the following code:

In short, the script’s main block calls load_plugins() to generate the current list of available plugins and then executes them in a loop.

If you run the main.py script from your command line, then you first get a Tkinter message box displaying the Hello, World! message and the Greeting! title. After closing that window, your web browser will display the same message and title on a new page. Go ahead and give it a try!

Conclusion

Python can import code directly from ZIP files if they’re available in the module search path. This feature is known as Zip imports. You can take advantage of Zip imports to bundle modules and packages into a single archive so that you can distribute them to your end users quickly and efficiently.

You can also take advantage of Zip imports if you often get Python code bundled into ZIP files and need to use that code in your day-to-day tasks.

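In this tutorial, you learned what Zip imports are, how to create importable ZIP files with zipfile, and how to make those files available to Python’s import system through sys.path, PYTHONPATH, and .pth files.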

You also coded a hands-on example of how to build a minimal plugin system with zipimport. Through this example, you learned how to dynamically import code from ZIP files in Python.