Issue 32959: zipimport fails when the ZIP archive contains more than 65535 files

When trying to import a module from a ZIP archive containing more than 65535 files, the import process fails:

$ python3 -VV
Python 3.6.4 (default, Jan 6 2018, 11:49:38)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)]
$ cat create_zips.py
from zipfile import ZipFile

with ZipFile('below.zip', 'w') as zfp:
    for i in range(65535):
        zfp.writestr('m%d.py' % i, '')

with ZipFile('over.zip', 'w') as zfp:
    for i in range(65536):
        zfp.writestr('m%d.py' % i, '')
$ python3 create_zips.py
$ python -m zipfile -l below.zip | (head -2 && tail -2)
File Name                                             Modified             Size
m0.py                                          2018-02-26 20:57:32            0
m65533.py                                      2018-02-26 20:57:36            0
m65534.py                                      2018-02-26 20:57:36            0
$ python -m zipfile -l over.zip | (head -2 && tail -2)
File Name                                             Modified             Size
m0.py                                          2018-02-26 20:57:36            0
m65534.py                                      2018-02-26 20:57:40            0
m65535.py                                      2018-02-26 20:57:40            0
$ PYTHONPATH=below.zip python3 -c 'import m0'
$ PYTHONPATH=over.zip python3 -c 'import m0'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'm0'

I think the problem is related to the zipimport module not handling the 'zip64 end of central directory record'.

FWIW, yes, this is because zipimport doesn't support ZIP64, and doesn't even flag it as an error when the ZIP requires it. Instead it skips files. The ZIP64 format works by setting each field that would overflow to its maximum value, as a signal that the real value should be read from the ZIP64 records. That means the archive is still a valid non-64-bit ZIP file, just a truncated one.
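To make the "maximum value as a signal" point concrete, here is a minimal sketch that reads the entry count from the classic end of central directory record and, if a ZIP64 locator precedes it, the real count from the ZIP64 record. The helper names (`entry_counts`, `make_archive`) are made up for illustration, and the code assumes an archive with no trailing comment and no prepended data; it is not how zipimport itself parses archives.

```python
import io
import struct
import zipfile

# Record layouts as described in the ZIP application note.
EOCD = struct.Struct("<4s4H2LH")           # classic end of central directory
ZIP64_LOCATOR = struct.Struct("<4sLQL")    # ZIP64 end of central directory locator
ZIP64_EOCD = struct.Struct("<4sQ2H2L4Q")   # ZIP64 end of central directory


def entry_counts(data: bytes):
    """Return (count in the classic record, real count).

    Assumes no archive comment, so the classic record is the last 22 bytes.
    """
    pos = len(data) - EOCD.size
    fields = EOCD.unpack_from(data, pos)
    if fields[0] != b"PK\x05\x06":
        raise ValueError("end of central directory record not found")
    classic_count = fields[4]

    # A ZIP64 locator, when present, sits directly before the classic record.
    loc_pos = pos - ZIP64_LOCATOR.size
    if loc_pos < 0 or data[loc_pos:loc_pos + 4] != b"PK\x06\x07":
        return classic_count, classic_count

    # The locator stores the offset of the ZIP64 record (absolute here,
    # because nothing is prepended to the archive).
    zip64_offset = ZIP64_LOCATOR.unpack_from(data, loc_pos)[2]
    zip64_fields = ZIP64_EOCD.unpack_from(data, zip64_offset)
    if zip64_fields[0] != b"PK\x06\x06":
        raise ValueError("ZIP64 end of central directory record not found")
    return classic_count, zip64_fields[7]


def make_archive(n: int) -> bytes:
    """Build an in-memory archive with n empty modules, like create_zips.py."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for i in range(n):
            zf.writestr("m%d.py" % i, "")
    return buf.getvalue()
```

Calling `entry_counts(make_archive(65536))` should show the classic field pinned at 0xFFFF while the ZIP64 record carries the true count, which is the situation a reader without ZIP64 support cannot interpret correctly.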

Adding ZIP64 support to zipimport is not trivial, and requires breaking some behaviour that's bad (not according to the ZIP spec) but actually covered by zipimport's tests: prepending arbitrary data to the ZIP file by just concatenating them together (rather than rewriting the ZIP archive's offsets). I don't think there's going to be an easy fix here, although I am working on open-sourcing a replacement zipimport that I wrote for work, which does support ZIP64 properly (and also fixes some other fundamental flaws in zipimport). I hope to have that done soon, and we can discuss replacing zipimport with that module in Python 3.8 or later.
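As an illustration of what "prepending by concatenation" does to the stored offsets, the sketch below compares where the central directory actually starts with the offset recorded in the end of central directory record. The helper name `central_directory_shift` and the sample prefix are made up for this example, and the code assumes a small non-ZIP64 archive with no trailing comment.

```python
import io
import struct
import zipfile

EOCD = struct.Struct("<4s4H2LH")  # classic end of central directory record


def central_directory_shift(data: bytes) -> int:
    """Actual start of the central directory minus the offset stored in the archive."""
    pos = len(data) - EOCD.size  # assumes no archive comment
    sig, _, _, _, _, cd_size, cd_offset, _ = EOCD.unpack_from(data, pos)
    if sig != b"PK\x05\x06":
        raise ValueError("end of central directory record not found")
    # Without ZIP64 records, the central directory ends where this record begins.
    return (pos - cd_size) - cd_offset


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("m0.py", "")
plain = buf.getvalue()

# Concatenate arbitrary data in front without rewriting any offsets.
prefix = b"#!/bin/sh\nexit 0\n"
prepended = prefix + plain
```

For `plain` the shift is zero; for `prepended` it equals the length of the prefix, so every stored offset is off by that amount. The `zipfile` module compensates for this shift when reading, which is why such files still open there.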