Issue 7216: low performance of zipfile readline()

The readline() method of ZipExtFile in zipfile reads chunks of at most 100 bytes (zipfile.py, line 525) into the linebuffer. Reading a 500 MByte file therefore takes about 5 million chunk reads. Changing the value from 100 to 10000 bytes improves performance by orders of magnitude, while requiring only about 10 KB of memory.

My fix in zipfile.py, line 525:

buf = self.read(min(size, 10000)) # was 100 before
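To illustrate why the chunk size matters, here is a minimal sketch of a line reader that refills its buffer in fixed-size chunks and counts how many read calls it needs. It is not the actual ZipExtFile code: the function name count_lines_and_reads and its parameters are made up for this example, and it reads from an in-memory io.BytesIO instead of a compressed archive member.

```python
import io


def count_lines_and_reads(data, chunk_size):
    """Split data into lines, refilling the buffer chunk_size bytes at a time.

    Returns (number_of_lines, number_of_read_calls). Hypothetical helper,
    only meant to mimic the buffering pattern described in this report.
    """
    stream = io.BytesIO(data)
    linebuffer = b""
    lines = 0
    reads = 0
    while True:
        newline_pos = linebuffer.find(b"\n")
        if newline_pos >= 0:
            # A complete line is buffered: consume it without reading.
            linebuffer = linebuffer[newline_pos + 1:]
            lines += 1
            continue
        # No complete line buffered: fetch the next chunk.
        buf = stream.read(chunk_size)
        reads += 1
        if not buf:
            # End of data; count a trailing line without a newline.
            if linebuffer:
                lines += 1
            break
        linebuffer += buf
    return lines, reads


if __name__ == "__main__":
    # 10000 lines of 80 bytes each (including the newline) = 800000 bytes.
    sample = (b"x" * 79 + b"\n") * 10000
    for chunk_size in (100, 10000):
        lines, reads = count_lines_and_reads(sample, chunk_size)
        print(chunk_size, lines, reads)
```

For this 800000-byte sample, the sketch needs 8001 read calls with a chunk size of 100 and 81 with a chunk size of 10000 (the last call in each case is the empty read at end of data). In the real module each call also goes through the decompression path, so the per-call overhead is presumably higher than in this in-memory sketch.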

Best regards / Volker Siepmann