Issue 7216: low performance of zipfile readline()
The readline() function in zipfile (in ZipExtFile) reads chunks of at most 100 bytes (zipfile.py, line 525) into the linebuffer. A 500 MB file therefore takes about 5 million chunk reads. Changing the value from 100 to 10000 bytes improves performance by orders of magnitude, while it only requires about 10 KB of memory.
My fix in zipfile.py, line 525:
buf = self.read(min(size, 10000)) # was 100 before
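To illustrate why the chunk size matters, here is a minimal sketch that counts how many read() calls a chunked line buffer needs. It is not the actual ZipExtFile code: count_reads is a hypothetical helper, and io.BytesIO stands in for the decompressed stream, so it only models the number of refills, not the real decompression cost.

```python
import io


def count_reads(data, chunk_size):
    """Split data into lines using a buffer refilled chunk_size bytes at a time.

    Returns (number_of_lines, number_of_read_calls).
    Hypothetical helper for illustration; not part of zipfile.
    """
    stream = io.BytesIO(data)  # stand-in for the decompressed member stream
    linebuffer = b""
    lines = 0
    read_calls = 0
    while True:
        nl = linebuffer.find(b"\n")
        if nl >= 0:
            # A complete line is buffered: consume it without reading.
            linebuffer = linebuffer[nl + 1:]
            lines += 1
            continue
        # No complete line buffered: refill with at most chunk_size bytes.
        buf = stream.read(chunk_size)
        read_calls += 1
        if not buf:
            if linebuffer:
                lines += 1  # final line without a trailing newline
            break
        linebuffer += buf
    return lines, read_calls


if __name__ == "__main__":
    data = (b"x" * 79 + b"\n") * 1000  # 1000 lines of 80 bytes each
    for chunk_size in (100, 10000):
        print(chunk_size, count_reads(data, chunk_size))
```

For this 80000-byte input, the sketch needs 801 read calls with 100-byte chunks and 9 with 10000-byte chunks (each count includes the final empty read that signals end of data). The line count is the same in both cases.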
Best regards / Volker Siepmann