[3.6] bpo-36247: zipfile - extract truncates (existing) file when bad password provided (zip encryption weakness) by CristiFati · Pull Request #12242 · python/cpython
As also specified in the issue, the details are on [SO]: zipfile.BadZipFile: Bad CRC-32 when extracting a password protected .zip & .zip goes corrupt on extract (@CristiFati's answer).
The report is about Python 3.6, but it applies to any (current) version.
This PR attempts to fix the problem, and also provides a new test for this specific scenario.
There's a good chance the problem affects a wider variety of scenarios, but I tried to limit the fix as much as possible to this one (to avoid introducing regressions - maybe some other code relies on this behavior, although I doubt it).
Regarding the "xxx" password: I did some tests and this was the first one that reproduced the problem. I also tried generated one-character passwords, but didn't run into it with those. In any case, the fact that this particular password triggers it is a coincidence (and, I guess, not so important).
Notes:
- There will be a performance decrease (when extracting password-protected files) due to the additional operations (especially renames, which touch the disk). But since the files should be on the same FS (partition), the renames should only involve metadata changes
- A regression would be: when the file to be extracted fits on the disk (size-wise), but not together with the existing one. In this case extraction will fail (previously it would have succeeded), but the good thing is that the old file will still be preserved. The user can fix the problem by deleting the previous file
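
The trade-offs in the notes above come from the general "write to a temporary file, then rename" pattern. The following is only a minimal sketch of that pattern, not the actual patch in this PR: `safe_extract_member` is a hypothetical helper (it is not part of the `zipfile` API), and it assumes the member is a regular file whose name needs no sanitizing.

```python
import os
import shutil
import tempfile
import zipfile


def safe_extract_member(zf, member, dest_dir, pwd=None):
    """Extract one member, leaving any existing file untouched on failure.

    Hypothetical helper for illustration; not part of the zipfile module.
    Unlike ZipFile.extract(), it does NOT sanitize the member name.
    """
    info = zf.getinfo(member)
    target = os.path.join(dest_dir, info.filename)
    target_dir = os.path.dirname(target) or "."
    os.makedirs(target_dir, exist_ok=True)

    # Create the temporary file next to the target, so that the final
    # rename stays on the same filesystem (a metadata-only operation).
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, prefix=".zip-extract-")
    try:
        with os.fdopen(fd, "wb") as tmp, zf.open(info, pwd=pwd) as src:
            # A bad password or corrupt data raises here (RuntimeError or
            # zipfile.BadZipFile), before the existing file is touched.
            shutil.copyfileobj(src, tmp)
        # Only after a fully successful read is the old file replaced.
        os.replace(tmp_path, target)
    except BaseException:
        # Drop the partial temporary file and re-raise the original error.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    return target
```

While the temporary file exists, the old and the new content are both on disk, which is where the disk-space regression described in the second note comes from. Note also that `tempfile.mkstemp` creates the file with restrictive permissions, so a real implementation would probably have to adjust the mode after the rename.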