Issue 10801: zipfile.ZipFile().extractall() header mismatch for non-ASCII characters

Created on 2010-12-31 13:38 by M..Z., last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name, Uploaded, Description
bug_zipfile_extractall.zip, M..Z., 2010-12-31 13:38, ZIP with three files that can reproduce the problem
zipfile.diff, loewis, 2010-12-31 14:25
issue10801_test.1.patch, eli.bendersky, 2011-01-01 07:56
Messages (18)
msg124964 - (view) Author: M. Zilmer (M..Z.) Date: 2010-12-31 13:38
Trying to unpack a ZIP file in which some member file names contain Danish letters results in: zipfile.BadZipFile: File name in directory 'filename_with_æoå.txt' and header b'filename_with_\x91o\x86.txt' differ. Using Python 3.2b2 on Windows 7. Unpack the attached ZIP file and run the Python script, which demonstrates the problem using the two enclosed ZIP files.
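The two names in the error message look like the same text in two forms: the directory name as a decoded string and the header name as raw bytes. A quick check, assuming the header bytes are stored in code page 437 (the legacy ZIP encoding, and an assumption here rather than something stated in the report):

```python
# Names copied from the error message above.
directory_name = "filename_with_æoå.txt"
header_bytes = b"filename_with_\x91o\x86.txt"

# Assumption: the local header stores the name in CP437.
decoded_header = header_bytes.decode("cp437")

print(decoded_header == directory_name)                # True
print(directory_name.encode("utf-8") == header_bytes)  # False
```

So a byte-for-byte comparison against the UTF-8 form of the directory name fails even though both names spell the same text.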
msg124966 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-12-31 14:25
The attached patch fixes it for me. No time to write tests right now.
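The patch itself is not reproduced in this thread. As a rough illustration only, and not the contents of zipfile.diff, a header name check can decode the raw bytes with whichever encoding the entry's flags announce and then compare text rather than bytes. The helper names below are hypothetical; the only external fact assumed is that general-purpose flag bit 11 (0x800) marks a UTF-8 name in the ZIP format.

```python
UTF8_FLAG = 0x800  # general-purpose bit 11: file name is UTF-8


def decode_header_name(raw_name: bytes, flag_bits: int) -> str:
    """Decode a local-header name with the encoding its flags announce."""
    if flag_bits & UTF8_FLAG:
        return raw_name.decode("utf-8")
    # Without the UTF-8 flag, fall back to the legacy CP437 encoding.
    return raw_name.decode("cp437")


def names_match(directory_name: str, raw_name: bytes, flag_bits: int) -> bool:
    """Compare decoded text, so differing byte encodings do not matter."""
    return decode_header_name(raw_name, flag_bits) == directory_name


# Hypothetical usage with the names from the original report.
ok = names_match("filename_with_æoå.txt", b"filename_with_\x91o\x86.txt", 0)
```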
msg124978 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-12-31 21:34
FWIW, having just looked at related code in zipfile recently, this patch looks correct to me.
msg124991 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-01-01 05:12
I'll try to produce a test in the next hour or two.
msg124992 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-01-01 07:54
I'm attaching a patch with a test for Martin's fix. I had trouble programmatically generating a "bad" zip for this bug, since it has different encodings for the header and filename (probably created by WinZip?). So I created a directory in test/ and placed the problematic zipfile M.Z. submitted in there, and wrote an appropriate test in test_zipfile.py. I verified that the test fails on py3k trunk before Martin's fix and succeeds after it, both by running the test file directly and through regrtest. Note: Tested only on Ubuntu.
msg124994 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-01-01 10:09
Committed patch and test in r87604.
msg124995 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-01-01 10:35
OK, looks like there is a problem on some buildbots: http://www.python.org/dev/buildbot/all/builders/AMD64%20Gentoo%20Wide%203.x/builds/863/steps/test/logs/stdio
msg124998 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-01-01 12:16
OK, I think r87606 fixed it: the test no longer extracts the files and only calls open() on them instead.
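A minimal sketch of that approach, not the actual test from r87606: each member is opened in place, which still parses its local header but never creates a file on disk. The function name and the in-memory archive are made up for the example.

```python
import io
import zipfile


def read_all_members(archive) -> dict:
    """Open every member in place; nothing is written to the filesystem."""
    contents = {}
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            # open() reads the member's local header, so a name mismatch
            # would surface here as zipfile.BadZipFile.
            with zf.open(info) as member:
                contents[info.filename] = member.read()
    return contents


# Hypothetical usage with an archive built in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("filename_with_æoå.txt", b"hello")
buffer.seek(0)
result = read_all_members(buffer)
```

Because no target path is ever created, the result does not depend on how the host encodes file names.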
msg124999 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-01-01 12:44
Georg, did you figure out the root cause of the problem on that buildbot? Seeing that it fails in open(targetpath, "wb"), extracting the file might have failed if the bot had no write permission for the current directory, but an ASCII encoding error is not what I'd expect in such a case.
msg125000 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-01-01 12:47
Well, it looks like the filesystem encoding is set to ASCII on these machines.
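One way to see why extraction would fail there: a file name can only be created if it can be encoded with the filesystem encoding. A small sketch, with a hypothetical helper name, that checks this up front:

```python
import sys


def can_name_file(name: str, encoding: str = "") -> bool:
    """Return True if `name` can be encoded with the given encoding.

    Defaults to the interpreter's filesystem encoding.
    """
    encoding = encoding or sys.getfilesystemencoding()
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# With an ASCII filesystem encoding, the name from the report is rejected.
ascii_ok = can_name_file("filename_with_æoå.txt", "ascii")
utf8_ok = can_name_file("filename_with_æoå.txt", "utf-8")
```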
msg135838 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-05-12 14:35
Issue #12048 is a duplicate of this bug, but with Python 3.1. Should we backport the fix to Python 3.1?
msg136228 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-05-18 11:43
New changeset 1f0f0e317873 by Victor Stinner in branch '3.1': Backport commit 33543b4e0e5d from Python 3.2: #10801: In zipfile, support http://hg.python.org/cpython/rev/1f0f0e317873
msg136229 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-05-18 11:48
New changeset 243c78fbbb49 by Victor Stinner in branch '3.1': Ooops, add the missing file of the backport of commit 33543b4e0e5d from Python http://hg.python.org/cpython/rev/243c78fbbb49
msg136559 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2011-05-22 18:31
These changes cause test failure on 3.1 branch when verbose mode is disabled:
# python3.1 -m test.regrtest test_zipfile
test_zipfile
test test_zipfile produced unexpected output:
**********************************************************************
*** line 2 of actual output doesn't appear in expected output after line 1:
+ /usr/lib64/python3.1/test/zip_cp437_header.zip
**********************************************************************
1 test failed:
test_zipfile
msg136566 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-05-22 20:13
New changeset 9ef8fc5454cb by Victor Stinner in branch '3.1': Issue #10801: Remove a debug print() from test_zipfile http://hg.python.org/cpython/rev/9ef8fc5454cb
msg136567 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-05-22 20:13
> These changes cause test failure on 3.1 branch when verbose mode is disabled
What a shame! I committed a debug "print()":
1.39 + print(fname)
It should be fixed by my last commit.
msg136682 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-05-23 17:04
Victor: you should have a look at <http://bitbucket.org/birkenfeld/hgcodesmell>.
msg138081 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-06-10 14:35
New changeset 33b7428e65b4 by Victor Stinner in branch '3.1': Issue #10801: Fix test_unicode_filenames() of test_zipfile http://hg.python.org/cpython/rev/33b7428e65b4
History
Date User Action Args
2022-04-11 14:57:10 admin set github: 55010
2011-06-10 14:35:35 python-dev set messages: +
2011-05-23 17:04:30 georg.brandl set messages: +
2011-05-22 20:13:48 vstinner set messages: +
2011-05-22 20:13:12 python-dev set messages: +
2011-05-22 18:31:36 Arfrever set nosy: + Arfrever; messages: +
2011-05-18 11:48:43 python-dev set messages: +
2011-05-18 11:43:29 python-dev set nosy: + python-dev; messages: +
2011-05-12 14:35:47 vstinner set messages: +
2011-01-01 12:47:39 georg.brandl set nosy: loewis, georg.brandl, amaury.forgeotdarc, vstinner, r.david.murray, eli.bendersky, M..Z.; messages: +
2011-01-01 12:44:57 eli.bendersky set nosy: loewis, georg.brandl, amaury.forgeotdarc, vstinner, r.david.murray, eli.bendersky, M..Z.; messages: +
2011-01-01 12:16:56 georg.brandl set status: open -> closed; nosy: loewis, georg.brandl, amaury.forgeotdarc, vstinner, r.david.murray, eli.bendersky, M..Z.; messages: +
2011-01-01 10:35:58 georg.brandl set status: closed -> open; nosy: loewis, georg.brandl, amaury.forgeotdarc, vstinner, r.david.murray, eli.bendersky, M..Z.; messages: +
2011-01-01 10:09:59 georg.brandl set status: open -> closed; nosy: + georg.brandl; messages: +; resolution: accepted
2011-01-01 07:56:51 eli.bendersky set files: + issue10801_test.1.patch; nosy: loewis, amaury.forgeotdarc, vstinner, r.david.murray, eli.bendersky, M..Z.
2011-01-01 07:56:35 eli.bendersky set files: - issue10801_test.1.patch; nosy: loewis, amaury.forgeotdarc, vstinner, r.david.murray, eli.bendersky, M..Z.
2011-01-01 07:54:15 eli.bendersky set files: + issue10801_test.1.patch; nosy: loewis, amaury.forgeotdarc, vstinner, r.david.murray, eli.bendersky, M..Z.; messages: +
2011-01-01 05:12:14 eli.bendersky set nosy: + eli.bendersky; messages: +
2010-12-31 21:34:04 r.david.murray set nosy: + r.david.murray; messages: +
2010-12-31 14:25:54 loewis set files: + zipfile.diff; messages: +; keywords: + patch; nosy: loewis, amaury.forgeotdarc, vstinner, M..Z.
2010-12-31 14:03:42 pitrou set nosy: + amaury.forgeotdarc, loewis, vstinner
2010-12-31 13:38:34 M..Z. create