msg52744 - (view) |
Author: Alexey Borzenkov (snaury) |
Date: 2007-06-10 10:53 |
This patch fixes the UnicodeDecodeError raised when writing files to a zipfile under a filename of class unicode. |
|
|
msg52745 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2007-06-10 16:48 |
This patch is incorrect. It relies on the system encoding, and allows non-string things as file names. What it really should do is to encode in code page 437; bonus points if it falls back to the UTF-8 feature of zip files when that encoding fails. |
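A minimal sketch of the approach suggested above, written for Python 3 and not taken from any of the attached patches. The helper name `encode_zip_filename` and the constant `UTF8_FLAG` are hypothetical; the value 0x800 is general purpose bit 11 of the zip header, which marks the filename as UTF-8.

```python
# Hypothetical helper illustrating "cp437 first, UTF-8 as fallback".
# UTF8_FLAG is general purpose bit 11 of the zip header flags.
UTF8_FLAG = 0x800


def encode_zip_filename(filename, flag_bits=0):
    """Return (encoded_name, flag_bits) for a text filename."""
    # Reject non-string objects instead of coercing them silently.
    if not isinstance(filename, str):
        raise TypeError("filename must be str, not %s" % type(filename).__name__)
    try:
        # Code page 437 is the traditional zip filename encoding.
        return filename.encode("cp437"), flag_bits
    except UnicodeEncodeError:
        # Fall back to UTF-8 and mark the entry accordingly.
        return filename.encode("utf-8"), flag_bits | UTF8_FLAG


print(encode_zip_filename("readme.txt"))
print(encode_zip_filename("файл.txt"))
```

Names that fit in code page 437 keep their flags unchanged; only names that cannot be represented there get the UTF-8 flag, and the result never depends on the system encoding.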
|
|
msg52746 - (view) |
Author: Alexey Borzenkov (snaury) |
Date: 2007-06-10 20:29 |
File Added: python-zipfile-unicode-filenames-utf8.patch |
|
|
msg52747 - (view) |
Author: Alexey Borzenkov (snaury) |
Date: 2007-06-11 04:22 |
File Added: python-zipfile-unicode-filenames-utf8-2.patch |
|
|
msg52748 - (view) |
Author: Alexey Borzenkov (snaury) |
Date: 2007-06-11 04:27 |
File Added: python-zipfile-unicode-filenames-utf8-3.patch |
|
|
msg65935 - (view) |
Author: Christophe Kalt (kalt) |
Date: 2008-04-28 21:32 |
Any chance of this making it in sometime? The current behaviour is rather limiting/annoying. |
|
|
msg65939 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2008-04-28 22:13 |
> Any chance of this making it in sometime?

I'll see what I can do for 2.6, but perhaps it gets delayed until 2.7/3.1. |
|
|
msg66274 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2008-05-05 17:18 |
Thanks for the patch, committed as r62724. I didn't see the need to clear the UTF-8 flag, so I left it in (in case somebody wants to inspect it). |
|
|
msg66277 - (view) |
Author: Alexey Borzenkov (snaury) |
Date: 2008-05-05 18:40 |
Martin, I cleared the flag bit because the filename was changed in place, to mark that the filename does not need further processing. This was primarily a compatibility concern, to accommodate situations where users try to do such decoding in their own code (this way the flag won't be there, so their code won't trigger). Without clearing the flag bit, calling _decodeFilenameFlags a second time will fail, as will any similar user code. I suggest that if users want to know whether a filename is unicode, they should check that the filename is of class unicode. |
|
|
msg66289 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2008-05-05 21:15 |
> Martin, I cleared the flag bit because filename was changed in-place, to
> mark that filename does not need further processing. This was primarily
> compatibility concern, to accommodate for situations where users try to
> do such decoding in their own code (this way flag won't be there, so
> their code won't trigger). Without clearing the flag bit, calling
> _decodeFilenameFlags second time will fail, as well as any similar user
> code.

I'm not concerned about the compatibility; code that actually does the decoding still might break since it would expect the filename to be a byte string if it doesn't explicitly decode. Such assumption would still break under your change.

I am concerned about silently faking data. The library shouldn't do that; it should present the flags unmodified, as some application might perform further processing (such as displaying the flags to the user). It would then be confusing if the data processed isn't the one that was read from disk.

> I suggest that if users want to know if filename is unicode, they should
> check that filename is of class unicode.

That won't work in Py3k, which will always decode the filename. |
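The point about presenting the flags unmodified can be checked with a short round trip. This is a sketch assuming Python 3's zipfile module, which postdates this thread; the function name `utf8_flag_after_roundtrip` is hypothetical, while `ZipInfo.flag_bits` and `ZipInfo.filename` are existing attributes.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: filename is stored as UTF-8


def utf8_flag_after_roundtrip(name):
    """Write one entry to an in-memory archive, read it back, and
    report the decoded filename plus whether the UTF-8 flag is set."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, b"data")
    with zipfile.ZipFile(buf, "r") as zf:
        info = zf.infolist()[0]
        # The filename is already decoded to str, yet flag_bits still
        # reflects what was stored in the archive.
        return info.filename, bool(info.flag_bits & UTF8_FLAG)


print(utf8_flag_after_roundtrip("plain.txt"))
print(utf8_flag_after_roundtrip("файл.txt"))
```

Because the filename is always a str after reading, checking its type says nothing about how it was stored; the flag bit is the only place that information remains available to applications.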
|
|