[Python-Dev] (Not) delaying the 3.2 release
Guido van Rossum guido at python.org
Thu Sep 16 21:21:13 CEST 2010
On Thu, Sep 16, 2010 at 11:16 AM, Toshio Kuratomi <a.badger at gmail.com> wrote:
> On Thu, Sep 16, 2010 at 10:56:56AM -0700, Guido van Rossum wrote:
>> On Thu, Sep 16, 2010 at 10:46 AM, Martin (gzlist) <gzlist at googlemail.com> wrote:
>> > On 16/09/2010, Guido van Rossum <guido at python.org> wrote:
>> >>
>> >> In all cases I can imagine where such polymorphic functions make
>> >> sense, the necessary and sufficient assumption should be that the
>> >> encoding is a superset of 7-bit(*) ASCII. This includes UTF-8, all
>> >> Latin-N variants, and AFAIK also the popular CJK encodings other than
>> >> UTF-16. This is the same assumption made by Python's byte type when
>> >> you use "character-based" methods like lower().
>> >
>> > Well, depends on what exactly you're doing, it's pretty easy to go wrong:
>> >
>> > Python 3.2a2+ (py3k, Sep 16 2010, 18:43:45) [MSC v.1500 32 bit (Intel)] on win32
>> > Type "help", "copyright", "credits" or "license" for more information.
>> > >>> import os, sys
>> > >>> os.path.split("C:\十")
>> > ('C:\\', '十')
>> > >>> os.path.split("C:\十".encode(sys.getfilesystemencoding()))
>> > (b'C:\\\x8f', b'')
>> >
>> > Similar things can catch out web developers once they step outside the
>> > percent encoding.
>>
>> Well, that character is not 7-bit ASCII. Of course things will go
>> wrong there. That's the whole point of what I said, isn't it?
>
> You were talking about encodings that were supersets of 7-bit ASCII.
> I think Martin was demonstrating a byte string that was a superset of
> 7-bit ASCII being fed to a stdlib function which went wrong.
Whoops, sorry. I don't have access to Windows so I can't reproduce this though. I also don't understand it. What is the Unicode codepoint for that 十 character? What is sys.getfilesystemencoding()? What is the value of "C:\十".encode(sys.getfilesystemencoding())?
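The three questions above can be checked without a Windows machine. The following is only a sketch under one assumption that the thread does not state: that sys.getfilesystemencoding() on Martin's system resolved to the Japanese code page cp932. It is hard-coded here as "cp932", and ntpath is used in place of os.path so that the Windows path rules apply on any platform.

```python
import ntpath  # Windows flavour of os.path, importable on any platform

ch = "\u5341"  # the character 十

# Question 1: the Unicode code point of the character.
codepoint = hex(ord(ch))

# Questions 2 and 3: ASSUMPTION, the filesystem encoding was cp932
# (Windows Shift-JIS). The thread itself does not say so.
assumed_fs_encoding = "cp932"
encoded = ("C:\\" + ch).encode(assumed_fs_encoding)

# The same split on the text path and on its encoded bytes.
text_split = ntpath.split("C:\\" + ch)
bytes_split = ntpath.split(encoded)

print(codepoint)
print(encoded)
print(text_split)
print(bytes_split)
```

Under that assumption the character encodes to the two bytes 0x8F 0x5C, and 0x5C is also the ASCII backslash. The bytes version of split therefore sees a trailing path separator where the str version sees one character, which would match the output Martin posted.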
--
--Guido van Rossum (python.org/~guido)