[Python-Dev] test_unicode_file failing on Mac OS X

Jack Jansen Jack.Jansen at cwi.nl
Sun Dec 7 11:32:23 EST 2003


On 6-dec-03, at 18:48, Skip Montanaro wrote:

Two of the test_unicode_file tests began failing on my Mac today (fresh cvs up, OS X 10.2.8, vanilla unix-style build):

======================================================================
FAIL: test_directories (__main__.TestUnicodeFiles)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "../Lib/test/test_unicode_file.py", line 155, in test_directories
    self._do_directory(TESTFN_ENCODED+ext, TESTFN_ENCODED+ext, os.getcwd)
  File "../Lib/test/test_unicode_file.py", line 103, in _do_directory
    make_name)
AssertionError: '@test-a\xcc\x80o\xcc\x80.dir' != '@test-\xc3\xa0\xc3\xb2.dir'

This is probably related to the two flavors of unicode there are, one which prefers to have all accents separately from the letters as much as possible and one which prefers the reverse. I keep forgetting the names of the two, they're somewhat silly.

But the problem is that Python prefers to represent the string "ä" as the two characters "a" and "umlaut on the previous char", and MacOSX prefers to represent the same string as "a with umlaut on it". Or the other way around, this is something else I always forget.
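For reference, the Unicode standard calls the two flavors NFD (decomposed, accents as separate combining characters) and NFC (composed). The following is only a sketch, written for present-day Python 3 rather than the 2003 interpreter under discussion; it decodes the two byte strings from the AssertionError above as UTF-8 and checks that they are the same text in different normalization forms. The variable names are made up for the illustration.

```python
import unicodedata

# The two byte strings from the AssertionError, decoded as UTF-8.
decomposed = b"@test-a\xcc\x80o\xcc\x80.dir".decode("utf-8")  # 'a' + U+0300, 'o' + U+0300
composed = b"@test-\xc3\xa0\xc3\xb2.dir".decode("utf-8")      # U+00E0, U+00F2

# Compared code point by code point, the strings differ.
print(decomposed == composed)           # False
print(len(decomposed), len(composed))   # 14 12

# After normalizing to a common form they compare equal.
print(unicodedata.normalize("NFC", decomposed) == composed)   # True
print(unicodedata.normalize("NFD", composed) == decomposed)   # True
```

In this traceback the decomposed spelling is on the left-hand side of the comparison and the composed one on the right.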

And while there are algorithms to convert the combined form of unicode to the uncombined form and vice versa, there are no Python codecs to do this. The OSX system calls do the right thing (convert both forms to what it prefers), but when you do a readdir() you don't get back the string you put in.
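One way to cope with the readdir() behaviour described here is to normalize both names before comparing them. The sketch below assumes Python 3 and uses `unicodedata.normalize` from the standard library (not a codec). The helpers `nfc` and `find_entry` are hypothetical names invented for this example, and the listing is simulated rather than read from a real directory.

```python
import unicodedata


def nfc(name):
    """Return name in composed (NFC) form."""
    return unicodedata.normalize("NFC", name)


def find_entry(listing, wanted):
    """Return the entry of listing that equals wanted after normalization, or None."""
    target = nfc(wanted)
    for entry in listing:
        if nfc(entry) == target:
            return entry
    return None


# Simulated readdir() result: the name comes back in decomposed form.
listing = ["@test-a\u0300o\u0300.dir", "README"]
# The name the program originally used, in composed form.
wanted = "@test-\u00e0\u00f2.dir"

print(wanted in listing)                          # False
print(find_entry(listing, wanted) is listing[0])  # True
```

With a real directory, `listing` could come from `os.listdir(path)`; whether the names come back composed or decomposed depends on the file system.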

Jack Jansen, <Jack.Jansen at cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma Goldman


