[Python-Dev] unicode_internal codec and the PEP 393

Victor Stinner victor.stinner at haypocalc.com
Wed Nov 9 11:14:50 CET 2011


Hi,

The unicode_internal decoder doesn't decode surrogate pairs, so test_unicode.UnicodeTest.test_codecs() is failing on Windows (16-bit wchar_t). I don't know if this codec is still relevant with PEP 393, because the internal representation now depends on the maximum character (Py_UCS1*, Py_UCS2* or Py_UCS4*), whereas it had a fixed size in Python <= 3.2 (Py_UNICODE*).
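To illustrate how the representation depends on the maximum character, here is a rough Python sketch. The function name pep393_kind is hypothetical and only mimics the width selection described in PEP 393 (1, 2 or 4 bytes per character); it is not the actual C implementation.

```python
def pep393_kind(text):
    """Return the bytes per character (1, 2 or 4) that a PEP 393 string
    would use for `text`. Hypothetical helper, for illustration only."""
    # The storage width is chosen from the largest code point in the string.
    maxchar = max(map(ord, text), default=0)
    if maxchar < 0x100:
        return 1  # Py_UCS1: ASCII / Latin-1 range
    if maxchar < 0x10000:
        return 2  # Py_UCS2: Basic Multilingual Plane
    return 4      # Py_UCS4: code points above U+FFFF


print(pep393_kind("abc"))
print(pep393_kind("\u20ac"))
print(pep393_kind("\U00030003"))
```

With Python <= 3.2, the width was instead fixed at build time by sizeof(Py_UNICODE), whatever the string content.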

Should we:

?

The failure on Windows:

FAIL: test_codecs (test.test_unicode.UnicodeTest)

Traceback (most recent call last):
  File "D:\Buildslave\3.x.moore-windows\build\lib\test\test_unicode.py", line 1408, in test_codecs
    self.assertEqual(str(u.encode(encoding),encoding), u)
AssertionError: '\ud800\udc01\ud840\udc02\ud880\udc03\ud8c0\udc04\ud900\udc05' != '\U00030003\U00040004\U00050005'
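The left-hand side of the assertion is made of UTF-16 surrogate pairs that the decoder left unjoined. As a minimal sketch of the missing step (the helper name combine_surrogates is hypothetical, and this is not the codec's real code), each high/low pair can be merged into one code point like this:

```python
def combine_surrogates(text):
    """Join each high/low surrogate pair in `text` into a single code point.
    Hypothetical helper, for illustration only."""
    out = []
    i = 0
    while i < len(text):
        high = ord(text[i])
        # A high surrogate must be followed by a low surrogate to form a pair.
        if 0xD800 <= high <= 0xDBFF and i + 1 < len(text):
            low = ord(text[i + 1])
            if 0xDC00 <= low <= 0xDFFF:
                code = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
                out.append(chr(code))
                i += 2
                continue
        # Lone surrogates and ordinary characters are copied unchanged.
        out.append(text[i])
        i += 1
    return "".join(out)


joined = combine_surrogates("\ud880\udc03\ud8c0\udc04\ud900\udc05")
print(joined == "\U00030003\U00040004\U00050005")
```

Applied to the last three pairs of the failing value, this gives the three escaped code points shown on the right-hand side; the first two pairs correspond to U+10001 and U+20002.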

Victor


