[Python-Dev] UCS2/UCS4 default
Guido van Rossum guido at python.org
Wed Jul 2 20:47:02 CEST 2008
On Wed, Jul 2, 2008 at 11:35 AM, Jeroen Ruigrok van der Werven <asmodai at in-nomine.org> wrote:
> -On [20080702 20:27], Guido van Rossum (guido at python.org) wrote:
> > I disagree. Instead, I would say that such code needs to be aware of surrogates.
>
> Just to make sure I understood you: Python's code needs to be made aware of surrogates?
No, Python already is aware of surrogates. I meant applications processing non-BMP text should beware of them.
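To make the term concrete, here is a minimal sketch of how a non-BMP code point maps onto a surrogate pair. It assumes a modern Python 3 interpreter and inspects the UTF-16 encoding to stand in for what a narrow (UCS2) build stores; the helper names to_surrogate_pair and utf16_units are hypothetical and not part of any Python API.

```python
def to_surrogate_pair(code_point):
    """Split a non-BMP code point into its (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


# U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the BMP.
high, low = to_surrogate_pair(0x1D11E)
print(hex(high), hex(low))

# One non-BMP character occupies two code units, so three characters
# become four units in a UTF-16 based representation.
print(utf16_units("a\U0001D11Eb"))
```

Code that counts or indexes by code unit will therefore see one more "character" per non-BMP code point than a reader would.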
> If so, do you want me to log issues for the things encountered?
If you find places where the Python core or standard library is doing Unicode processing that would break when surrogates are present, you should file a bug. However, this does not mean that every bit of code that slices a string at an arbitrary point (and hence risks slicing in the middle of a surrogate pair) is incorrect -- it all depends on what is done next with the slice.
I'd also prefer to receive bug reports about breakages actually encountered in the wild than purely theoretical issues. And in all cases a fragment of test code to reproduce the problem would be appreciated.
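The slicing point can be illustrated with a short fragment. This is only a sketch under stated assumptions: it runs on a modern Python 3 interpreter and simulates a narrow build's storage with a list of UTF-16 code units, and the helpers utf16_units, decode_units and splits_pair are hypothetical names introduced for the example.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def decode_units(units):
    """Strictly decode a list of UTF-16 code units back into a str."""
    data = b"".join(unit.to_bytes(2, "little") for unit in units)
    return data.decode("utf-16-le")


def splits_pair(units, index):
    """Return True if cutting units at index separates a surrogate pair."""
    return (0 < index < len(units)
            and 0xD800 <= units[index - 1] <= 0xDBFF   # high surrogate
            and 0xDC00 <= units[index] <= 0xDFFF)      # low surrogate


original = "a\U0001D11Eb"
units = utf16_units(original)

# Cut in the middle of the pair.
left, right = units[:2], units[2:]
print(splits_pair(units, 2))

# Harmless: the halves are only stored and joined again.
print(decode_units(left + right) == original)

# Harmful: one half is interpreted on its own.
try:
    decode_units(left)
except UnicodeDecodeError as exc:
    print("broken slice:", exc.reason)
```

The same cut is fine in the first use and an error in the second, which is why a reproducing fragment should show what is done with the slice, not just the slice itself.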
> --
> Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
> イェルーン ラウフロック ヴァン デル ウェルヴェン
> http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
> Learn from the past -- don't wear it like a yoke around your neck...
--
--Guido van Rossum (home page: http://www.python.org/~guido/)