[Python-Dev] Re: Regression in unicodestr.encode()?

François Pinard pinard@iro.umontreal.ca
10 Apr 2002 10:04:11 -0400


[jepler@unpythonic.dhs.org]

Why Python refuses to do it this way: for security reasons, the UTF-8 codec gives you an "illegal encoding" error in this case.

[...] I'm terribly glad that Python has gotten this detail right.

I'm also glad that Python did it right, not at all because of security reasons (these are debatable -- the trend these days is to see security holes everywhere), but for better conformance with the Unicode specifications.

Since Python is 8-bit clean, this is less of a problem for it than for languages that rely heavily on NUL-terminated C strings. I hope that Python will stick to its current UTF-8 behaviour, even if C extension writers apply some pressure for a change.
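
As a rough illustration of the behaviour in question, here is a minimal sketch for a current Python 3 interpreter. It assumes that the "illegal encoding" mentioned above is the overlong two-byte form 0xC0 0x80 for NUL, which is not spelled out in this message. The helper name decodes_as_utf8 is made up for the example.

```python
# Sketch only: how a current Python 3 UTF-8 codec treats NUL (U+0000).
# Assumption: the "illegal encoding" is the overlong form 0xC0 0x80.

def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the strict UTF-8 decoder accepts `data`."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

# NUL is encoded in its shortest form, a single zero byte.
nul_encoded = "\x00".encode("utf-8")

# The overlong two-byte spelling of NUL is rejected on decoding.
overlong_ok = decodes_as_utf8(b"\xc0\x80")

# Python strings are not NUL-terminated, so an embedded NUL round-trips.
embedded = "a\x00b".encode("utf-8")
```

A C extension that hands such a buffer to a NUL-terminated API would see the string end at the zero byte, which is presumably where the pressure for a different encoding would come from.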

-- François Pinard http://www.iro.umontreal.ca/~pinard