[Python-Dev] bytes / unicode

James Y Knight foom at fuhm.net
Tue Jun 22 20:07:18 CEST 2010


On Jun 22, 2010, at 1:03 PM, Ian Bicking wrote:

Similarly, I'd expect (from experience) a programmer using Python to want to take the same approach, sticking with unencoded data in nearly all situations.

Yeah. This is a real issue I have with the direction Python3 went: it
pushes you into decoding everything to unicode early, even when you
don't care -- all you really wanted to do is pass it from one API to
another, with some well-defined transformations, which don't actually
depend on it having been decoded properly. (For example, extracting
the path from the URL and attempting to open it as a file on the
filesystem.)
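
A minimal sketch of that URL-to-file example, kept in bytes from end to end. It assumes Python 3.2 or later, where urllib.parse accepts bytes input; the helper name url_to_fs_path, the example URL, and the /srv/files root are hypothetical, and no protection against ".." path components is attempted.

```python
import os
from urllib.parse import urlsplit, unquote_to_bytes


def url_to_fs_path(url: bytes, root: bytes) -> bytes:
    """Map a URL's path onto a directory without decoding it to str.

    Hypothetical helper for illustration; it does no '..' or symlink checks.
    """
    # urlsplit returns bytes components when given bytes (ASCII-only input).
    quoted_path = urlsplit(url).path
    # Percent-escapes become raw bytes; no character encoding is assumed.
    raw_path = unquote_to_bytes(quoted_path)
    return os.path.join(root, raw_path.lstrip(b"/"))


# %E9 is Latin-1 "e acute", which is not valid UTF-8 on its own.
path = url_to_fs_path(b"http://example.com/caf%E9.txt", b"/srv/files")
```

The resulting bytes path could be handed straight to open(path, "rb"), since the os and io APIs accept bytes filenames; nothing along the way needs to know which encoding the name was written in.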

This means that Python3 programs can become more fragile in the face
of random data you encounter out in the real world, rather than less
fragile, which was the goal of the whole exercise.

The surrogateescape method is a nice workaround for this, but I can't
help thinking that it might've been better to just treat stuff as
possibly-invalid-but-probably-utf8 byte-strings from input, through
processing, to output. It seems kinda too late for that, though: next
time someone designs a language, they can try that. :)
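
For reference, a small sketch of what the surrogateescape error handler (PEP 383) does with bytes that are not valid UTF-8. The function name roundtrip and the sample filename are made up for illustration.

```python
def roundtrip(raw: bytes) -> bytes:
    """Decode possibly-invalid UTF-8 to str and encode it back unchanged."""
    # Undecodable bytes are mapped to lone surrogates U+DC80..U+DCFF.
    text = raw.decode("utf-8", errors="surrogateescape")
    # Encoding with the same handler turns those surrogates back into bytes.
    return text.encode("utf-8", errors="surrogateescape")


raw = b"caf\xe9.txt"  # Latin-1 bytes, not valid UTF-8
text = raw.decode("utf-8", errors="surrogateescape")
restored = roundtrip(raw)

# A strict encode of the escaped string fails, so the str is not "clean" text.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

The bytes survive the trip, but the intermediate str contains a lone surrogate, so any API that encodes it strictly will still raise.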

James
