[Python-Dev] Python-3.0, unicode, and os.environ

Nick Coghlan ncoghlan at gmail.com
Fri Dec 5 10:21:32 CET 2008


glyph at divmod.com wrote:

> At least this time I think I've encapsulated pretty much my entire argument here, so if you don't buy it, we can probably just agree to disagree :).

Glyph, the only point I would add to your message is this one:

Adding a "blessed" way to encode arbitrary binary data into a Python 3.0 str object strikes me as giving up on one of the key advances in the new version of the language.

8-bit strings were a problem in Python 2.x because they blurred the boundary between arbitrary binary data and ASCII or latin-1 character data.

One of the most interesting aspects of Python 3.0 is its attempt to get developers to be explicit about this distinction (both in the code and in their own minds) by enforcing separation between arbitrary binary data (held in bytes and bytearray instances) and character data (held in str instances).
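As a rough illustration of that enforced separation, here is a minimal sketch assuming a Python 3 interpreter. The variable names and the try_concat helper are hypothetical and exist only for this example.

```python
# Illustrative sketch only: names here are hypothetical, not from the thread.
text = "café"                  # character data lives in a str
data = text.encode("utf-8")    # binary data lives in bytes

def try_concat(a, b):
    """Return a + b, or the name of the exception that mixing types raises."""
    try:
        return a + b
    except TypeError as exc:
        return type(exc).__name__

# Python 3 refuses to combine str and bytes implicitly.
mixed = try_concat(text, data)

# Crossing the boundary requires naming an encoding explicitly.
round_trip = data.decode("utf-8")
```

Here mixed ends up holding the string "TypeError" rather than a silently combined value, and round_trip equals the original text only because the decode step names the same encoding that was used to encode.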

I don't understand how tunneling arbitrary binary data through str instances (regardless of encoding mechanism) can possibly fail to recreate exactly the same "is it text or binary data?" ambiguity problems that the str/bytes split is intended to eliminate. And if that happens, then what exactly was the point of moving to an all-Unicode string model for Py3k?
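The ambiguity can be sketched as follows, assuming Python 3 and using Latin-1 purely as one illustrative tunneling mechanism. It is not necessarily the scheme proposed elsewhere in this thread, and the variable names are hypothetical.

```python
# Illustrative sketch only: Latin-1 stands in for "some encoding mechanism".
raw = bytes([0x89, 0x50, 0x4E, 0x47, 0xFF])   # arbitrary binary data, not text

# Latin-1 maps every byte value to a code point, so this decode never fails.
tunneled = raw.decode("latin-1")

# A str that really is character data, with the same code points.
genuine = "\x89PNG\xff"

# Both are plain str objects and compare equal, so neither the type nor the
# value records which one started life as binary data.
same_type = type(tunneled) is type(genuine)
same_value = tunneled == genuine

# The bytes can be recovered, but only by code that already knows to do so.
recovered = tunneled.encode("latin-1")
```

Once the data is inside a str, code receiving it has no way to tell tunneled from genuine, which is the question the str/bytes split was meant to settle by type alone.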

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


