[Python-Dev] str object going in Py3K

James Y Knight foom at fuhm.net
Wed Feb 15 17:48:18 CET 2006

On Feb 15, 2006, at 7:19 AM, Fuzzyman wrote:

[snip..]

> I personally like the move towards all-Unicode strings: basically, any
> text whose encoding you don't know is 'random binary data'. This works
> fine, so long as you are in control of the text source. However, it
> leaves the following problem:
>
> The current situation (treating byte sequences as text and assuming
> they are text in some ASCII-superset encoding) works, albeit with many
> breakages, simply because this assumption is usually correct.
>
> Forcing the programmer to be aware of encodings also pushes the same
> requirement onto the user (who is often the source of the text in
> question).
>
> Currently you can read a text file and process it, making sure that any
> changes/requirements only use ASCII characters. It therefore doesn't
> matter which 8-bit ASCII-superset encoding the original uses. If you
> force the programmer to specify the encoding in order to read the file,
> they would have to pass that requirement on to their user. Their user
> is even less likely to be encoding-aware than the programmer.

Or the programmer can just use "iso-8859-1" and call it done. That
will get you the same "I don't care" behavior as now.
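
A minimal sketch of why that works, assuming Python 3 syntax; the helper
name edit_ascii_only and the sample bytes are made up for illustration
and are not from the thread. ISO-8859-1 maps every byte value 0-255 to
the code point with the same number, so decoding never fails and
encoding back returns the original bytes untouched:

```python
def edit_ascii_only(raw: bytes) -> bytes:
    """Hypothetical helper: edit text of unknown 8-bit encoding.

    Only safe when the real encoding is an ASCII superset and the
    edit itself touches nothing but ASCII characters.
    """
    text = raw.decode("iso-8859-1")          # never raises: all 256 bytes are valid
    text = text.replace("colour", "color")   # ASCII-only change
    return text.encode("iso-8859-1")         # non-ASCII bytes come back unchanged


# Round trip is lossless for every possible byte value.
all_bytes = bytes(range(256))
assert all_bytes.decode("iso-8859-1").encode("iso-8859-1") == all_bytes

# The same call works whether the input happened to be Latin-1 or UTF-8.
latin1_input = b"colour: caf\xe9"
utf8_input = b"colour: caf\xc3\xa9"
assert edit_ascii_only(latin1_input) == b"color: caf\xe9"
assert edit_ascii_only(utf8_input) == b"color: caf\xc3\xa9"
```

The catch is that the non-ASCII characters in the decoded string are
only meaningful if the input really was ISO-8859-1; for anything else
they are placeholders that merely survive the round trip.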

James
