[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces
Stephen J. Turnbull stephen at xemacs.org
Fri Apr 24 20:40:12 CEST 2009
Antoine Pitrou writes:
> Stephen J. Turnbull <stephen at xemacs.org> writes:
> > Well, the problem is that both parts are false. If you didn't start with a valid string in a known encoding, you shouldn't treat it as characters because it's not. Hand it to a careful API, and you'll get an Exception raised in your face.
> Which "careful API" are you talking about?
> > OTOH, at least some of those who feel lucky and use it naively are going to turn out to be wrong.
> Why will they turn out to be wrong?
To quote the PEP:
""" While providing a uniform API to non-decodable bytes, this interface has the limitation that chosen representation only "works" if the data get converted back to bytes with the python-escape error handler also. Encoding the data with the locale's encoding and the (default) strict error handler will raise an exception, encoding them with UTF-8 will produce non-sensical data.
For most applications, we assume that they eventually pass data received from a system interface back into the same system interfaces. """
But you can't know that. These are now "just strings", which could end up in pickles and other persistent objects, be passed across network interfaces (remote copy, for example), and so on. There is no way to guarantee that the recipient will understand the rules unless the application encapsulates them in some kind of representation that says "I look like a Unicode string, but I'm really just encoded bytes." But the whole point is to turn them into plain old strings so people don't have to bother keeping track.
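A minimal sketch of the failure mode described above, under some assumptions: it uses the handler name `surrogateescape`, which is what released Python 3 calls the handler the quoted draft refers to as python-escape, and the file name and the helper `strict_encode_fails` are hypothetical. Note that current CPython raises on a strict UTF-8 encode of such a string, which may differ from the draft's remark about UTF-8 producing nonsensical data.

```python
import pickle

# Hypothetical file name: Latin-1 bytes that are not valid UTF-8.
raw = b"caf\xe9.txt"

# The escape handler maps the undecodable byte 0xE9 to the lone
# surrogate U+DCE9, so the result looks like an ordinary str.
name = raw.decode("utf-8", "surrogateescape")

# The round trip works only when the same handler is used to encode.
same_rules = name.encode("utf-8", "surrogateescape")


def strict_encode_fails(text):
    """Return True if a strict UTF-8 encode of `text` raises."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


# The string survives pickling unchanged, so it can leave the process
# that knows how it was produced.
restored = pickle.loads(pickle.dumps(name))

# A recipient that does not know the rules and encodes leniently
# gets different bytes without any error being reported.
lossy = restored.encode("utf-8", "replace")

print(same_rules == raw)
print(strict_encode_fails(restored))
print(lossy)
```

Nothing in `restored` records that it must be encoded with the escape handler; that knowledge lives only in the code that produced it.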
As I already said, this is no worse than the current situation, but it gives the impression that Python has a standard "solution". (Yes, I know Martin doesn't claim it's a solution to any of those problems. The point is user perception.)
I have to wonder whether having a standard way of not solving any problems is better than having no standard way of not solving any problems. It may be, and it probably can't hurt, which is why I'm +0.