[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Stephen J. Turnbull stephen at xemacs.org
Fri Apr 24 19:25:03 CEST 2009


Paul Moore writes:

> The pros for Martin's proposal are a uniform cross-platform interface, and a user-friendly API for the common case.

A more accurate phrasing would be "... a user-friendly API for those who feel very lucky today." Which is the common case, of course, but spins a little differently.

> [1] Actually, all the PEP says is "With this PEP, a uniform treatment of these data as characters becomes possible." An argument as to why this is a good thing would be a useful addition to the PEP. At the moment it's more or less treated as self-evident.

Well, the problem is that both parts are false. If you didn't start with a valid string in a known encoding, you shouldn't treat it as characters because it's not. Hand it to a careful API, and you'll get an Exception raised in your face. And that's precisely why it's not obviously a good thing. Careful clients will have to treat it as "transcoded bytes", and so the people who develop those clients get no benefit. OTOH, at least some of those who feel lucky and use it naively are going to turn out to be wrong.
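To make the "Exception raised in your face" point concrete, here is a minimal sketch. It assumes the error handler that eventually shipped with PEP 383 in Python 3.1 under the name "surrogateescape"; that name does not appear in this message, and the file name below is made up for illustration.

```python
# Hypothetical file name: Latin-1 bytes that are not valid UTF-8.
raw = b"caf\xe9.txt"

# The "lucky" path: the undecodable byte 0xE9 is smuggled into the
# string as the lone surrogate U+DCE9 instead of raising an error.
name = raw.decode("utf-8", errors="surrogateescape")

# A careful API that insists on real characters rejects the string.
try:
    name.encode("utf-8")  # strict error handling is the default
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# A client that knows it holds "transcoded bytes" can recover the
# original bytes by encoding with the same error handler.
roundtrip = name.encode("utf-8", errors="surrogateescape")
```

Under these assumptions the string looks like text but only round-trips through code that knows about the escaping, which is the sense in which careful clients still have to treat it as bytes.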

That said, I'm +0 on the PEP as is. It's a little bit better than the current situation in that developers who would otherwise just punt on dealing with the other world (i.e., Windows for Unix hackers, and Unix for Windows coders) will have a unified interface, so it'll maybe work automagically (when you're lucky :-) in that other world, too. And if somebody comes up with an idea of true genius for handling the underlying problem, or even just a slight practical improvement, then everybody who uses this API can benefit simply by upgrading Python.


