[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Cameron Simpson cs at zip.com.au
Wed Apr 29 04:40:26 CEST 2009


On 28Apr2009 14:37, Thomas Breuel <tmbdev at gmail.com> wrote:
| But the biggest problem with the proposal is that it isn't needed: if you
| want to be able to turn arbitrary byte sequences into unicode strings and
| back, just set your encoding to iso8859-15. That already works and it
| doesn't require any changes.

No it doesn't. It does transcode without throwing exceptions. On POSIX. (On Windows? I doubt it - Windows isn't using an 8-bit scheme, I believe.) But it utterly destroys any hope of working nicely in any other locale. The PEP lets you work losslessly in other locales.
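To make the difference concrete, here is a minimal sketch. It assumes a Python 3 interpreter in which the PEP's error handler is available under the name "surrogateescape"; the filenames are made up for illustration. It shows that iso8859-15 does round-trip arbitrary bytes but mangles a name that is really UTF-8, while the escaping approach round-trips the bad bytes and still decodes the UTF-8 name correctly.

```python
# Hypothetical filename bytes: Latin-1 style content, not valid UTF-8.
raw = b"caf\xe9-\xff.txt"

# The iso8859-15 approach: every byte maps to some character, so decoding
# never raises and the bytes round-trip.
latin_text = raw.decode("iso8859-15")
latin_round_trip = latin_text.encode("iso8859-15")

# The cost: a name that really is UTF-8 comes out as mojibake.
utf8_name = "caf\u00e9.txt".encode("utf-8")
mangled = utf8_name.decode("iso8859-15")

# The escaping approach (assumed handler name: "surrogateescape"):
# undecodable bytes become lone surrogates and are restored on encoding.
escaped_text = raw.decode("utf-8", errors="surrogateescape")
escaped_round_trip = escaped_text.encode("utf-8", errors="surrogateescape")

# Valid UTF-8 is decoded as ordinary text, untouched by the handler.
clean = utf8_name.decode("utf-8", errors="surrogateescape")

print(ascii(mangled))
print(ascii(escaped_text))
print(ascii(clean))
```

Both approaches give back the original bytes; only the second one also gives you the right text when the locale's encoding is not an 8-bit one.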

It may require some app care for particular very weird strings that don't come from the filesystem, but as far as I can see only in circumstances where such care would be needed anyway, i.e. you've got to do special stuff for weirdness in the first place. Weird == "ill-formed Unicode string" here.
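As a rough illustration of the kind of app care meant here, the sketch below checks whether a string is ill-formed (contains lone surrogates) by attempting a strict UTF-8 encode. The helper name is_well_formed is hypothetical, and the handler name "surrogateescape" is again an assumption about the interpreter in use.

```python
def is_well_formed(text):
    """Return True if text can be encoded as strict UTF-8 (no lone surrogates)."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


# A name decoded from undecodable bytes carries a lone surrogate.
from_filesystem = b"\xff.log".decode("utf-8", errors="surrogateescape")

print(is_well_formed("plain.log"))
print(is_well_formed(from_filesystem))

# Handing the string back to the system with the same handler
# restores the original bytes.
restored = from_filesystem.encode("utf-8", errors="surrogateescape")
```

An app that only passes such strings back to the filesystem never needs the check; it matters when the string is about to be written somewhere that demands well-formed Unicode.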

Cheers,

Cameron Simpson <cs at zip.com.au> DoD#743 http://www.cskk.ezoshosting.com/cs/

I just kept it wide-open thinking it would correct itself. Then I ran out of talent. - C. Fittipaldi


