[Python-Dev] Bytes path support
Nikolaus Rath Nikolaus at rath.org
Wed Aug 27 03:39:35 CEST 2014
- Previous message: [Python-Dev] Bytes path support
- Next message: [Python-Dev] Bytes path support
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Nick Coghlan <ncoghlan at gmail.com> writes:
>>> As some examples of where bilingual computing breaks down:
>>>
>>> * My NFS client and server may have different locale settings
>>> * My FTP client and server may have different locale settings
>>> * My SSH client and server may have different locale settings
>>> * I save a file locally and send it to someone with a different
>>>   locale setting
>>> * I attempt to access a Windows share from a Linux client (or
>>>   vice-versa)
>>> * I clone my POSIX hosted git or Mercurial repository on a Windows
>>>   client
>>> * I have to connect my Linux client to a Windows Active Directory
>>>   domain (or vice-versa)
>>> * I have to interoperate between native code and JVM code
>>>
>>> The entire computing industry is currently struggling with this
>>> monolingual (ASCII/Extended ASCII/EBCDIC/etc) -> bilingual (locale
>>> encoding/code pages) -> multilingual (Unicode) transition. It's been
>>> going on for decades, and it's still going to be quite some time
>>> before we're done.
>>>
>>> The POSIX world is slowly clawing its way towards a multilingual
>>> model that actually works: UTF-8
>>> Windows (including the CLR) and the JVM adopted a different
>>> multilingual model, but still one that actually works: UTF-16-LE
>>
>> Nick, I think the first half of your post is one of the clearest
>> expositions yet of 'why Python 3' (in particular, the str to unicode
>> change). It is worthy of wider distribution and without much change,
>> it would be a great blog post.
>
> Indeed, I had the same idea - I had been assuming users already
> understood this context, which is almost certainly an invalid
> assumption.
>
> The blog post version is already mostly written, but I ran out of
> weekend. Will hopefully finish it up and post it some time in the
> next few days :)
In that case, maybe it'd be nice to also explain why you use the term "bilingual" for codepage-based encodings. At least to me, a codepage/locale is pretty monolingual, or alternatively covers a whole region (e.g. Western Europe). I figure that by "bilingual" you mean ASCII + something, but that's mostly a guess on my side.
Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
»Time flies like an arrow, fruit flies like a Banana.«
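
For readers who want to see the locale mismatch cases from the quoted list in concrete terms, here is a minimal Python 3 sketch. It is only an illustration under assumptions: the filename and the choice of UTF-8, Latin-1 and KOI8-R as stand-ins for "two machines with different locale settings" are made up for the example and do not come from the thread. It also shows one possible reading of the "ASCII + something" guess above: the ASCII range encodes identically under all three, and only the rest diverges.

```python
# Hypothetical filename; stands in for a file created on one machine
# and read on another with a different locale encoding.
name = "caf\u00e9.txt"

# The same text becomes different bytes under different locale encodings.
utf8_bytes = name.encode("utf-8")      # what a UTF-8 locale would write
latin1_bytes = name.encode("latin-1")  # what a Latin-1 locale would write

# A Latin-1 peer reading the UTF-8 bytes gets mojibake, not an error.
seen_as_latin1 = utf8_bytes.decode("latin-1")

# A UTF-8 peer reading the Latin-1 bytes fails outright.
try:
    latin1_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# ASCII-only names are byte-for-byte identical in all three encodings,
# which is why such mismatches can go unnoticed for a long time.
ascii_name = "notes.txt"
ascii_same = (
    ascii_name.encode("utf-8")
    == ascii_name.encode("latin-1")
    == ascii_name.encode("koi8-r")
)

print(utf8_bytes, latin1_bytes)
print(repr(seen_as_latin1), decoded_ok, ascii_same)
```

The mojibake case is arguably the nastier one, since nothing raises and the wrong name is silently passed along.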