[Python-Dev] bytes / unicode
Bill Janssen janssen at parc.com
Wed Jun 23 18:11:05 CEST 2010
Tres Seaver <tseaver at palladion.com> wrote:
> Stephen J. Turnbull wrote:
> > We do need str-based implementations of modules like urllib.
>
> Why would that be? URLs aren't text, and never will be. The fact that to the eye they may seem to be text-ish doesn't make them text. This
URLs are exactly text (strings, representable as Unicode strings in Py3K), and were designed as such from the start. The fact that some of the things tunneled or carried in URLs are string representations of non-string data shouldn't obscure that point. They're not "text-ish", they're text. They're not opaque, either; they break down in well-specified ways, mainly into strings.
The trouble comes in when we try to go beyond the spec, or handle things that don't conform to the spec. Sure, a path component of a URI might actually be a %-escaped sequence of arbitrary bytes, even bytes that don't represent a string in any known encoding, but that's only after reversing the %-escapes, which should happen in a scheme-specific piece of code, not in generic URL parsing or manipulation.
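As a rough illustration of that split, here is a minimal sketch using the Python 3 standard library. The URL and the decode_path_segments helper are made up for the example, and treating each path segment as arbitrary bytes is an assumed scheme-specific policy, not something the URL spec dictates. Generic parsing stays entirely in str; only the scheme-specific step reverses the %-escapes.

```python
from urllib.parse import urlsplit, unquote_to_bytes

# Hypothetical URL: the path carries %-escaped bytes that are not valid UTF-8.
url = "http://example.com/files/%FF%FE/report.txt?rev=3"

# Generic step: split the URL as text. No %-escape is reversed here,
# so every component is still an ordinary str.
parts = urlsplit(url)
segments = parts.path.split("/")


def decode_path_segments(segments):
    """Scheme-specific step (assumed policy): each path segment is
    arbitrary bytes, so reverse the %-escapes into bytes objects."""
    return [unquote_to_bytes(segment) for segment in segments]


raw_segments = decode_path_segments(segments)

print(parts.scheme, parts.netloc, parts.path, parts.query)
print(raw_segments)
```

Splitting on "/" before unescaping matters: a segment containing an escaped %2F would otherwise be cut in the wrong place.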
Bill