[Python-Dev] urlparse.urlunsplit should be smarter about +
Stephen J. Turnbull stephen at xemacs.org
Sun May 9 14:15:38 CEST 2010
John Arbash Meinel writes:
> Stephen J. Turnbull wrote:
> > David Abrahams writes:
> > > This is a bug report. bugs.python.org seems to be down.
> > >
> > >     >>> from urlparse import *
> > >     >>> urlunsplit(urlsplit('git+file:///foo/bar/baz'))
> > >     git+file:/foo/bar/baz
> > >
> > > Note the dropped slashes after the colon.
> >
> > That's clearly wrong, but what does "+" have to do with it? AFAIK, the only thing special about + in scheme names is that it's not allowed as the first character.
>
> Don't you need to register the "git+file:///" url for urlparse to properly split it?
>
>     if protocol not in urlparse.uses_netloc:
>         urlparse.uses_netloc.append(protocol)
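For reference, the registration workaround suggested above can be tried as a small script. This is only a sketch: the import fallback is an assumption added so it runs under either module name (`urlparse` in Python 2, `urllib.parse` in Python 3), and `protocol`, `url`, and `roundtrip` are illustrative names. It mutates a module-level list, so it affects every caller in the process.

```python
try:
    import urlparse  # Python 2 module name, as used in this thread
except ImportError:
    import urllib.parse as urlparse  # same module under its Python 3 name

protocol = 'git+file'

# Register the scheme as one that takes a network location ("//" part).
if protocol not in urlparse.uses_netloc:
    urlparse.uses_netloc.append(protocol)

url = 'git+file:///foo/bar/baz'

# Split and recompose; with the scheme registered, the empty
# authority is written back out as "//".
roundtrip = urlparse.urlunsplit(urlparse.urlsplit(url))
print(roundtrip)
```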
I don't know about the urlparse implementation, but from the point of view of the RFC I think not. Either BCP 35 or RFC 3986 (or maybe both) makes it plain that if the scheme name is followed by "://", the scheme is a hierarchical one. So that URL should parse with an empty authority, and be recomposed the same. I would do this by parsing 'git+file:///foo/bar/baz' to ('git+file', '', '/foo/bar/baz') or something like that, and 'git+file:/foo/bar/baz' to ('git+file', None, '/foo/bar/baz').
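A minimal sketch of that parsing scheme, assuming the generic regular expression from RFC 3986 Appendix B and the recomposition steps from Section 5.3. The names `split_uri` and `unsplit_uri` are hypothetical and are not part of urlparse; the tuples here also carry query and fragment, which the three-element tuples above leave out.

```python
import re

# Generic URI regular expression from RFC 3986, Appendix B.
# Group 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
URI_RE = re.compile(
    r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?'
)

def split_uri(uri):
    """Hypothetical helper: split a URI, keeping None for absent parts."""
    m = URI_RE.match(uri)
    # An unmatched optional group is None, so a missing authority (no "//")
    # stays distinct from an empty one ("//" followed by nothing).
    return (m.group(2), m.group(4), m.group(5), m.group(7), m.group(9))

def unsplit_uri(parts):
    """Hypothetical helper: recompose following RFC 3986, Section 5.3."""
    scheme, authority, path, query, fragment = parts
    out = ''
    if scheme is not None:
        out += scheme + ':'
    if authority is not None:  # empty string still emits "//"
        out += '//' + authority
    out += path
    if query is not None:
        out += '?' + query
    if fragment is not None:
        out += '#' + fragment
    return out

for uri in ('git+file:///foo/bar/baz', 'git+file:/foo/bar/baz'):
    parts = split_uri(uri)
    print(parts[:3], unsplit_uri(parts) == uri)
```

Because the absent authority is represented as None rather than '', neither form is rewritten into the other on the way back out.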
I don't see any reason why implementations should abbreviate the empty authority by removing the double slashes, unless the scheme definition specifies it. That said, my reading of RFC 3986 is that a missing authority (no "//") should be dereferenced in the same way as an empty one:
    If the URI scheme defines a default for host, then that default
    applies when the host subcomponent is undefined or when the
    registered name is empty (zero length). (Sec. 3.2.2)
I don't see why urlparse should try to enforce that by converting from one to the other.
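To illustrate the distinction, here is a rough sketch of where the Sec. 3.2.2 rule could live: in the code that dereferences a URI, not in the parser. The function `effective_host` and the `scheme_default` parameter are hypothetical names; whether a scheme defines a default host at all depends on the scheme.

```python
def effective_host(host, scheme_default):
    """Hypothetical helper: pick the host to use when dereferencing.

    host is None when the URI had no "//" authority at all, and ''
    when the authority was present but empty. Per RFC 3986 Sec. 3.2.2
    the scheme's default (if it defines one) applies in both cases.
    """
    if host is None or host == '':
        return scheme_default
    return host

# Both forms resolve to the same host, yet the parsed values stay
# distinct, so recomposing a URI need not convert one into the other.
print(effective_host(None, 'localhost'))
print(effective_host('', 'localhost'))
print(effective_host('example.org', 'localhost'))
```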