Python 2.6.1 (r261:67515, Jul 7 2009, 23:51:51)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from urlparse import urljoin
>>> urljoin("http://", "somedomain.com")
'http:///somedomain.com'

Note the three leading slashes; the result should be "http://somedomain.com".
urlparse.urljoin() is meant to join a base URL with other URL parts. The protocol is part of the base URL, so using the protocol alone as the base is not supported by urlparse.urljoin().
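For illustration, a minimal sketch of that intended usage, with a complete base URL and a relative reference. The example URLs are made up for this sketch, and the import fallback assumes the same function lives in urllib.parse on Python 3:

```python
try:
    from urllib.parse import urljoin  # Python 3 location
except ImportError:
    from urlparse import urljoin  # Python 2, as in the report

# Hypothetical base URL, chosen only for this example.
base = "http://somedomain.com/a/b"

# A relative path replaces the last segment of the base path.
sibling = urljoin(base, "c")

# An absolute path replaces the whole base path.
rooted = urljoin(base, "/c")

print(sibling)  # http://somedomain.com/a/c
print(rooted)   # http://somedomain.com/c
```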
Both http: and http:// are valid base URIs; see RFC 3986. More to the point, using a scheme as a base URI is useful, since many users omit the http:// from their URIs.
>>> urljoin("http://", "//somedomain.com")
'http://somedomain.com'

So I wonder whether this is the proper way to do it: put the host in the relative URL, written as a network-path reference, rather than changing how the base URL is handled. The urlparse test suite tries to follow all the scenarios advertised in RFC 3986, plus some additional tests that usually come from the de facto behavior of other clients (mainly browsers). If there is a browser behavior that we should emulate without breaking existing code, we should consider it; otherwise we could leave this issue marked as invalid. Thanks!
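A minimal sketch of how the "//" form above could be wrapped for callers who only have a bare host. The helper name join_scheme_and_host is hypothetical and not part of urlparse, and it assumes the host string carries no scheme of its own:

```python
try:
    from urllib.parse import urljoin  # Python 3 location
except ImportError:
    from urlparse import urljoin  # Python 2, as in the report


def join_scheme_and_host(scheme_base, host):
    """Hypothetical helper: join a scheme-only base such as "http://"
    with a bare host such as "somedomain.com".

    Assumes `host` has no scheme. The host is turned into a
    network-path reference ("//host") so urljoin treats it as the
    authority rather than as a path segment.
    """
    if not host.startswith("//"):
        host = "//" + host
    return urljoin(scheme_base, host)


print(join_scheme_and_host("http://", "somedomain.com"))
# http://somedomain.com
```

This leaves urljoin itself unchanged, so existing callers are unaffected.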