Relative URIs are resolved incorrectly after redirects · Issue #130 · RDFLib/rdflib

vfaronov, 2010-09-10T23:23:28.000Z

What steps will reproduce the problem?

  1. Prepare a resource http://example.org/foo serving up an RDF description that contains relative URIs, for example <#frag1>.
  2. Prepare a resource http://example.org/bar that redirects (for example, HTTP 301) to http://example.org/foo.
  3. Use RDFLib's Graph.parse() to parse http://example.org/bar.

What is the expected output? What do you see instead?

I expect the "real" URI http://example.org/foo to be used as the base URI, giving absolute URIs of the form http://example.org/foo#frag1. Instead, RDFLib uses the original requested URI http://example.org/bar as the base, giving http://example.org/bar#frag1.
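
The difference between the expected and the observed result can be illustrated with the standard library alone. This is a minimal sketch in Python 3 using `urllib.parse.urljoin`, not RDFLib's own resolution code; the variable names are illustrative.

```python
from urllib.parse import urljoin

requested_uri = "http://example.org/bar"  # URI passed to Graph.parse()
retrieved_uri = "http://example.org/foo"  # URI the redirect finally led to
relative_ref = "#frag1"

# Expected: resolve against the URI that actually served the document.
expected = urljoin(retrieved_uri, relative_ref)

# Observed: resolution against the originally requested URI.
observed = urljoin(requested_uri, relative_ref)

print(expected)  # http://example.org/foo#frag1
print(observed)  # http://example.org/bar#frag1
```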

What version of the product are you using? On what operating system?

RDFLib trunk (r1895) on GNU/Linux.

Please provide any additional information below.

RFC 3986 Uniform Resource Identifier (URI): Generic Syntax
http://tools.ietf.org/html/rfc3986#section-5.1.3

"Note that if the retrieval was the result of a redirected request, the last URI used (i.e., the URI that resulted in the actual retrieval of the representation) is the base URI."

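The rule quoted above can be tried out without RDFLib. The sketch below (Python 3, standard library only) starts a throwaway local server whose `/bar` path answers with a 301 to `/foo`, then reads the final URI from the response with `geturl()`. The handler class, the paths, and the served body are assumptions made for illustration; this is not the code path RDFLib uses.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urljoin
from urllib.request import ProxyHandler, build_opener


class RedirectingHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /bar redirects (301) to /foo."""

    def do_GET(self):
        if self.path == "/bar":
            self.send_response(301)
            self.send_header("Location", "/foo")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"<#frag1> <#p> <#o> ."
            self.send_response(200)
            self.send_header("Content-Type", "text/turtle")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
origin = "http://127.0.0.1:%d" % server.server_port

opener = build_opener(ProxyHandler({}))  # ignore proxy settings for the local server
response = opener.open(origin + "/bar")  # the 301 is followed automatically
base_uri = response.geturl()             # last URI used, i.e. the base URI per RFC 3986
response.close()
server.shutdown()
server.server_close()

resolved = urljoin(base_uri, "#frag1")
print(base_uri)   # ends with /foo, not /bar
print(resolved)   # ends with /foo#frag1
```
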
Comment 1 by vfaronov

For a working example, see

 <http://linked-data.ru/example>

which 301s to

 <http://linked-data.ru/example/>

(RDFa).

Comment 2 by vfaronov

First attempt at a patch.
This changes the base URI resolution logic a bit, and I'm not 100% sure it doesn't break anything.

Index: rdflib/parser.py

--- rdflib/parser.py (revision 1895)
+++ rdflib/parser.py (working copy)
@@ -94,9 +94,11 @@
         except HTTPError, e:
             # TODO:
             raise Exception('"%s" while trying to open "%s"' % (e, self.url))
@@ -147,6 +149,8 @@
         else:
             raise Exception("Unexpected type '%s' for source '%s'" % (type(source), source))
@@ -155,7 +159,6 @@
             file = __builtin__.file(filename, "rb")
         else:
             input_source = URLInputSource(absolute_location, format)
@@ -168,13 +171,11 @@
     if input_source is None:
         raise Exception("could not create InputSource")
     else:
-        if publicID:
+        if publicID is not None:
             input_source.setPublicId(publicID)
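
The last hunk replaces a truthiness test on `publicID` with an explicit `None` check. The sketch below (Python 3, with hypothetical helper names; it is not RDFLib code) shows where the two conditions differ: only for falsy values other than `None`, such as an empty string. Whether an empty `publicID` is a case the patch means to cover is not stated in the comment.

```python
def would_set_old(publicID):
    # Old condition (`if publicID:`): any falsy value, including "", is skipped.
    return bool(publicID)


def would_set_new(publicID):
    # New condition (`if publicID is not None:`): only None is skipped.
    return publicID is not None


for candidate in (None, "", "http://example.org/foo"):
    print(repr(candidate), would_set_old(candidate), would_set_new(candidate))
```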