Serialized turtle file fails parsing · Issue #345 · RDFLib/rdflib
I am exploring a migration from rdflib-3.2.3 to rdflib-4.0.1 in our project. During testing I noticed that if I serialize a graph to Turtle and then try to reload the result, it fails at the parsing stage. I believe this is caused by the auto-generated prefixes such as "ns1" and "ns2". Here is a small test case that reproduces the behavior in rdflib-4.0.1.
>>> import rdflib
>>> inp_data = """
... <http://www.example.com/foo/user.id?id=abcdefgh> a <http://www.example.com/foo#user.id>;
... <http://www.example.com/bar#date.dob> "01/01/2001" ;
... <http://www.example.com/bar#date.start> "10/10/2010" ;
... <http://www.example.com/bar#date.end> "" ;
... <http://www.example.com/bar#emp.room> "room 45";
... <http://www.example.com/foo#type> "temp" .
... """
>>> g = rdflib.Graph().parse(data=inp_data, format="turtle")
>>> new_data = g.serialize(format="turtle")
>>> new_g = rdflib.Graph().parse(data=new_data, format="turtle")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/bharath/rdflib-4.0.1/lib/python2.7/site-packages/rdflib/graph.py", line 1002, in parse
    parser.parse(source, self, **args)
  File "/home/bharath/rdflib-4.0.1/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 1739, in parse
    p.loadStream(source.getByteStream())
  File "/home/bharath/rdflib-4.0.1/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 398, in loadStream
    return self.loadBuf(stream.read()) # Not ideal
  File "/home/bharath/rdflib-4.0.1/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 404, in loadBuf
    self.feed(buf)
  File "/home/bharath/rdflib-4.0.1/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 430, in feed
    i = self.directiveOrStatement(s, j)
  File "/home/bharath/rdflib-4.0.1/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 447, in directiveOrStatement
    j = self.statement(argstr, i)
  File "/home/bharath/rdflib-4.0.1/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 608, in statement
    j = self.property_list(argstr, i, r[0])
  File "/home/bharath/rdflib-4.0.1/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 975, in property_list
    i = self.objectList(argstr, j, objs)
  File "/home/bharath/rdflib-4.0.1/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 1027, in objectList
    i = self.object(argstr, i, res)
  File "/home/bharath/rdflib-4.0.1/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 1259, in object
    j = self.subject(argstr, i, res)
  File "/home/bharath/rdflib-4.0.1/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 617, in subject
    return self.item(argstr, i, res)
  File "/home/bharath/rdflib-4.0.1/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 703, in item
    return self.path(argstr, i, res)
  File "/home/bharath/rdflib-4.0.1/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 733, in path
    "EOF found in middle of path syntax")
rdflib.plugins.parsers.notation3.BadSyntax: at line 8 of <>:
Bad syntax (EOF found in middle of path syntax) at ^ in:
"@prefix ns1: <http://www.example.com/foo#> .
@prefix ns2: <http://www.example.com/bar#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<http://www.example.com/foo/user.id?id=abcdefgh> a ns1:user.id ;
ns2:date.dob "01/01/2001" ;
ns2:date.end "" ;
ns2:date.start "10/10/2010" ;
ns2:emp.room "room 45" ;
ns1:type "temp" .
^..."
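
To make the suspected cause easier to check, here is a minimal standard-library sketch of the abbreviation step. It rests on an assumption that the report does not confirm: that the parser stumbles on the dots in the generated prefixed names (`ns1:user.id`, `ns2:date.dob`), which is what the "path syntax" wording in the error suggests. The names `SAFE_LOCAL` and `abbreviate` are hypothetical and are not part of rdflib; the sketch only shows a conservative rule that falls back to the full IRI whenever the local part contains anything other than letters, digits, underscores, or hyphens.

```python
import re

# Deliberately conservative local-name pattern (an assumption, not the Turtle
# grammar): letters, digits, underscore and hyphen only, so no dots.
SAFE_LOCAL = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*\Z")


def abbreviate(iri, prefixes):
    """Return 'prefix:local' when the local part looks safe, else '<iri>'.

    `prefixes` maps a prefix label to its namespace IRI.
    """
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace):
            local = iri[len(namespace):]
            if SAFE_LOCAL.match(local):
                return "%s:%s" % (prefix, local)
    # No matching namespace, or an unsafe local part: write the full IRI.
    return "<%s>" % iri


# The two auto-generated prefixes from the serialized output above.
prefixes = {
    "ns1": "http://www.example.com/foo#",
    "ns2": "http://www.example.com/bar#",
}

for iri in (
    "http://www.example.com/foo#type",
    "http://www.example.com/foo#user.id",
    "http://www.example.com/bar#date.dob",
):
    print(abbreviate(iri, prefixes))
```

With this rule only `ns1:type` would be abbreviated, and the dotted names would be written as full IRIs, the same form as in the input data that parsed successfully.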