n3 parser can assign repeated bnode id strings for separate bnodes · Issue #305 · RDFLib/rdflib

I don't have time to make a good writeup right now, but there's a serious problem in rdflib/plugins/parsers/notation3.py: in some cases it assigns the same 'random' string to two separate bnodes, and then your data gets corrupted. My fix is below; look for the lines marked with ###.

A test case could parse the same file containing bnodes several times, trying to make the parser objects land at the same place in memory.
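To illustrate the idea behind such a test without depending on rdflib, here is a minimal stdlib-only sketch. It assumes the old label was derived from the object's memory address via `id(self)`; the issue does not show the original line, so that is a guess based on the "same place in memory" hint. `AddressFormula`, `UuidFormula`, and `count_repeated_labels` are hypothetical stand-ins, not rdflib classes.

```python
from uuid import uuid4


class AddressFormula:
    """Hypothetical stand-in: labels built from the object's memory address."""

    def __init__(self):
        self.counter = 0

    def new_label(self):
        self.counter += 1
        return 'f%sb%s' % (id(self), self.counter)


class UuidFormula:
    """Hypothetical stand-in: labels built from a per-instance UUID."""

    def __init__(self):
        self.uuid = uuid4().hex
        self.counter = 0

    def new_label(self):
        self.counter += 1
        return 'f%sb%s' % (self.uuid, self.counter)


def count_repeated_labels(formula_cls, runs=50):
    """Create one short-lived formula per run and count labels seen before."""
    seen = set()
    repeats = 0
    for _ in range(runs):
        formula = formula_cls()
        label = formula.new_label()
        if label in seen:
            repeats += 1
        seen.add(label)
        # Drop the only reference so the next instance may reuse this address.
        del formula
    return repeats


address_repeats = count_repeated_labels(AddressFormula)
uuid_repeats = count_repeated_labels(UuidFormula)
# Address reuse depends on the interpreter's allocator, so this count varies.
print("address-based repeats:", address_repeats)
print("uuid-based repeats:", uuid_repeats)
```

Whether the address-based variant repeats at all depends on the interpreter and allocator, so a real regression test would more likely parse the same N3 data several times and assert that the resulting bnode ids never overlap.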

Changes in rdflib/plugins/parsers/notation3.py:

from decimal import Decimal
from uuid import uuid4 ### add this

class Formula(object):
    number = 0

    def __init__(self, parent):
        self.uuid = uuid4().hex ### add this
        self.counter = 0
        Formula.number += 1
        self.number = Formula.number
        self.existentials = {}
        self.universals = {}

        self.quotedgraph = QuotedGraph(
            store=parent.store, identifier=self.id())

    def newBlankNode(self, uri=None, why=None):
        if uri is None:
            self.counter += 1
            bn = BNode('f%sb%s' % (self.uuid, self.counter)) ### critical patch: label is built from self.uuid
        else:
            bn = BNode(uri.split('#').pop().replace('_', 'b'))
        return bn