SPARQL: COUNT DISTINCT not working properly · Issue #404 · RDFLib/rdflib

The combination of COUNT with DISTINCT does not work properly in the RDFLib implementation of SPARQL.
Consider the following code:

from rdflib import *

g = Graph()
g.parse(format="turtle", publicID="http://example.org/", data="""
@prefix : <> .

<#a> :knows <#b>, <#c> ; :age 42 .

<#b> :knows <#a>, <#c> ; :age 36 .

<#c> :knows <#b>, <#c> ; :age 20 .

""")

print("Query 1: people knowing someone younger")
results = g.query("""
PREFIX : <http://example.org/>

SELECT DISTINCT ?x {
    ?x :age ?ax ;
       :knows [ :age ?ay ].
    FILTER( ?ax > ?ay )
}
""")
for i in results:
    print(str(i[0]))

print("\nQuery 2: count people knowing someone younger")
results = g.query("""
PREFIX : <http://example.org/>

SELECT (COUNT(DISTINCT ?x) as ?cx) {
    ?x :age ?ax ;
       :knows [ :age ?ay ].
    FILTER( ?ax > ?ay )
}
""")
for i in results:
    print(str(i[0]))

It produces the following output:

Query 1: people knowing someone younger
http://example.org/#a
http://example.org/#b

Query 2: count people knowing someone younger
3

whereas the result of Query 2 should obviously be 2. It seems that the DISTINCT in Query 2 has no effect: <#a> matches the query twice (she knows two people younger than herself), so she is counted twice.
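To make the expected numbers concrete, here is a minimal pure-Python sketch that does not use RDFLib at all. It rebuilds the solution multiset by hand from the data above and compares a plain count with a distinct count. The names `age`, `knows` and `solutions` are illustrative only and are not part of any RDFLib API.

```python
# Ages and "knows" edges copied by hand from the Turtle data in this report.
age = {"a": 42, "b": 36, "c": 20}
knows = {"a": ["b", "c"], "b": ["a", "c"], "c": ["b", "c"]}

# One solution per (x, y) pair where x knows y and x is older than y,
# mirroring the graph pattern plus FILTER( ?ax > ?ay ).
solutions = [
    (x, y)
    for x in sorted(knows)
    for y in knows[x]
    if age[x] > age[y]
]

# COUNT(?x) counts every solution; COUNT(DISTINCT ?x) counts unique ?x values.
count_all = len([x for x, _ in solutions])
count_distinct = len({x for x, _ in solutions})

print(solutions)       # [('a', 'b'), ('a', 'c'), ('b', 'c')]
print(count_all)       # 3
print(count_distinct)  # 2
```

The plain count of 3 is what Query 2 currently returns, while the distinct count of 2 agrees with the two rows printed by Query 1.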