aggregate operators COUNT and SAMPLE should ignore NULL values · Issue #563 · RDFLib/rdflib

Consider the following query:

  SELECT ?x (COUNT(?y) as ?ys) (COUNT(?z) as ?zs) WHERE {
    VALUES (?x ?y ?z) {
      (2 6 UNDEF)
      (2 UNDEF 10)
      (3 UNDEF 15)
      (3 9 UNDEF)
    }
  }
  GROUP BY ?x

it should return the following tuples:

  ?x  ?ys  ?zs
  2   1    1
  3   1    1

as per the specification:

[COUNT] counts the number of times a given expression has a bound, and non-error value
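To make the expected semantics concrete, here is a minimal pure-Python sketch of a COUNT that skips unbound values. It does not use rdflib; the names `ROWS` and `count_bound` are hypothetical, and `None` stands in for UNDEF.

```python
from collections import defaultdict

# Rows of the VALUES block as (x, y, z); None stands in for UNDEF.
ROWS = [
    (2, 6, None),
    (2, None, 10),
    (3, None, 15),
    (3, 9, None),
]


def count_bound(rows):
    """Group by x and count only the bound (non-None) y and z values."""
    counts = defaultdict(lambda: [0, 0])
    for x, y, z in rows:
        counts[x][0] += y is not None  # a bool adds as 0 or 1
        counts[x][1] += z is not None
    return {x: tuple(c) for x, c in counts.items()}


count_result = count_bound(ROWS)
print(count_result)
```

Under this reading, each group has exactly one bound `?y` and one bound `?z`.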

But instead it returns the following tuples:

There is a similar problem with the SAMPLE operator.
I would expect that:

  SELECT ?x (SAMPLE(?y) as ?ys) (SAMPLE(?z) as ?zs) WHERE {
    VALUES (?x ?y ?z) {
      (2 6 UNDEF)
      (2 UNDEF 10)
      (3 UNDEF 15)
      (3 9 UNDEF)
    }
  }
  GROUP BY ?x

returns the following tuples:

  ?x  ?ys  ?zs
  2   6    10
  3   9    15

but instead I get

(where _ means NULL).

Here the specification is not as explicit about how to handle NULL values,
but both Virtuoso and Corese give me the expected result, so there seems to be a consensus that SAMPLE should not return NULL values.

(In fact, when a sampled column contains only NULL values, both Virtuoso and Corese populate it with an artificial 0 value.)
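The SAMPLE behavior I am asking for can be sketched the same way. This is only an illustration in plain Python, not rdflib code; `ROWS` and `sample_bound` are hypothetical names, `None` stands in for UNDEF, and picking the first bound value is just one valid choice, since SAMPLE may return any value of the group. A column with no bound value is left as `None` here, which is an assumption of this sketch (Virtuoso and Corese fill in 0 instead).

```python
# Rows of the VALUES block as (x, y, z); None stands in for UNDEF.
ROWS = [
    (2, 6, None),
    (2, None, 10),
    (3, None, 15),
    (3, 9, None),
]


def sample_bound(rows):
    """Group by x and keep the first bound (non-None) y and z per group."""
    samples = {}
    for x, y, z in rows:
        ys, zs = samples.get(x, (None, None))
        samples[x] = (
            ys if ys is not None else y,  # keep an earlier sample if present
            zs if zs is not None else z,
        )
    return samples


sample_result = sample_bound(ROWS)
print(sample_result)
```

With the data above, every group has a bound value in each column, so no `None` appears in the result.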