Clarifications needed for the Collection construct (with CR) from Karsten Tolle on 2003-02-26 (www-rdf-comments@w3.org from January to March 2003)

Hi, this also contains a comment on Issue #hendler-01 (literals in rdf:parseType="Collection").

... thanks Pat for your input.

genID:1 rdf:type rdf:List .
genID:1 rdf:first ex:aaa .
genID:1 rdf:first ex:bbb .
genID:1 rdf:rest ex:ccc .
genID:1 rdf:rest genID:2 .
genID:2 rdf:type rdf:List .
genID:1 rdf:rest rdf:nil .

The question that arises: does it make any sense?

Yes, it does. If one were to assume (as for example OWL does) that rdf:first was a functional (unique-valued) property, then this graph would entail that ex:aaa = ex:bbb = ex:ccc (in OWL, this could be expressed by owl:sameIndividualAs).

Since neither equality nor functionality can be expressed in RDFS, this constraint doesn't amount to much in the RDF model theory; but as the spec points out, a semantic extension (like OWL) may impose further conditions on the RDF collection vocabulary.
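To make the functional reading above concrete, here is a minimal Python sketch. It assumes triples are held as plain (subject, predicate, object) tuples and that both rdf:first and rdf:rest are read as unique-valued; the function name functional_equalities is hypothetical and this is not how any RDF or OWL engine is specified to work.

```python
from collections import defaultdict

# The example graph, as plain (subject, predicate, object) tuples.
TRIPLES = {
    ("genID:1", "rdf:type", "rdf:List"),
    ("genID:1", "rdf:first", "ex:aaa"),
    ("genID:1", "rdf:first", "ex:bbb"),
    ("genID:1", "rdf:rest", "ex:ccc"),
    ("genID:1", "rdf:rest", "genID:2"),
    ("genID:2", "rdf:type", "rdf:List"),
    ("genID:1", "rdf:rest", "rdf:nil"),
}


def functional_equalities(triples, functional=("rdf:first", "rdf:rest")):
    """Group objects that a functional (unique-valued) reading would equate.

    Hypothetical helper: it only collects objects sharing a subject and a
    functional property; it performs no further inference.
    """
    values = defaultdict(set)
    for s, p, o in triples:
        if p in functional:
            values[(s, p)].add(o)
    # Only subject/property pairs with several values imply an equality.
    return {key: sorted(objs) for key, objs in values.items() if len(objs) > 1}


equalities = functional_equalities(TRIPLES)
for (subject, prop), objs in sorted(equalities.items()):
    print(subject, prop, " = ".join(objs))
```

For the triples exactly as listed, the sketch groups ex:aaa with ex:bbb under rdf:first, and ex:ccc, genID:2 and rdf:nil under rdf:rest. Plain RDF or RDFS would draw none of these conclusions; they appear only once a semantic extension adds the functionality assumption.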

OK, I think this entailment (ex:aaa = ex:bbb = ex:ccc ) should be stated in RDF Semantics (3.2.3) and in the RDF Primer too!

Bearing in mind the mail in this archive on Issue #hendler-01 (literals in rdf:parseType="Collection"), this might cause additional problems. It could then result in something like: "Hello World" = ex:aaa. A literal being equal to a resource!?!

There is also the possibility of a list node without rdf:first. The semantics are not clear to me, but I think this case is less problematic and should be handled by applications using the RDF graph.

For the usage of rdf:nil I still see problems. Using it as a bound, what does it mean to have more than one or none? ... see also my comments below.

What would it mean to have a collection element with different values?

They might not be different, see above. The use of different names does not entail that the values are different. This is one reason why there is little point in imposing 'wellformedness' conditions in RDF collections either on the syntax (they would be too strong, or else too complicated to be useful) or on the semantics (they would have no effect in RDF since they would have no expressible entailments.)

Would it not make more sense to enter a rdf:Bag instead? But there is also another question: Do we need the collection construct at all?

It was specifically requested by the Webont working group, as a necessary requirement for OWL. So the answer is yes.

Previously there were three kinds of containers: rdf:Bag, rdf:Seq and rdf:Alt. There are some differences between containers and a collection. A container in RDF is one resource containing all its members. The collection is different: there are many resources linked with each other. These resources are linked with their value(s), and the end of the collection is denoted by the empty list as the object of the rdf:rest property.

Now here comes the main aim of this new construct: it defines a fixed finite list of items with a given length, terminated by rdf:nil. At least this is what we can read in [4], section 4.2.

Reaching the goal? There is no restriction on the structure of lists in RDF. As shown, there can be more than one rdf:rest and more than one rdf:first, and even the existence of rdf:nil as the terminating object is nowhere enforced.

But how could it be forced? RDF graphs cannot have global conditions imposed on them by the spec, since they may be formed in real time, by rather dumb software which simply collects triples from other places and mixes them together. RDF does not undertake to impose any global syntactic wellformedness conditions on graphs: the 'largest' syntactic unit in RDF is the triple, and a graph is simply a set of triples. The intention of the 'list' vocabulary however is that if the lists are 'well-formed' then they denote an appropriate sequence of items.
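The point about merging can be illustrated with a small Python sketch. It assumes graphs are plain sets of (subject, predicate, object) tuples with no shared blank nodes, so that a merge is just a set union; the helper list_violations and the ex:l node are made up for illustration. It checks only the "exactly one rdf:first and exactly one rdf:rest" conditions.

```python
from collections import Counter


def list_violations(triples):
    """Return rdf:List nodes lacking exactly one rdf:first or one rdf:rest.

    Hypothetical check; RDF itself imposes no such condition.
    """
    triples = set(triples)
    nodes = {s for s, p, o in triples if p == "rdf:type" and o == "rdf:List"}
    firsts = Counter(s for s, p, o in triples if p == "rdf:first")
    rests = Counter(s for s, p, o in triples if p == "rdf:rest")
    return sorted(n for n in nodes if firsts[n] != 1 or rests[n] != 1)


# Two graphs from different sources; each describes a one-element list.
graph_a = {
    ("ex:l", "rdf:type", "rdf:List"),
    ("ex:l", "rdf:first", "ex:apple"),
    ("ex:l", "rdf:rest", "rdf:nil"),
}
graph_b = {
    ("ex:l", "rdf:type", "rdf:List"),
    ("ex:l", "rdf:first", "ex:pear"),
    ("ex:l", "rdf:rest", "rdf:nil"),
}

# With no blank nodes involved, merging is simply the union of the triples.
merged = graph_a | graph_b

print(list_violations(graph_a))  # each source graph passes on its own
print(list_violations(graph_b))
print(list_violations(merged))   # the merge gives ex:l two rdf:first values
```

Each source passes the check in isolation, yet their union does not, so a condition of this kind would have to be re-examined after every merge.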

But there are already such conditions. E.g., a node with rdf:type=rdf:Statement must have exactly one connection by rdf:subject, rdf:object and rdf:predicate.

In the same way we could force a collection to have exactly one rdf:nil.

By default the collection is constructed with blank nodes

No, there is no such default. RDF/XML parsers will do this, but that is an XML matter.

but even this can be changed.

Example: A collection with a non-blank node.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/stuff/1.0/">
  <rdf:Description rdf:about="http://example.org/basket">
    <ex:hasFruit rdf:resource="#myCollection"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/apple"/>
  <rdf:Description rdf:about="http://example.org/pear"/>
  <rdf:List rdf:ID="myCollection">
    <rdf:first rdf:resource="http://example.org/apple"/>
    <rdf:rest rdf:parseType="Collection">
      <rdf:Description rdf:about="http://example.org/pear"/>
    </rdf:rest>
  </rdf:List>
</rdf:RDF>

This example should generate the following triples:

<http://example.org/basket> ex:hasFruit ns1:myCollection .
ns1:myCollection rdf:type rdf:List .
ns1:myCollection rdf:first <http://example.org/apple> .
ns1:myCollection rdf:rest genID:1 .
genID:1 rdf:type rdf:List .
genID:1 rdf:first <http://example.org/pear> .
genID:1 rdf:rest rdf:nil .
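As a rough sketch of how an application might read these triples back as a list, the Python below follows rdf:rest from the named head node until it reaches rdf:nil. The tuple representation and the function list_items are assumptions for illustration, and the walk assumes each node has exactly one rdf:first and one rdf:rest, which RDF itself does not guarantee.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"

# The seven triples from the example, as (subject, predicate, object) tuples.
TRIPLES = [
    ("<http://example.org/basket>", "ex:hasFruit", "ns1:myCollection"),
    ("ns1:myCollection", "rdf:type", "rdf:List"),
    ("ns1:myCollection", RDF_FIRST, "<http://example.org/apple>"),
    ("ns1:myCollection", RDF_REST, "genID:1"),
    ("genID:1", "rdf:type", "rdf:List"),
    ("genID:1", RDF_FIRST, "<http://example.org/pear>"),
    ("genID:1", RDF_REST, RDF_NIL),
]


def list_items(triples, head):
    """Collect rdf:first values while following rdf:rest from head to rdf:nil.

    Hypothetical helper. If a node had several rdf:first or rdf:rest values,
    the dictionaries below would silently keep only one of them.
    """
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        # Stop on cycles and on nodes missing rdf:first or rdf:rest.
        if node in seen or node not in first or node not in rest:
            raise ValueError(f"list is not well-formed at {node}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


fruit = list_items(TRIPLES, "ns1:myCollection")
print(fruit)
```

Because the head node here has a name rather than being blank, any other graph could add further rdf:first or rdf:rest triples about ns1:myCollection, and a walk like this one would then no longer have a single answer.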

The effect is that, by introducing a non-blank node, someone could also add elements to the collection construct from outside. This means that without any restrictions this construct is not fixed!

Right, it is not. Nothing is 'fixed' in this sense in RDF. Bear in mind - it's a centrally important point - that the RDF/XML notation is only an XML serialization syntax for the RDF graph. Any extra structure you might feel is 'natural' in the XML (e.g. the assumption that the listed elements of a container are the full complement of the members) is not significant in the RDF if it is not made explicit in the RDF graph itself. The relatively 'tight' syntactic form of the XML is potentially misleading if this point is not kept in mind.

It should be stated in the RDF Primer!

What about other relevant RDF constructs? In [4] the following is stated: A limitation of the containers is that there is no way to close them, i.e., to say, "these are all the members of the container". This is because, while one graph may describe some of the members, there is no way to exclude the possibility that there is another graph somewhere that describes additional members. But we can also use blank nodes to identify the rdf:Bag itself. Blank nodes cannot be referred to from outside, and therefore no further member can be added.

That is true so long as one only uses a blank node to refer to the container. But it is legal, and often useful, to refer to a container with a uriref. And in any case, the syntactic limitation is not itself a semantic licence to conclude that there are no other items in the container. In general, any RDF graph can only be expected to be a partial description of the domain being described, and this applies to containers as well as everything else.

It even needs fewer triples and the graph is easier to read. The example of the fruit basket could be written as:

Example: The fruit basket using the bag construct.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/stuff/1.0/">
  <rdf:Description rdf:about="http://example.org/basket">
    <ex:hasFruit>
      <rdf:Bag>
        <rdf:li rdf:resource="http://example.org/apple"/>
        <rdf:li rdf:resource="http://example.org/pear"/>
      </rdf:Bag>
    </ex:hasFruit>
  </rdf:Description>
</rdf:RDF>

<http://example.org/basket> ex:hasFruit genID:1 .
genID:1 rdf:type rdf:Bag .
genID:1 rdf:_1 <http://example.org/apple> .
genID:1 rdf:_2 <http://example.org/pear> .

Without restrictions on the collection construct it is just a more complex way of expressing things we already could express before using containers.

No, it allows you to positively assert that the collection is bounded (by the use of rdf:nil), which is impossible with RDF containers.

Since multiple rdf:nil terminators, or none at all, can appear in a collection, it might be more effective to have an extra property denoting the length of a collection or container!?!

Possible restrictions can be:

  1. Each collection in RDF must have exactly one terminating rdf:nil element.
  2. Each collection element must have exactly one connection with the rdf:first property.
  3. Each collection element must have exactly one connection with the rdf:rest property.
  4. Collection elements in RDF have to be blank nodes.

It might be too restrictive to have all these restrictions

It is too restrictive, in my view, to have any of them as a global wellformedness condition on RDF graphs: to do so would require all conforming RDF engines to check these conditions every time a graph merge is performed.

and there also might be further reasons for introducing the collection construct.

The chief reason is that it was formally requested by another WG, so I suggest you take up this matter with them (http://lists.w3.org/Archives/Public/public-webont-comments/)

The main difference at the moment is that a container is one resource containing all values, while the collection contains different linked resources containing the values. In [1] we can find in the appendix A.3 that the collection construct was also introduced to support recursive processing in languages such as Prolog. There should not be a special construct for each programming language.

Additional question: What would be the fixed length of a collection? (Number of nodes of type rdf:List that are linked (minus rdf:nil nodes), the number of rdf:first connections?

The intended meaning is that it would be the number of non-nil nodes of type rdf:List.
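Read literally, that definition of length can be sketched in a few lines of Python. The tuple representation and the function collection_length are assumptions for illustration, and the sketch assumes the graph holds a single collection, since it counts every typed list node it finds rather than only those reachable from a particular head.

```python
# The fruit-basket collection triples, as (subject, predicate, object) tuples.
TRIPLES = [
    ("<http://example.org/basket>", "ex:hasFruit", "ns1:myCollection"),
    ("ns1:myCollection", "rdf:type", "rdf:List"),
    ("ns1:myCollection", "rdf:first", "<http://example.org/apple>"),
    ("ns1:myCollection", "rdf:rest", "genID:1"),
    ("genID:1", "rdf:type", "rdf:List"),
    ("genID:1", "rdf:first", "<http://example.org/pear>"),
    ("genID:1", "rdf:rest", "rdf:nil"),
]


def collection_length(triples):
    """Count distinct non-nil nodes that are typed rdf:List.

    Hypothetical helper: nodes that lack an explicit rdf:type triple
    are not counted, even if they carry rdf:first or rdf:rest.
    """
    return len({
        s for s, p, o in triples
        if p == "rdf:type" and o == "rdf:List" and s != "rdf:nil"
    })


length = collection_length(TRIPLES)
print(length)
```

Counting rdf:first triples instead would give the same number only when every list node has exactly one rdf:first value.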

What about multi sets in collections?)

Not sure what you mean.

Thanks for your very thorough and detailed comments, by the way.

Best wishes

Pat Hayes


Karsten Tolle

Received on Wednesday, 26 February 2003 08:29:51 UTC