XML literals, canonical form, and normal form C problem

It appears to me that there is yet another problem with XML Literals. My reading of the canonicalization documents (Canonical XML, Version 1.0 and Exclusive XML Canonicalization, Version 1.0) is that there are canonicalized documents whose text cannot be adequately captured by Unicode strings in Normal Form C.

Consider, for example, the following XML document (rendered in ASCII, where #xhh represents a UTF-8 octet):

u#xCC#x88

I believe that its Exclusive Canonical Form (rendered in ASCII, where #xhh is as above) is

u#xCC#x88

I thus do not see how

rdf:RDF rdf:Description u#xCC#x88

is to be translated into an RDF graph.
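To make the concern concrete: the octets #xCC#x88 are the UTF-8 encoding of U+0308 COMBINING DIAERESIS, so the text above is "u" followed by a combining character, whereas Normal Form C would compose the pair into U+00FC. The following Python sketch checks this. It rests on some assumptions: the element name doc is a hypothetical stand-in for the example document, and the canonicalizer used is the one in Python's standard library (available from Python 3.8), which implements C14N 2.0 rather than either of the two Version 1.0 documents named above, so it only illustrates the behaviour and does not settle what those documents require.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Decode the two octets from the example as UTF-8.
combining = bytes([0xCC, 0x88]).decode("utf-8")  # U+0308 COMBINING DIAERESIS

# The text content of the example: "u" followed by the combining mark.
text = "u" + combining

# Normal Form C composes the pair into the single character U+00FC.
nfc = unicodedata.normalize("NFC", text)
in_nfc = (text == nfc)                  # False: the text is not in NFC
nfc_octets = nfc.encode("utf-8").hex()  # UTF-8 octets of the NFC form

# Assumption: <doc> is a hypothetical wrapper element, and this is
# C14N 2.0 (Python 3.8+), not Exclusive XML Canonicalization 1.0.
canonical = ET.canonicalize("<doc>" + text + "</doc>")

# Check whether canonicalization left the uncomposed sequence intact.
preserved = text in canonical

print("text in NFC:", in_nfc)
print("NFC octets:", nfc_octets)
print("uncomposed sequence kept by canonicalizer:", preserved)
```

Under these assumptions the canonicalizer leaves the character sequence alone, so the canonical text is not an NFC string, and composing it would yield different octets from those of the canonical form.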

Peter F. Patel-Schneider Bell Labs Research Lucent Technologies

Received on Thursday, 7 August 2003 11:03:49 UTC