XPointer xpointer() Scheme

1 Introduction

The XPointer xpointer() scheme is intended to be used with the XPointer Framework [XPtrFrame] to provide a high level of functionality for addressing portions of XML documents. It is based on XPath [XPath], and adds the ability to address strings, points, and ranges in accordance with definitions provided in DOM 2: Range [DOM2]. This scheme supports addressing into the internal structures of XML documents and external parsed entities. It allows for examination of a document's hierarchical structure and choice of portions based on various properties, such as element types, attribute values, character content, and relative position. In particular, it provides for specific reference to elements, character strings, and other XML information, whether or not they bear an explicit ID attribute.

The xpointer() scheme is built on top of the XML Path Language [XPath], which is a joint expression language also underlying the XSL Transformations (XSLT) language. The xpointer() scheme's extensions to XPath add the ability to identify locations that are not single, whole elements (such as those corresponding to typical selections and selection points in some user interfaces), and to combine string matching with the other location methods provided.

The xpointer() scheme does not cover addressing into the internal structures of DTDs or the XML declaration.

1.1 Origin and Goals

In addition to XPath, a number of prior systems and standards have helped guide the development of this specification; these are listed in A.2 Non-Normative References. See the XPointer Requirements Document [XPREQ] for a thorough explanation of requirements for the design of the xpointer() scheme.

1.2 Notation and Document Conventions

[Definition: The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [RFC 2119].]

The terms pointer, pointer part, scheme, XPointer processor, application, error, failure, and namespace binding context are used in this specification as defined in the XPointer Framework specification. Note that errors defined by this specification are distinct from XPointer Framework errors.

The formal grammar for the xpointer() scheme is given using simple Extended Backus-Naur Form (EBNF) notation, as described in the XML Recommendation [XML].

The prototypes for xpointer() scheme functions are given using the same notation used in the [XPath] Recommendation.

This specification explicitly extends some aspects of the syntax and semantics of XPath (mainly in relation to support for locations other than whole nodes). Except in such cases, [XPath] constructs and definitions remain in effect in the xpointer() scheme.

2 Terms and Concepts

Some special terms are defined here in order to clarify their relationship to similar terms used in the technologies on which the xpointer() scheme is based. Additional terms specific to the xpointer() scheme are defined in the flow of the text. Refer to [XPath], [DOM2], [Infoset], and [RFC 2396] for definitions of other technical terms used in this specification.

point

A location in an XML Information Set with no content or children. For example, the location between two adjacent nodes, or after a particular character within a text node. This notion is defined fully later (see point), and comes from the DOM Level 2 [DOM2] specification's notion of positions; this specification refers to such positions by the term "point" to avoid confusion with XPath positions.

range

An identification of all the XML Information Set content between a pair of points. This notion is defined fully later (see range), and comes from the DOM Level 2 [DOM2] specification.

[Definition: location]

A generalization of XPath's node that includes points and ranges in addition to XPath nodes (which include the 7 node types defined by the XML Information Set [Infoset]).

[Definition: location-set]

An unordered list of locations, such as produced by an xpointer() scheme expression. This corresponds to the node-set that is produced by XPath expressions, except for the generalization to include points and ranges. Just as for an XPath node-set, a location-set is unordered, but can be treated as having a specific order depending on the axis that is operating on it. In this specification, the ordering depends on the notion of document order defined in 4.4.5 Document order, which applies to point and range locations as well as nodes, rather than on XPath's treatment of document order for nodes.

3 Conformance

Conforming XPointer processors claiming to support the xpointer() scheme must conform to the behavior defined in this specification and may conform to additional XPointer scheme specifications.

This specification is intended for use with the XPointer Framework [XPtrFrame] specification, and thus conforming XPointer processors must conform to the requirements of the XPointer Framework.

This specification normatively refers to the XPath [XPath] Recommendation, and conforming XPointer processors must therefore conform to the requirements of XPath except as this specification modifies them.

This specification also normatively uses the XPointer xmlns() scheme specification [XPtr-xmlns]; XPointer processors claiming to conform to this specification must also conform to the xmlns() specification.

Scheme data for the xpointer() scheme conforms to this specification if it does not cause an error as described in this specification.

Should need arise to refer to the namespace for objects defined by this specification, the normative namespace URI for the xpointer() scheme is http://www.w3.org/2001/05/XPointer.

4 Language and Processing

XPath expressions work with a data set that is derived from the elements and other markup constructs of an XML document. The xpointer() scheme model augments this data set. Both xpointer() expressions and XPath expressions operate by selecting portions of such data sets, often by their structural relationship to other parts (for example, the parent of a node with a certain ID value). The xpointer() scheme, like XPath, uses iterative selections, each operating on what is found by the prior one.

Selection of portions of the information hierarchy is done through three main constructs: axes, predicates, and functions (constructs defined in XPath [XPath]). An axis defines a sequence of candidates that might be located; predicates then test for various criteria relative to such portions; and functions generate new candidates or perform various other tasks. For example, an expression can identify certain elements from among the siblings of some previously located element, based on whether those sibling elements have an attribute with a certain value or are of a certain type such as "footnote". Another expression could identify the point location immediately preceding a certain element (which in turn was identified by ID or other tests).

4.1 Syntax

This section describes the syntax and semantics of the xpointer() scheme and the behavior of XPointer processors with respect to this scheme.

The scheme name is "xpointer". If scheme data in a pointer part with the xpointer() scheme does not conform to the syntax defined in this section, it is an error and the pointer part fails.

xpointer() Scheme Syntax
[1] xpointerschemedata ::= Expr

Expr is as defined in the XPath Recommendation [XPath], with the extensions defined in this specification.

4.2 Additions to XPath Terms and Concepts

The xpointer() scheme extends XPath by adding the following:

XPath provides for locating any subset of the nodes in an XML document or external parsed entities. XPath functionality, such as filtering an axis output by predicate, is generally defined in terms of operations on nodes and node-sets. As noted earlier, the xpointer() scheme also identifies locations that are points and ranges. For example, a range could extend from the middle of one paragraph to the middle of the next, thus containing only part of the relevant paragraphs and text nodes.

Note:

The order of a location's characters as displayed on a computer screen might not reflect their order in the underlying XML document. For example, this may occur when a portion of a right-to-left language such as Arabic is embedded in a left-to-right language such as French. For expressions that identify ranges of strings, the document order is used, not the display order. Thus, an expression for a single range might be displayed non-contiguously, and conversely a user selection of an apparent single range might correspond to multiple non-contiguous ranges in the underlying document.

4.3 Evaluation Context Initialization

An xpointer() scheme expression in a pointer part (as defined in the XPointer Framework [XPtrFrame]) is evaluated to yield an object of type location-set. This evaluation is carried out within a context similar to the XPath evaluation context except for the generalization of nodes to locations. XPointer processors must initialize this evaluation context to include the following information before evaluating an expression:

4.4 The point and range Location Types

In addition to nodes, point and range locations can appear in the location-sets identified by expressions of the xpointer() scheme. This section defines these types and the characteristics they require for XPath interoperability. Locations that are also nodes have the same characteristics as XPath nodes.

Note:

Unlike DOM Level 2, which is based on UTF-16 units, XPath and the xpointer() scheme are based on UCS characters. So while the concepts of points and ranges are based on the notions of positions and ranges, there are differences in detail. For example, a sequence which in DOM counts as two characters might count in the xpointer() scheme as one character.
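The difference in counting can be illustrated with a short sketch. The character chosen here (U+1D11E) is only an example of a character outside the Basic Multilingual Plane; it is not taken from this specification.

```python
# U+1D11E MUSICAL SYMBOL G CLEF lies outside the Basic Multilingual Plane.
clef = "\U0001D11E"

# Python strings are sequences of code points, so this is one character,
# which matches the character-based counting described in the note.
ucs_length = len(clef)

# UTF-16 needs a surrogate pair for it: two 16-bit units of 2 bytes each,
# which is how a UTF-16-based model such as DOM Level 2 would count it.
utf16_units = len(clef.encode("utf-16-le")) // 2

print(ucs_length, utf16_units)
```

An index computed in UTF-16 units therefore cannot be reused as a character index without conversion whenever such characters precede the point.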

Points and ranges can be used as context locations in the xpointer() scheme. This allows the [] operator to be used to select locations from location sets in general.

The range-to function may be applied with a context location of any location type, and identifies a range whose start point is the start point of the context location, and whose end point is the end point of the location identified by the function's argument.

The local-name, namespace-uri, and name functions operate on the first location in document order, not the first location which is also a node.

4.4.1 Definition of Point Location

[Definition: A location of type point is defined in terms of two data items:] The first is the [Definition: container node, which is that node that directly contains the point.] For example, the container node of a point between two adjacent characters within a text node is that text node. The second is the [Definition: index], which is a non-negative integer that represents the offset of the point among the child nodes or the character content of the container node (each node type can have only one or the other). An index of zero indicates the point before any child nodes or contained characters, and a non-zero index n indicates the point immediately after the nth child node or character.

Note:

The zero-based counting of node-points is compatible with that of DOM 2 [DOM2], and therefore differs from the one-based counting used for XPath string functions such as string-range that are available in the xpointer() scheme.

As defined, points are sufficient to identify the location preceding or following any individual character, or preceding or following any node in the data set constructed from an XML document or external parsed entity.

Given this definition, two points are necessarily identical if they have the same container node and index.
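As a minimal sketch of this definition, a point can be modeled as an immutable pair whose equality follows from its two data items. The class name `Point` and the use of a string such as "1/3" to identify the container node are assumptions made for illustration; the specification does not prescribe any representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """A point location: a container node plus a non-negative index."""
    container: str  # hypothetical node identifier, e.g. a child sequence "1/3"
    index: int      # 0 = before any child or character; n = just after the nth

    def __post_init__(self):
        # The index is defined as a non-negative integer.
        if self.index < 0:
            raise ValueError("index must be a non-negative integer")


# The generated __eq__ compares container and index, so two points with the
# same container node and index compare as identical.
a = Point("1/3", 0)
b = Point("1/3", 0)
c = Point("1", 2)
print(a == b, a == c)
```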

Note:

This specification does not constrain the implementation of points; XPointer processors need not actually represent points using data structures consisting of a node and an index.

Also note that while some nodes have explicit boundaries (such as element start-tags and end-tags), the boundaries of text nodes are implicit. Applications that present a graphical user interface for the selection or rendering of points and ranges need to take into consideration the fact that some points that might not be distinguished in the user interface, such as the points just inside and just outside the closing boundary of a text node inside an element, are in fact distinct.

A point location does not have an expanded-name.

The string-value of a point location is empty.

The axes of a point location are defined as follows:

4.4.2 Definition of Range Location

A location of type [Definition: range is defined by two points], a [Definition: start point] and an [Definition: end point]. A range represents all of the XML structure and content between the start point and end point. This is distinct from any list of nodes and/or characters, in part because some nodes might be only partly included. The start point and end point of a range must be in the same document or external parsed entity. The start point must not appear after the end point in document order (see 4.4.5 Document order).

[Definition: A range whose start point and end point are equal is a **collapsed range.**]

If the container node of one point of a range is a node of a type other than element, text, or root, the container node of the other point of the range must be the same node. For example, it is allowed to specify a range from immediately before a processing instruction to the end of an element, but not to specify a range from text inside a processing instruction to text outside it.
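The container-node constraint can be sketched as a small check. The function name, the use of strings for node types, and the opaque node identifiers are all assumptions for illustration; the check covers only this one constraint, not the ordering of the two points.

```python
# Container node types whose points may pair with a point in another container.
FREE_TYPES = {"element", "text", "root"}


def containers_allowed(start_type, start_node, end_type, end_node):
    """Check the container-node constraint on the two points of a range.

    Node types are given as strings and nodes as opaque identifiers
    (both are hypothetical representations).
    """
    if start_type in FREE_TYPES and end_type in FREE_TYPES:
        return True
    # At least one container is, for example, a comment, attribute, or
    # processing instruction: both points must share that same container.
    return start_node == end_node


# A point immediately before a processing instruction is contained by the
# parent element, so a range from there to the end of that element is allowed.
print(containers_allowed("element", "e1", "element", "e1"))
# A range from text inside a processing instruction to text outside it is not.
print(containers_allowed("processing-instruction", "pi1", "text", "t1"))
```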

A range location does not have an expanded-name.

The string-value of a range location consists of the characters that are in text nodes and that are between the start point and end point of the range.

The axes of a range location are identical to the axes of its start point. For example, the parent axis of a range contains the parent of the start point of the range.

4.4.3 Covering Ranges for All Location Types

[Definition: A **covering range** is a range that wholly and exactly encompasses a location. The covering range can be identified by applying the covering-range function to any location. The equivalent covering range for each type of location is defined as follows:]

4.4.4 Tests for point and range Locations

The xpointer() scheme extends the XPath production for NodeType... by adding items for the point and range location types. The production (number 38 in XPath) becomes as follows:

NodeType
[2] NodeType ::= 'comment'
| 'text'
| 'processing-instruction'
| 'node'
| 'point'
| 'range'

This definition allows NodeTests to select locations of type point and range from a location-set that might include locations of many types.

4.4.5 Document order

XPointer must be able to represent locations that are entire element nodes, like XPath; but also locations that are not. For example, an edit insertion point does not correspond to any whole node, but rather to a zero-size location, such as the point immediately preceding a character of text within a text node or between two adjacent element nodes. Similarly, typical drag-selections in various applications correspond to ranges, not nodes.

As in DOM 2: Range [DOM2], a point is determined by the node that contains it, and the offset of the point within that node. A range is determined by its starting and ending points.

The appendix "On points and ranges" provides a simple notation that is similar in syntax to the XPointer element() scheme, but that is not limited to identifying whole elements. That notation clarifies the definition of document ordering given here and has useful additional properties, but ordering can be implemented using that or any other representation.

The diagram below shows the numbering of nodes and points in a graphic representation of an XML Information Set.

Sample XML tree, with nodes and inter-node points numbered.

Figure 1: Numbering of nodes and points

For example,

Intuitively, points are ordered largely as one would expect from a pre-order traversal of the document, or the XML stream order.

More formally, any two document locations that are comparable, regardless of which type(s) they are, can be compared by comparing their covering ranges. A comparison of ranges is defined purely by a sequence of comparisons of their starting and ending points. Because all comparisons thus reduce to comparisons of points, point comparison is defined first below. The sequence of point comparisons required to compare two ranges is defined second, and is sufficient for comparing any locations that can be compared.

Because text nodes are not explicitly represented in XML documents, the point immediately before (or after) a text node occurs at the same place in an XML source document as the point immediately before (or after) that text node's first (or last) character. This specification defines those locations as distinct; they do not compare as equal. For example, point(1.2) is not equal to point(1/3.0).

Because the attribute and namespace nodes of a given element node are unordered, it is illegal to compare them in certain ways, and the result of such an attempted comparison is undefined. Specifically:

Points within the same container node compare as do the values of their respective offsets.

Points within different nodes cannot be correctly ordered merely by comparing the order of their container nodes. For example, the later point could be late in a large node N, while the earlier point could be within an earlier child M of N. In that case node N is before M (because it started earlier), and so merely comparing M and N would produce the incorrect result.

To compare any comparable points P1 and P2:

point(.0) compares as before any other point; point(.1) compares as after any other point.

  1. Define Node1 as the child sequence of the node directly containing point P1, and Node2 as the child sequence of the node directly containing point P2.
  2. Define Offset1 as the offset of point P1 within the node identified by Node1, and Offset2 as the offset of point P2 within the node identified by Node2.
  3. Beginning at the uppermost end of Node1 and Node2 (typically both 1 for the XML document element), compare corresponding components of the two paths and discard all such pairs that are equal.
  4. If a further component(s) is available from neither Node1 nor Node2, the points are directly within the same containing node, and are ordered simply by their respective offsets.
  5. If a further component(s) is available from both Node1 and Node2, then the node whose next component value is greater represents the point that is later in document order.
  6. If a further component(s) is available for only Node1, then P1 is within some descendant of the node in which P2 directly occurs. Thus it is necessary to determine the position of Offset2 relative to the ancestor of P1 that is at the same level (that is a child of Node2). To do this, compare Offset2 to the first of the further components for Node1. If Offset2 is greater than or equal, then P2 follows P1 in document order; otherwise it precedes P1.
  7. If a further component(s) is available for only Node2, compare Offset1 to the first of the further components for Node2. If Offset1 is greater than or equal, then P1 follows P2 in document order; otherwise it precedes P2.
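The numbered steps above can be sketched as follows. This is an illustrative reading, not a normative implementation: the function name and the representation of a child sequence as a list of integers are assumptions, and the sketch handles neither the special points point(.0) and point(.1) nor locations that are not comparable.

```python
def compare_points(node1, offset1, node2, offset2):
    """Return -1, 0 or 1 as P1 is before, equal to, or after P2.

    node1 and node2 are child sequences (lists of 1-based child numbers)
    of the container nodes; offset1 and offset2 are the point offsets.
    """
    # Step 3: discard the leading components that the two paths share.
    i = 0
    while i < len(node1) and i < len(node2) and node1[i] == node2[i]:
        i += 1
    rest1, rest2 = node1[i:], node2[i:]

    if not rest1 and not rest2:
        # Step 4: same container node, so order by offset.
        return (offset1 > offset2) - (offset1 < offset2)
    if rest1 and rest2:
        # Step 5: the greater next component is later in document order.
        return 1 if rest1[0] > rest2[0] else -1
    if rest1:
        # Step 6: P1 lies in a descendant of P2's container.
        return -1 if offset2 >= rest1[0] else 1
    # Step 7: P2 lies in a descendant of P1's container.
    return 1 if offset1 >= rest2[0] else -1


# point(1.2) sits just before the third child of element 1, so it precedes
# point(1/3.0), which sits inside that third child.
print(compare_points([1], 2, [1, 3], 0))
```

Stripping the shared prefix first means that step 5 never sees two equal components, so each branch can return immediately.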

Note:

As of this writing, the ordering algorithm in DOM 2: Range [DOM2] appears to produce incorrect results for some cases. Until a correction to this problem is issued, implementors should be particularly careful that their implementations for document order comparison produce the results defined here.

Two ranges R1 and R2 are ordered using the following sequence of comparisons of their starting and ending points (these definitions avoid circularity because the definition of point ordering given above does not involve casting the compared points to their equivalent collapsed ranges):

If the start point of R1 is not equal to the start point of R2, then the ranges are ordered as their respective start points are ordered. That is, if R1 starts before R2 it is before R2, and if R1 starts after R2 it is after R2.

If the start point of R1 is equal to the start point of R2, then the ranges are ordered as their respective end points are ordered. That is, if R1 and R2 start at the same point, then: if R1 ends before R2 it is before R2; if R1 and R2 end at the same point R1 is equal to R2; and if R1 ends after R2 it is after R2.

Thus, R1 and R2 are equal if and only if their respective start points and end points are equal.
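A sketch of this two-stage comparison is given below. The function name and the representation of a range as a (start, end) pair are assumptions, and plain integers stand in for points purely to keep the example self-contained; any point comparison that returns -1, 0 or 1 could be supplied instead.

```python
def compare_ranges(r1, r2, compare_points):
    """Order two ranges, each given as a (start_point, end_point) pair.

    compare_points(p, q) must return -1, 0 or 1.
    """
    by_start = compare_points(r1[0], r2[0])
    if by_start != 0:
        # Different start points: the ranges are ordered as their starts are.
        return by_start
    # Same start point: the ranges are ordered as their end points are.
    return compare_points(r1[1], r2[1])


def int_order(p, q):
    """Illustration only: integers stand in for points."""
    return (p > q) - (p < q)


print(compare_ranges((1, 5), (1, 7), int_order))
```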

Comparisons between any two types of locations (point vs. node, etc.) must produce the same results as obtained through converting all comparands to their equivalent ranges and comparing those ranges.

This algorithm yields the correct results; however, implementations need not use this specific algorithm to compare point locations. They may use any algorithm that produces the same results.

4.5 Functions Added by the xpointer() Scheme

The xpointer() scheme adds the following functions to those in XPath.

4.5.1 range-to Function

location-set range-to(location-set)

For each location in the context, range-to returns a range. The start point of the range is the start point of the context location (as determined by the start-point function), and the end point of the range is the end point (as determined by the end-point function) of the location found by evaluating the expression argument with respect to that context location.

The change made to the XPath syntax to support the range-to construct corresponds to a single addition to the Step production of the [XPath] specification. The original production is as follows:

[4] Step ::= AxisSpecifier NodeTest Predicate* | AbbreviatedStep

The version in the xpointer() scheme is as follows:

[4xptr] Step ::= AxisSpecifier NodeTest Predicate* | AbbreviatedStep | 'range-to' '(' Expr ')' Predicate*

This change is a single exception for the range-to function. It is not a generic change and is not extensible to other functions. The modified production expresses that a range computation must be made for each of the locations in the current location list.

As an example of using the range-to function, the following pointer part locates the range from the start point of the element with ID "chap1" to the end point of the element with ID "chap2".

xpointer(id("chap1")/range-to(id("chap2")))

As another example, imagine a document that uses empty elements (such as <REVST/> for revision start and <REVEND/> for revision end) to mark the boundaries of edits. The following pointer part would select, for each revision, a range starting at the beginning of the REVST element and ending at the end of the next REVEND element:

xpointer(descendant::REVST/range-to(following::REVEND[1]))

4.5.2 string-range Function

location-set string-range(location-set, string, number?, number?)

For each location in the location-set argument, string-range returns a set of ranges determined by searching the string-value of the location for substrings that match the string argument. An empty string is defined to match before each character of the string-value and after the final character. White space in a string is matched literally, with no normalization except that provided by XML for line ends and attribute values. Each non-overlapping match can contribute a range to the resulting location set.

The third argument gives the position of the first character to be in the resulting range, relative to the start of the match. The default value is 1, which makes the range start immediately before the first character of the matched sub-string. The fourth argument gives the number of characters in the range; the default is that the range extends to the end of the matched string. Thus, both the start point and end point of each range returned by the string-range function will be within text nodes.

Element boundaries, as well as entire embedded nodes such as processing instructions and comments, are ignored as specified by the definition of string-value in [XPath].

For any particular location, if the string argument is not found in the string-value of the location, or if the third and fourth arguments indicate a range that is wholly beyond the beginning or end of the document or entity, then no range is added to the result for that match.

The start and end points of the range-locations in the returned location-set will all be character-points.
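The matching rules can be sketched over a single string-value. This is a simplified illustration, not the function itself: it works on a plain Python string rather than on locations, returns zero-based (start, end) offsets instead of ranges, assumes integer arguments, and clips ranges that extend partly beyond the string, which is an assumption of this sketch rather than stated behavior.

```python
def string_range(string_value, pattern, position=1, length=None):
    """Return zero-based (start, end) offsets, one pair per match."""
    n = len(string_value)
    ranges = []
    i = 0
    while True:
        m = string_value.find(pattern, i)
        if m == -1:
            break
        # Matches do not overlap; an empty pattern advances one character,
        # so it matches before each character and after the final one.
        i = m + max(len(pattern), 1)
        # position is one-based relative to the start of the match.
        start = m + position - 1
        # By default the range extends to the end of the matched string.
        end = m + len(pattern) if length is None else start + length
        # Skip ranges that are inverted or wholly outside the string.
        if start > end or end < 0 or start > n:
            continue
        # Assumption: ranges partly outside the string are clipped.
        ranges.append((max(start, 0), min(end, n)))
    return ranges


text = "Thomas Pynchon wrote; Thomas Pynchon"
print(string_range(text, "Thomas Pynchon"))
print(string_range(text, "Thomas Pynchon", 8, 0))
```

With a position of 8 and a length of 0, each result is a collapsed pair whose offset lies immediately before the "P" of the match.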

For example, the following expression returns a range that selects the 17th of those "Thomas Pynchon" strings appearing in a title element:

string-range(//title,"Thomas Pynchon")[17]

As another example, the following expression returns a collapsed range whose points immediately precede the letter "P" (8 from the start of the string) in the third of those "Thomas Pynchon" strings appearing in a P element:

string-range(//P,"Thomas Pynchon",8,0)[3]

Alternatively this could be specified as follows:

string-range(string-range(//P,"Thomas Pynchon")[3],"P",1,0)

String-values are "views" into only the string content of a document or entity; they do not retain the structural context of any non-text nodes interspersed with the text. Because the string-range function operates on a string-value, markup that intervenes in the middle of a string does not prevent a match. (Note that for this reason, a string-range match is a range describing the relevant substring of the string-value, not necessarily a contiguous string in a single text node in the document.) For example, if the 17th occurrence of "Thomas Pynchon" had some inline markup in it as follows, it would not change the string identified by the XPointer processor:
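A hypothetical fragment of this kind, with an `em` element wrapped around part of the name, shows why the match is unaffected. The markup and element names below are invented for illustration, and joining the descendant text with Python's `xml.etree.ElementTree` is used only as an approximation of the XPath string-value of an element.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: inline markup splits the name across two text nodes.
title = ET.fromstring("<title>Thomas <em>Pynchon</em> novels</title>")

# Concatenating all descendant text in document order approximates the
# XPath string-value of the element.
string_value = "".join(title.itertext())

# The search runs over the string-value, so the <em> boundary is invisible.
match_start = string_value.find("Thomas Pynchon")

print(repr(string_value), match_start)
```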

The following expression selects the fifth of those exclamation marks appearing in any text node in the document and the character immediately following that exclamation mark:

string-range(/,"!",1,2)[5]

Although these examples locate ranges via text in the string-values of elements, string-range is useful for locating ranges that are wholly enclosed in other node types as well, such as attributes, processing instructions, and comments.

4.5.3 Additional Range-Related Functions

The following functions are related to ranges.

4.5.3.1 covering-range Function

location-set covering-range(location-set)

The covering-range function returns the covering ranges for the locations in the argument location-set. For each location x in the argument location-set, a range location representing the covering range of x is added to the result location-set.

4.5.3.2 range-inside Function

location-set range-inside(location-set)

The range-inside function returns the distinct covering ranges of the locations in the argument location-set. For each location x in the argument location-set, a location is added to the result location-set. If x is a range location or a point, then x is added to the result location-set. Otherwise x is used as the container node of the start and end points of the range location to be added, which is defined in this way: The index of the start point of the range is zero. If the end point is a character-point then its index is the length of the string-value of x; otherwise its index is the number of children of x.

4.5.3.3 start-point Function

location-set start-point(location-set)

For each location x in the argument location-set, start-point adds a location of type point to the resulting location-set. That point represents the start point of location x and is determined by the following rules:

4.5.3.4 end-point Function

location-set end-point(location-set)

For each location x in the argument location-set, end-point adds a location of type point to the result location-set. That point represents the end point of location x and is determined by the following rules:

4.5.4 here Function

location-set here()

The here function is meaningful only when the expression being interpreted occurs in an XML document or external parsed entity; otherwise the pointer part in which the here function appears fails. When in an XML context, the here function returns a location-set with a single member. There are two possibilities for the location returned:

In the following example, the here function appears inside an expression that is in an attribute node. The expression as a whole, then, returns the slide element just preceding the slide element that most directly contains the attribute node in question.

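<button xlink:type="simple"
        xlink:href="#xpointer(here()/ancestor::slide[1]/preceding::slide[1])">
  Previous
</button>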

Note:

The type of the node in which the here function appears is likely to be text, attribute, or processing-instruction. The returned location for an expression appearing in element content does not have a node type of element because the expression is in a text node that is itself inside an element.

4.5.5 origin Function

location-set origin()

The origin() function is meaningful only when the expression is being processed in response to traversal of a link expressed in an XML document. The origin function enables addressing relative to third-party and inbound links such as defined in the XLink Recommendation. This allows expressions to express relative locations when links do not reside directly at one of their endpoints. The function returns a location-set with a single member, which locates the element from which a user or program initiated traversal of the link. (See [XLink] for information about traversal.)

It is an error to use origin in the fragment identifier portion of a URI reference where a URI is also provided and identifies a resource different from the resource from which traversal was initiated, or in a situation where traversal is not occurring.

4.6 Root Node Children

The XML Recommendation requires well-formed documents to contain a single element at the top level. Thus, the XPath data model of a well-formed document will have a root node with a single child node of type element. In order to address locations in arbitrary external parsed entities, along with well-formed documents, the xpointer() scheme extends the XPath data model to allow the root node to have as children any sequence of nodes that would be possible for an element node. This extension is identical to the one made by XSLT. Thus, the root node may contain child nodes of type text, and any number of child nodes of type element.