Sorting in XSLT

July 3, 2002

Bob DuCharme

XSLT's xsl:sort instruction lets you sort a group of similar elements. Attributes for this element let you add details about how you want the sort done -- for example, you can sort using alphabetic or numeric ordering, sort on multiple keys, and reverse the sort order.

To demonstrate different ways to sort, we'll use the following document.

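<employees>

  <employee hireDate="04/23/1999">
    <last>Hill</last>
    <first>Phil</first>
    <salary>100000</salary>
  </employee>

  <employee hireDate="09/01/1998">
    <last>Herbert</last>
    <first>Johnny</first>
    <salary>95000</salary>
  </employee>

  <employee hireDate="08/20/2000">
    <last>Hill</last>
    <first>Graham</first>
    <salary>89000</salary>
  </employee>

</employees>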

(All stylesheets, input documents, and output documents shown in this article are in this zip file.) This first stylesheet sorts the employee children of the employees element by salary.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="employees">
    <xsl:apply-templates>
      <xsl:sort select="salary"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="employee">
    Last:      <xsl:apply-templates select="last"/>
    First:     <xsl:apply-templates select="first"/>
    Salary:    <xsl:apply-templates select="salary"/>
    Hire Date: <xsl:apply-templates select="@hireDate"/>
    <xsl:text>
</xsl:text>
  </xsl:template>

</xsl:stylesheet>

It's pretty simple. The employees element's template has an xsl:apply-templates instruction with an xsl:sort child to tell the XSLT processor to sort the employees element's child elements. The xsl:sort instruction's select attribute specifies the sort key to use: the employee elements' salary values. (If you omit the select attribute, the XSLT processor uses the string value of each node being sorted as its sort key.) The employee element's template rule adds the value of each of its child elements and of its hireDate attribute to the result tree, each preceded by a label, and a final xsl:text element adds a carriage return after each hire date value.

Note: Most xsl:apply-templates elements you see in XSLT stylesheets are empty. When you sort an element's children, the xsl:sort element goes between the start- and end-tags of the xsl:apply-templates element that tells the XSLT processor to process these children. The only other place you can put an xsl:sort instruction is inside the xsl:for-each instruction used to iterate across a node set, as we'll see below.

With the document above, this stylesheet gives us this output:

Last:      Hill
First:     Phil
Salary:    100000
Hire Date: 04/23/1999

Last:      Hill
First:     Graham
Salary:    89000
Hire Date: 08/20/2000

Last:      Herbert
First:     Johnny
Salary:    95000
Hire Date: 09/01/1998

The employees are sorted by salary, but they're sorted alphabetically -- "1" comes before "8" and "9", so a salary of "100000" comes first. But we want the salary values treated as numbers, so we make a simple addition to the template's xsl:sort instruction:

<xsl:template match="employees">
  <xsl:apply-templates>
    <xsl:sort select="salary" data-type="number"/>
  </xsl:apply-templates>
</xsl:template>

Now, the output is sorted by the salary element's numeric value:

Last:      Hill
First:     Graham
Salary:    89000
Hire Date: 08/20/2000

Last:      Herbert
First:     Johnny
Salary:    95000
Hire Date: 09/01/1998

Last:      Hill
First:     Phil
Salary:    100000
Hire Date: 04/23/1999

To reverse the order of this or any other sort, add an order attribute with a value of "descending":

<xsl:template match="employees">
  <xsl:apply-templates>
    <xsl:sort select="salary" data-type="number" order="descending"/>
  </xsl:apply-templates>
</xsl:template>

Whether the data-type attribute has a value of "number", as in the stylesheet above, or "text" (the default), an order value of "descending" reverses the order of the sort:

Last:      Hill
First:     Phil
Salary:    100000
Hire Date: 04/23/1999

Last:      Herbert
First:     Johnny
Salary:    95000
Hire Date: 09/01/1998

Last:      Hill
First:     Graham
Salary:    89000
Hire Date: 08/20/2000
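Because the order attribute also works with the default "text" data type, a descending alphabetic sort needs no data-type attribute at all. The following is a minimal sketch, not one of the article's stylesheets. It assumes the employees document shown earlier, and its one-line "last, first" output format is made up for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text"/>

  <!-- Sort the employee children by last name, Z to A.
       data-type defaults to "text", so only order is given. -->
  <xsl:template match="employees">
    <xsl:apply-templates select="employee">
      <xsl:sort select="last" order="descending"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Illustrative output format: "last, first" and a line feed. -->
  <xsl:template match="employee">
    <xsl:value-of select="last"/>
    <xsl:text>, </xsl:text>
    <xsl:value-of select="first"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

With the sample document, both Hill records should come out ahead of Herbert. XSLT 1.0 requires a stable sort, so the two Hills, whose keys tie, stay in document order (Phil, then Graham).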

If your xsl:apply-templates (or xsl:for-each) element has more than one xsl:sort instruction inside of it, the XSLT processor treats them as multiple keys to the sort. For example, the stylesheet with this next template sorts the employees by last name and then by first name so that any employees with the same last name will be in first name order.

<xsl:template match="employees">
  <xsl:apply-templates>
    <xsl:sort select="last"/>
    <xsl:sort select="first"/>
  </xsl:apply-templates>
</xsl:template>

When applied to the document above, the result shows Johnny Herbert before Phil and Graham Hill, and the secondary sort puts Graham Hill before Phil Hill:

Last:      Herbert
First:     Johnny
Salary:    95000
Hire Date: 09/01/1998

Last:      Hill
First:     Graham
Salary:    89000
Hire Date: 08/20/2000

Last:      Hill
First:     Phil
Salary:    100000
Hire Date: 04/23/1999
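Each xsl:sort instruction carries its own data-type and order attributes, so the keys of a multi-key sort don't have to agree with each other. As a hedged sketch (again assuming the employees document above, with an invented one-line output format), this stylesheet sorts by last name alphabetically and then, within each last name, by salary from highest to lowest.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="employees">
    <xsl:apply-templates select="employee">
      <!-- Primary key: last name, ascending text order (the defaults). -->
      <xsl:sort select="last"/>
      <!-- Secondary key: salary as a number, highest first. -->
      <xsl:sort select="salary" data-type="number" order="descending"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Illustrative output format: "last first salary" and a line feed. -->
  <xsl:template match="employee">
    <xsl:value-of select="last"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="first"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="salary"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

For the sample data this should list Johnny Herbert first, then Phil Hill (100000) ahead of Graham Hill (89000).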

The sort key doesn't need to be an element child of the sorted elements. The xsl:sort instruction's select attribute can take any XPath expression as a sort key. For example, the following version sorts the employees by their hireDate attribute values:

<xsl:template match="employees">
  <xsl:apply-templates>
    <xsl:sort select="@hireDate"/>
  </xsl:apply-templates>
</xsl:template>

Treating the dates as strings doesn't do much good, because they're sorted alphabetically,

Last:      Hill
First:     Phil
Salary:    100000
Hire Date: 04/23/1999

Last:      Hill
First:     Graham
Salary:    89000
Hire Date: 08/20/2000

Last:      Herbert
First:     Johnny
Salary:    95000
Hire Date: 09/01/1998

but it's easy enough to have three sort keys based on the year, month, and day substrings of the date string:

<xsl:template match="employees">
  <xsl:apply-templates>
    <xsl:sort select="substring(@hireDate,7,4)"/> <!-- year  -->
    <xsl:sort select="substring(@hireDate,1,2)"/> <!-- month -->
    <xsl:sort select="substring(@hireDate,4,2)"/> <!-- day   -->
  </xsl:apply-templates>
</xsl:template>

This stylesheet sorts the dates properly. (An important feature of XSLT 2.0 -- and, some say, the one that's going to slow its progress toward Recommendation status the most -- is the ability to handle typed data. Once that's in place, you'll be able to just say "this attribute is a date, so sort it that way.")

Last:      Herbert
First:     Johnny
Salary:    95000
Hire Date: 09/01/1998

Last:      Hill
First:     Phil
Salary:    100000
Hire Date: 04/23/1999

Last:      Hill
First:     Graham
Salary:    89000
Hire Date: 08/20/2000
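The year, month, and day keys can also be folded into a single key. The following is a minimal sketch rather than one of the article's stylesheets. It assumes every hireDate value uses the MM/DD/YYYY layout of the sample document, and it uses the XPath concat() function to build one YYYYMMDD value that is then sorted as a number. The one-line output format is invented for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="employees">
    <xsl:apply-templates select="employee">
      <!-- Assumes MM/DD/YYYY: year at characters 7 to 10, month at 1 to 2,
           day at 4 to 5. "04/23/1999" becomes 19990423. -->
      <xsl:sort data-type="number"
                select="concat(substring(@hireDate,7,4),
                               substring(@hireDate,1,2),
                               substring(@hireDate,4,2))"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Illustrative output format: hire date, last name, line feed. -->
  <xsl:template match="employee">
    <xsl:value-of select="@hireDate"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="last"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Run against the sample document, this should list the 1998 hire first, then the 1999 hire, then the 2000 hire. One key is more compact than three, but the three separate keys make the intent of each part easier to read, so which to prefer is a matter of taste.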

All the examples so far have sorted the children (the employee elements) of an element (employees) using one or more child nodes of those children (the salary, first, and last elements or the hireDate attribute) as sort keys. The previous example's use of the hireDate attribute showed that the expression used as the xsl:sort element's select attribute doesn't have to be a child element name, but can be an attribute name instead, or even a value returned by a function.

Your sort key can be an even more complex XPath expression. For example, the next stylesheet sorts the wine elements in this document's winelist element, but not by a child of the wine element; it sorts the wine elements by a grandchild of the wine elements: the prices child's discounted element.

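<winelist>

  <wine grape="Chardonnay">
    <winery>Lindeman's</winery>
    <product>Bin 65</product>
    <year>1998</year>
    <prices>
      <list>6.99</list>
      <discounted>5.99</discounted>
      <case>71.50</case>
    </prices>
  </wine>

  <wine grape="Chardonnay">
    <winery>Benziger</winery>
    <product>Carneros</product>
    <year>1997</year>
    <prices>
      <list>10.99</list>
      <discounted>9.50</discounted>
      <case>114.00</case>
    </prices>
  </wine>

  <wine grape="Cabernet">
    <winery>Duckpond</winery>
    <product>Merit Selection</product>
    <year>1996</year>
    <prices>
      <list>13.99</list>
      <discounted>11.99</discounted>
      <case>143.50</case>
    </prices>
  </wine>

  <wine grape="Chardonnay">
    <winery>Kendall Jackson</winery>
    <product>Vintner's Reserve</product>
    <year>1998</year>
    <prices>
      <list>12.50</list>
      <discounted>9.99</discounted>
      <case>115.00</case>
    </prices>
  </wine>

</winelist>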

The sort key is only slightly more complicated than those shown in the earlier examples. It's an XPath expression saying "the discounted child of the prices element".

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

  <xsl:template match="winelist">
    <xsl:copy>
      <xsl:apply-templates>
        <xsl:sort data-type="number" select="prices/discounted"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="*">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

The entire stylesheet is not very big. It just copies the wine elements, sorted according to the sort key:

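<winelist>

  <wine>
    <winery>Lindeman's</winery>
    <product>Bin 65</product>
    <year>1998</year>
    <prices>
      <list>6.99</list>
      <discounted>5.99</discounted>
      <case>71.50</case>
    </prices>
  </wine>

  <wine>
    <winery>Benziger</winery>
    <product>Carneros</product>
    <year>1997</year>
    <prices>
      <list>10.99</list>
      <discounted>9.50</discounted>
      <case>114.00</case>
    </prices>
  </wine>

  <wine>
    <winery>Kendall Jackson</winery>
    <product>Vintner's Reserve</product>
    <year>1998</year>
    <prices>
      <list>12.50</list>
      <discounted>9.99</discounted>
      <case>115.00</case>
    </prices>
  </wine>

  <wine>
    <winery>Duckpond</winery>
    <product>Merit Selection</product>
    <year>1996</year>
    <prices>
      <list>13.99</list>
      <discounted>11.99</discounted>
      <case>143.50</case>
    </prices>
  </wine>

</winelist>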
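One detail of the copying stylesheet is worth knowing: xsl:copy copies an element node but not its attributes, and an xsl:apply-templates instruction with no select attribute processes child nodes only, so the wine elements' grape attributes are not carried into the result. If you want the sorted copy to keep them, one possible variation (a sketch, not the article's stylesheet) replaces the match="*" template with the common identity-template pattern and selects the wine children explicitly.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Copy winelist, processing only its wine children, cheapest
       discounted price first. -->
  <xsl:template match="winelist">
    <xsl:copy>
      <xsl:apply-templates select="wine">
        <xsl:sort data-type="number" select="prices/discounted"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>

  <!-- Identity template: copy every other node, attributes included. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The winelist template wins over the identity template for the winelist element because its match pattern has the higher default priority. Selecting wine explicitly also keeps the whitespace text nodes between the wine elements out of the sort.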

Let's look at how the xsl:for-each instruction can use xsl:sort. The following stylesheet takes the same winelist document above and lists the wines. When it gets to a Chardonnay, it lists all the other Chardonnays alphabetically.

<!DOCTYPE stylesheet [
<!ENTITY space "<xsl:text> </xsl:text>">
]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
  <xsl:output method="xml" omit-xml-declaration="yes" indent="no"/>

  <xsl:template match="wine">
    <xsl:apply-templates select="winery"/>&space;
    <xsl:apply-templates select="product"/>&space;
    <xsl:apply-templates select="year"/>&space;
    <xsl:apply-templates select="@grape"/>
    <xsl:if test="@grape = 'Chardonnay'">
      <xsl:text>
other Chardonnays:
</xsl:text>
      <xsl:for-each
           select="preceding-sibling::wine[@grape = 'Chardonnay'] |
                   following-sibling::wine[@grape = 'Chardonnay']">
        <xsl:sort select="winery"/>
        <xsl:text>  </xsl:text>
        <xsl:value-of select="winery"/>&space;
        <xsl:value-of select="product"/><xsl:text>
</xsl:text>
      </xsl:for-each>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>

Before we examine how the stylesheet does this, let's take a look at the result:

Lindeman's Bin 65 1998 Chardonnay
other Chardonnays:
  Benziger Carneros
  Kendall Jackson Vintner's Reserve

Benziger Carneros 1997 Chardonnay
other Chardonnays:
  Kendall Jackson Vintner's Reserve
  Lindeman's Bin 65

Duckpond Merit Selection 1996 Cabernet

Kendall Jackson Vintner's Reserve 1998 Chardonnay
other Chardonnays:
  Benziger Carneros
  Lindeman's Bin 65

(First, notice the "&space;" entity references throughout the stylesheet. Instead of writing "<xsl:text> </xsl:text>" over and over because I needed single spaces in so many places, it was easier to declare an entity named space in the DOCTYPE declaration with this xsl:text element as content and then to plug it in with an entity reference wherever I needed it.) The template rule for the wine element has xsl:apply-templates instructions for its winery, product, and year element children followed by one for its grape attribute. Then, if the grape attribute has a value of "Chardonnay", it adds the text "other Chardonnays:" to the result tree followed by the list of Chardonnays, which are added to the result tree using an xsl:for-each instruction.

The select attribute of the xsl:for-each element selects all the nodes that are either preceding siblings of the current node with a grape value of "Chardonnay" or following siblings of the current node with the same grape value. (The "|" symbol is the "or" part.) For each wine element that meets this select attribute's condition, the template first adds some white space indenting with an xsl:text element, then the value of the wine element's winery child, a space, and the value of its product child. The first instruction in this xsl:for-each element is an xsl:sort element, which tells the XSLT processor to sort the nodes selected by the xsl:for-each instruction alphabetically in "winery" order. That's how they look in the result: after the first "other Chardonnays:" label, "Kendall Jackson" comes after "Benziger"; after the second, "Lindeman's" comes after "Kendall Jackson"; and after the last one, "Lindeman's" comes after "Benziger".

Because the xsl:for-each instruction lets you grab and work with any node set that you can describe using an XPath expression, the ability to sort these node sets makes xsl:for-each one of XSLT's most powerful features.
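As one more hedged illustration of that point, the sketch below is not from the article. It assumes the winelist document above and an invented "winery: price" output format. It uses xsl:for-each to pick out only the Chardonnays and a numeric xsl:sort to list them from cheapest to most expensive discounted price.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="winelist">
    <!-- The node set is described by an XPath expression with a predicate. -->
    <xsl:for-each select="wine[@grape = 'Chardonnay']">
      <!-- xsl:sort must come first inside xsl:for-each. -->
      <xsl:sort select="prices/discounted" data-type="number"/>
      <xsl:value-of select="winery"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="prices/discounted"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With the sample prices, Lindeman's (5.99) should come first, then Benziger (9.50), then Kendall Jackson (9.99). The Cabernet is never selected, so it never takes part in the sort.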

Next month, we'll see how xsl:sort can help find the first, last, biggest, and smallest value in a document according to certain criteria. (If you can't wait, see my book XSLT Quickly, from which these columns are excerpted.)