Selecting Nodes With XPath Expressions

A Brief Introduction to XPath

XPath is an expression language for querying XML documents and selecting nodes within them. It is a core part of XML technologies such as XSLT, XQuery and Schematron, and XProc uses it as well. If you have never heard of XPath before, I recommend reading a tutorial first.

The XML document below contains a small music collection.

<?xml version="1.0" encoding="UTF-8"?>
<music>
  <artist name="The Clash">
    <album title="The Clash"/>
    <album title="Give 'Em Enough Rope"/>
    <album title="London Calling"/>
  </artist>
  <artist name="Dead Kennedys">
    <album title="Fresh Fruit for Rotting Vegetables"/>
  </artist>
  <artist name="U2"/>
</music>

We could query the document with an XPath that selects the third album of the artist “The Clash” in the music collection:

/music/artist[@name = 'The Clash']/album[3]

Output

<album title="London Calling"/>

As you can see, the steps of the path that leads to the XML element in question are separated by forward slashes (/). Conditions that filter the results are written in square brackets. Attribute names are prefixed with an at sign (@). To select a node by its position, we simply add the position in square brackets. Please be aware that, in contrast to most programming languages, XPath and XProc start counting at 1 and not at 0.

XPath Expressions and XSLT Selection Patterns

In XProc you can use XPath within the options select, test and match. select and test expect an XPath Expression, while match requires an XSLT Selection Pattern. The difference is that an XPath Expression returns the nodes it selects, whereas an XSLT Selection Pattern is only used to test nodes: a given node either matches the pattern or it does not.

In XProc you can use the step p:filter to keep only the parts of a document that are selected by an XPath expression and discard everything else. Its select option expects an XPath expression that addresses the nodes to be kept.

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  
  <p:input port="source"/>
  
  <p:output port="result"/>
  
  <p:filter select="/music/artist[@name = 'The Clash']/album[3]"/>
  
</p:declare-step>

Since the XML document and the XPath expression are the same, this pipeline yields the same result as the XPath above:

Output

<album title="London Calling"/>
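
The same pipeline can be used to try out other predicates and positions against the music collection. The following sketch assumes the input document shown above and selects the one artist that has no album child elements. The expression is only an illustration and not part of the original examples. Because the result port is not declared as a sequence here, the expression should select exactly one node; an expression with several hits would presumably need sequence="true" on the output port.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source"/>

  <p:output port="result"/>

  <!-- Illustrative expression: keep the artist that has no album children. -->
  <!-- Other expressions with a single hit in this document:
       /music/artist[2]/album[1] or /music/artist[1]/album[last()] -->
  <p:filter select="/music/artist[not(album)]"/>

</p:declare-step>
```

Only the U2 entry has no album children, so with the music collection as input the result should be:

```xml
<artist name="U2"/>
```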

The pipeline below shows how an element is matched in order to be deleted. The match attribute contains the XSLT Selection Pattern:

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  
  <p:input port="source"/>
  
  <p:output port="result"/>
  
  <p:delete match="artist[@name = 'U2']"/>
  
</p:declare-step>

Output

<?xml version="1.0" encoding="UTF-8"?>
<music>
  <artist name="The Clash">
    <album title="The Clash"/>
    <album title="Give 'Em Enough Rope"/>
    <album title="London Calling"/>
  </artist>
  <artist name="Dead Kennedys">
    <album title="Fresh Fruit for Rotting Vegetables"/>
  </artist>
</music>
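
The examples so far cover select and match, but not test. A minimal sketch of how test might be used, assuming the music collection above as input: the test attribute on p:when holds an XPath expression, and the branch runs only if that expression evaluates to true. Here the pipeline checks whether any artist lacks albums and deletes those artists only in that case. The expressions are illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source"/>

  <p:output port="result"/>

  <p:choose>
    <!-- test takes an XPath expression; a non-empty node selection counts as true. -->
    <p:when test="/music/artist[not(album)]">
      <!-- match takes a pattern that is tested against each node. -->
      <p:delete match="artist[not(album)]"/>
    </p:when>
    <p:otherwise>
      <!-- No artist without albums: pass the document through unchanged. -->
      <p:identity/>
    </p:otherwise>
  </p:choose>

</p:declare-step>
```

With the music collection as input, the test is true, so the artist without albums is removed and the result should match the output of the p:delete example above.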

Static Values and XPath Expressions

By default, many option values in XProc are treated not as XPath expressions but as static values. For example, if we declare an option that holds the music genre and refer to it in an attribute value, the reference is not evaluated; the text is simply passed to the output unchanged:

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  
  <p:input port="source"/>
  
  <p:output port="result"/>
  
  <p:option name="genre-value" select="'punk'"/>
  
  <p:add-attribute match="album" attribute-name="genre" 
                   attribute-value="$genre-value"/>
  
</p:declare-step>

Output

<?xml version="1.0" encoding="UTF-8"?>
<music>
  <artist name="The Clash">
    <album genre="$genre-value" title="The Clash"/>
    <album genre="$genre-value" title="Give 'Em Enough Rope"/>
    <album genre="$genre-value" title="London Calling"/>
  </artist>
  <artist name="Dead Kennedys">
    <album genre="$genre-value" title="Fresh Fruit for Rotting Vegetables"/>
  </artist>
  <artist name="U2"/>
</music>

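To have the value evaluated as an XPath expression rather than as a static value, we need to rewrite our step with p:with-option. The example below also matches artist instead of album elements, so the attribute ends up on the artists: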

<p:add-attribute match="artist" attribute-name="genre">
  <p:with-option name="attribute-value" select="$genre-value"/>
</p:add-attribute>

Output

<?xml version="1.0" encoding="UTF-8"?>
<music>
  <artist genre="punk" name="The Clash">
    <album title="The Clash"/>
    <album title="Give 'Em Enough Rope"/>
    <album title="London Calling"/>
  </artist>
  <artist genre="punk" name="Dead Kennedys">
    <album title="Fresh Fruit for Rotting Vegetables"/>
  </artist>
  <artist genre="punk" name="U2"/>
</music>
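
XProc 3.0 also treats option shortcut attributes as attribute value templates, so wrapping the variable reference in curly braces is a more compact way to get the evaluated value. The sketch below assumes the same genre-value option as above. It is offered as an alternative worth checking against your processor, not as the approach of the original example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source"/>

  <p:output port="result"/>

  <p:option name="genre-value" select="'punk'"/>

  <!-- Curly braces mark the part of the attribute value that is evaluated as XPath. -->
  <p:add-attribute match="artist" attribute-name="genre"
                   attribute-value="{$genre-value}"/>

</p:declare-step>
```

Under that assumption, the output should be the same as with p:with-option: every artist element receives genre="punk".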
