
XProc 3.0 Tutorial

Steps for Basic XML Manipulations

When processing XML, you often want to make small changes to the data, for example inserting, renaming, or deleting elements and attributes. XProc offers various steps for these tasks that let you manipulate XML without having to write an XSLT stylesheet. Here is a selection of steps to start with:

<p:insert/>

p:insert takes a document and inserts it into another document. For this purpose, the step has two input ports: an insertion port for the document to be inserted and a source port for the document that receives it. The match option takes a selection pattern that identifies the nodes relative to which the insertion is made. The position option controls where the insertion lands relative to each matched node and accepts one of four values: first-child, last-child, before, or after. In this example, we insert a paragraph after the title:

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  
  <p:input port="source">
    <p:inline>
      <doc>
        <title>Poem about XML</title>
      </doc>
    </p:inline>
  </p:input>
  
  <p:output port="result"/>
  
  <p:insert match="/doc/title" position="after">
    <p:with-input port="insertion">
      <p:inline>
        <para>A node is a node is a node.</para>
      </p:inline>
    </p:with-input>
  </p:insert>
  
</p:declare-step>

Output

<?xml version="1.0" encoding="UTF-8"?>
<doc>
  <title>Poem about XML</title>
  <para>A node is a node is a node.</para>
</doc>
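
To illustrate how the position option changes the outcome, here is a minimal variation of the pipeline above. It matches the doc element itself and uses first-child, so the paragraph should end up before the title instead of after it. The input and the inserted paragraph are the same illustrative data as before.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source">
    <p:inline>
      <doc>
        <title>Poem about XML</title>
      </doc>
    </p:inline>
  </p:input>

  <p:output port="result"/>

  <!-- Match the doc element; the inserted node becomes its first child -->
  <p:insert match="/doc" position="first-child">
    <p:with-input port="insertion">
      <p:inline>
        <para>A node is a node is a node.</para>
      </p:inline>
    </p:with-input>
  </p:insert>

</p:declare-step>
```

With before and after, the inserted node becomes a sibling of the matched node. With first-child and last-child, it becomes a child of the matched node.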

<p:delete/>

With p:delete you can delete nodes in an XML document. In the following example, we delete all plant elements that have no @name attribute. The selection pattern //plant[not(@name)] is passed via the match option.

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source">
    <p:inline>
      <office-plants>
        <plant name="Saintpaulia ionantha">African Violet</plant>
        <plant name="Dracaena">Snake Plant</plant>
        <plant>Cactus</plant>
      </office-plants>
    </p:inline>
  </p:input>

  <p:output port="result"/>

  <p:delete match="//plant[not(@name)]"/>

</p:declare-step>

Output

<?xml version="1.0" encoding="UTF-8"?>
<office-plants>
  <plant name="Saintpaulia ionantha">African Violet</plant>
  <plant name="Dracaena">Snake Plant</plant>
</office-plants>
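
p:delete is not limited to elements. As a sketch of a related case, the pipeline below removes only name attributes whose value is empty and leaves the plant elements in place. The input data is made up for illustration, and the pattern plant/@name[. = ''] is one possible way to select such attributes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source">
    <p:inline>
      <office-plants>
        <plant name="Saintpaulia ionantha">African Violet</plant>
        <plant name="">Cactus</plant>
      </office-plants>
    </p:inline>
  </p:input>

  <p:output port="result"/>

  <!-- Delete the attribute node, not the element that carries it -->
  <p:delete match="plant/@name[. = '']"/>

</p:declare-step>
```

The Cactus element should remain in the result, just without its name attribute.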

<p:rename/>

For renaming elements, attributes, or processing instructions, XProc provides the p:rename step. Each node matched by the pattern in the match option is renamed to the name given in the new-name option. Here we use p:rename to rename the element foo to bar.

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source">
    <p:inline>
      <foo>My element</foo>
    </p:inline>
  </p:input>

  <p:output port="result"/>

  <p:rename match="foo" new-name="bar"/>

</p:declare-step>

Output

<?xml version="1.0" encoding="UTF-8"?>
<bar>My element</bar>
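
Since p:rename also works on attributes, the same approach can be sketched for an attribute. The example below renames the name attribute of a plant element to botanical-name. Both the input data and the new attribute name are chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source">
    <p:inline>
      <plant name="Saintpaulia ionantha">African Violet</plant>
    </p:inline>
  </p:input>

  <p:output port="result"/>

  <!-- The pattern selects the attribute node; its value is kept -->
  <p:rename match="plant/@name" new-name="botanical-name"/>

</p:declare-step>
```

Only the attribute name changes. The attribute value and the element content stay as they are.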

<p:replace/>

The p:replace step replaces matching nodes with the top-level node(s) of another document. For this purpose, the step has a replacement port: the document on this port replaces each node matched in the document arriving on the source port. The pipeline below shows how p:replace works:

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  
  <p:input port="source">
    <p:inline>
      <foo>My element</foo>
    </p:inline>
  </p:input>
  
  <p:output port="result"/>
  
  <p:replace match="foo">
    <p:with-input port="replacement">
      <p:inline>
        <bar>My replacement</bar>
      </p:inline>
    </p:with-input>
  </p:replace>
  
</p:declare-step>

Output

<?xml version="1.0" encoding="UTF-8"?>
<bar>My replacement</bar>

<p:wrap/>

With p:wrap you can wrap matching nodes with a new parent element just like wrapping a box with wrapping paper. The nodes to be matched are passed with the match option and the wrapper element is specified by the wrapper option.

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  
  <p:input port="source">
    <p:inline>
      <doc>
        <para>I am a very strong believer in 
          listening and learning from others.</para>
        <source>Ruth Bader Ginsburg</source>
      </doc>
    </p:inline>
  </p:input>
  
  <p:output port="result"/>
  
  <p:wrap match="/doc/para" wrapper="quote"/>

</p:declare-step>

Output

<?xml version="1.0" encoding="UTF-8"?>
<doc>
  <quote>
    <para>I am a very strong believer in 
      listening and learning from others.</para>
  </quote>
  <source>Ruth Bader Ginsburg</source>
</doc>
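
p:wrap also has a group-adjacent option, which lets adjacent matching nodes share a single wrapper instead of each getting its own. The following is a minimal sketch with made-up input. It assumes that grouping on local-name() is enough to treat both para elements as one group, which is worth checking against the step's specification for your processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source">
    <p:inline>
      <doc>
        <para>First line of the quote.</para>
        <para>Second line of the quote.</para>
        <source>Unknown</source>
      </doc>
    </p:inline>
  </p:input>

  <p:output port="result"/>

  <!-- Adjacent para elements yield the same group-adjacent value,
       so they should be wrapped together in one quote element -->
  <p:wrap match="para" wrapper="quote" group-adjacent="local-name()"/>

</p:declare-step>
```

Without group-adjacent, each para would be wrapped in its own quote element.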

<p:unwrap/>

This step provides the reverse operation: each element matched by the match option is removed, and its children take its place.

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  
  <p:input port="source">
    <p:inline>
      <doc>
        <quote>
          <para>I am a very strong believer in 
          listening and learning from others.</para>
        </quote>
        <source>Ruth Bader Ginsburg</source>
      </doc>
    </p:inline>
  </p:input>
  
  <p:output port="result"/>
  
  <p:unwrap match="quote"/>

</p:declare-step>

Output

<?xml version="1.0" encoding="UTF-8"?>
<doc>
  <para>I am a very strong believer in
    listening and learning from others.</para>  
  <source>Ruth Bader Ginsburg</source>
</doc>

<p:add-attribute/>

p:add-attribute creates an attribute on each matching element and provides the result on the output port. We used this step frequently in previous lessons, so let’s take a slightly more complex example. Here we want to add the matching price to each product. Products and prices are connected via their @id and @ref attributes, and we use p:viewport to process each product in turn.

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0" 
  name="my-pipeline">

  <p:input port="source" primary="true">
    <p:inline>
      <product-catalogue>
        <product id="prc-01">High Pressure Water Broom</product>
        <product id="prc-02">Electric Pasta Maker</product>
        <product id="prc-03">Retractable Telescoping Stool</product>
      </product-catalogue>
    </p:inline>
  </p:input>
  
  <p:input port="prices" primary="false">
    <p:inline>
      <prices>
        <price ref="prc-01">120.43</price>
        <price ref="prc-02">40.87</price>
        <price ref="prc-03">61.23</price>
      </prices>
    </p:inline>
  </p:input>

  <p:output port="result"/>
  
  <p:viewport match="product">
    <p:variable name="product-id" select="product/@id"/>
    
    <p:add-attribute match="product" attribute-name="price">
      <p:with-option name="attribute-value" select="//price[@ref eq $product-id]" pipe="prices@my-pipeline"/>
    </p:add-attribute>
    
  </p:viewport>

</p:declare-step>

Output

<?xml version="1.0" encoding="UTF-8"?>
<product-catalogue>
  <product price="120.43" id="prc-01">High Pressure Water Broom</product>
  <product price="40.87" id="prc-02">Electric Pasta Maker</product>
  <product price="61.23" id="prc-03">Retractable Telescoping Stool</product>
</product-catalogue>
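
For comparison, here is the simplest form of the step, with a fixed attribute value and no viewport or second input. The status attribute and its value draft are hypothetical and serve only as an illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source">
    <p:inline>
      <doc>
        <title>Poem about XML</title>
      </doc>
    </p:inline>
  </p:input>

  <p:output port="result"/>

  <!-- Add status="draft" to the document element -->
  <p:add-attribute match="/doc" attribute-name="status" attribute-value="draft"/>

</p:declare-step>
```

The price example above differs only in that the attribute value is computed with p:with-option rather than written as a literal.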
