Using and Defining Steps

If you think of an XProc pipeline as a bicycle chain, steps are the chain links. Steps are chained together via their port bindings, and just as one link in a chain passes force to the next link, a step passes data to the next step. Unlike chain links, though, steps are not all alike: XProc offers a wide variety of steps that serve different purposes, such as manipulating data, grouping other steps or handling errors.

A bicycle chain

The Type

XProc steps are distinguished from each other by their type. A step is invoked by its type, much like a function is called by its name in other programming languages. For example, p:load loads data, p:add-attribute attaches attributes to XML elements and p:identity does nothing except pass its input to its output. XProc’s standard steps are all bound to the namespace http://www.w3.org/ns/xproc and use the namespace prefix p by convention.
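As a minimal sketch of how steps are invoked by type and chained through their ports, the pipeline below runs p:identity and then p:count, another standard step that counts the documents arriving on its source port. The step name copy is an arbitrary choice for this illustration, and the explicit p:with-input binding only spells out the connection that XProc would otherwise make implicitly between consecutive steps.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source">
    <p:inline>
      <doc>Hello XPorc!</doc>
    </p:inline>
  </p:input>

  <p:output port="result"/>

  <!-- First link: passes the document through unchanged.
       The name "copy" is illustrative only. -->
  <p:identity name="copy"/>

  <!-- Second link: reads from the result port of the step named "copy".
       Without p:with-input the same connection is made by default. -->
  <p:count>
    <p:with-input port="source" pipe="result@copy"/>
  </p:count>

</p:declare-step>
```

Since a single document flows through the chain, the pipeline should produce a c:result element that reports a count of 1.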

In addition to XProc’s standard steps, you can define custom steps and assign them a specific type. The type must always be an XML qualified name in the format namespace-prefix:localname. You can choose any namespace and prefix, as long as they do not clash with those of XProc. To assign a type to a pipeline, add the type attribute to p:declare-step. In the example below, the step is given the type my:step, where the prefix my is bound to the namespace my-namespace.

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
  xmlns:my="my-namespace"
  version="3.0" 
  type="my:step">
  
  <p:input port="source"/>
  
  <p:output port="result"/>
  
  <p:identity/>
  
</p:declare-step>

Once we have defined our own step, we can import and call it in another pipeline. The pipeline below uses p:import to import the step, which is stored in the file my-step.xpl. Further down, the step is invoked by its type my:step.

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
  xmlns:my="my-namespace"
  version="3.0">
  
  <p:import href="my-step.xpl"/>
  
  <p:input port="source">
    <p:inline>
      <doc>Hello XPorc!</doc>
    </p:inline>
  </p:input>
  
  <p:output port="result"/>
  
  <my:step/>
  
</p:declare-step>

Output

<?xml version="1.0" encoding="UTF-8"?>
<doc xmlns:my="my-namespace">Hello XPorc!</doc>

Options

Options are name-value pairs that can be used to pass instructions to steps. Some options are required, others are optional.

For example, let’s take a closer look at the options of the p:add-attribute step. You must always set attribute-name and attribute-value. The match option specifies where the attribute is inserted. We can omit it because it is optional and defaults to "/*", i.e. an XSLT selection pattern that matches the root element. An option is required if the step’s declaration marks it as such (required="true"); such an option has no default value, so you must always set it.

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
    
  <p:input port="source" primary="true">
    <p:inline>
      <doc>Hello XPorc!</doc>
    </p:inline>
  </p:input>
  
  <p:output port="result"/>
  
  <p:add-attribute attribute-name="my-att" 
                   attribute-value="my-value"/>
  
</p:declare-step>

Output

<?xml version="1.0" encoding="UTF-8"?>
<doc my-att="my-value">Hello XPorc!</doc>
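To show the optional match option in use, here is a variation of the pipeline above, given as a sketch: the input document gains a nested title element (an addition made for this illustration), and match is set explicitly so that the attribute lands on title rather than on the root element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source" primary="true">
    <p:inline>
      <doc><title>Hello XPorc!</title></doc>
    </p:inline>
  </p:input>

  <p:output port="result"/>

  <!-- match overrides the default "/*" and selects the title element -->
  <p:add-attribute match="/doc/title"
                   attribute-name="my-att"
                   attribute-value="my-value"/>

</p:declare-step>
```

With this pattern, the root element doc stays untouched and only title should carry my-att="my-value".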

Ports

As mentioned earlier, steps have input and output ports through which they are connected to the pipeline or to other steps. Steps may have primary and secondary input and/or output ports. Each port is identified by a name and accepts either exactly one document or a sequence of zero or more documents. Additionally, ports can specify the content types (MIME types) of the documents the step can accept or generate.

In the example below, the input and output ports of the standard step p:insert are specified in pseudo code. There is a primary input port named source that accepts one document, followed by a secondary input port named insertion that can handle zero or more documents. The result output port produces one document. All ports can handle XML and HTML content.

<p:declare-step type="p:insert">
  <p:input port="source" primary="true" 
           content-types="xml html"/>
  <p:input port="insertion" sequence="true" 
           content-types="xml html"/>
  <p:output port="result" content-types="xml html"/>
</p:declare-step>

Please note that you can always look up step definitions in the XProc 3.0 spec.
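The following sketch shows how the two input ports of p:insert might be bound in practice. The primary source port receives the pipeline input implicitly, while the secondary insertion port is bound explicitly with p:with-input. The position option belongs to p:insert according to the spec but is not part of the pseudo code above, and the note element is made up for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:input port="source" primary="true">
    <p:inline>
      <doc>Hello XPorc!</doc>
    </p:inline>
  </p:input>

  <p:output port="result"/>

  <!-- source is connected implicitly to the pipeline input;
       insertion is a secondary port and must be bound by name -->
  <p:insert position="last-child">
    <p:with-input port="insertion">
      <p:inline>
        <note>inserted</note>
      </p:inline>
    </p:with-input>
  </p:insert>

</p:declare-step>
```

Because match is left at its default, the note element should be appended as the last child of the root element doc.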
