Using and Defining Steps
If you think of an XProc pipeline as a bicycle chain, steps are the chain links. Steps are chained together via their port bindings, and just as one link in a chain passes force to the next, a step passes data to the next step. Unlike chain links, however, steps are not all alike: XProc offers a wide variety of steps that serve different purposes, such as manipulating data, grouping other steps, or handling errors.
The Type
XProc steps are distinguished from each other by their type. A step is referenced by its type, similar to a function call in other programming languages. For example, p:load loads data, p:add-attribute attaches attributes to XML elements, and p:identity does nothing except pass its input to its output. XProc’s standard steps are all bound to the namespace http://www.w3.org/ns/xproc and use the namespace prefix p by convention.
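As a minimal sketch of how steps are called by their type and chained through their ports, the pipeline below calls two standard steps in sequence. The step name pass-through and the attribute status="draft" are chosen for illustration only. The connection between the two steps is written out explicitly with p:with-input; if it were omitted, XProc would connect the primary ports implicitly.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source">
    <p:inline>
      <doc>Hello XPorc!</doc>
    </p:inline>
  </p:input>
  <p:output port="result"/>

  <!-- First link: passes the source document through unchanged.
       The name "pass-through" is a hypothetical label. -->
  <p:identity name="pass-through"/>

  <!-- Second link: reads the result port of the step above
       and adds an attribute to the root element. -->
  <p:add-attribute attribute-name="status" attribute-value="draft">
    <p:with-input port="source" pipe="result@pass-through"/>
  </p:add-attribute>
</p:declare-step>
```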
In addition to XProc’s standard steps, you can define custom steps and assign them a type of their own. The type must always be an XML qualified name in the format namespace-prefix:localname. You can define any namespace and prefix, as long as they don’t match those of XProc. To assign a type to a pipeline, add the type attribute to p:declare-step. In the example below, the pipeline is assigned the type my:step, whose prefix my is bound to the namespace my-namespace.
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:my="my-namespace"
                version="3.0"
                type="my:step">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:identity/>
</p:declare-step>
After defining our own step, we can import and call it in another pipeline. The pipeline below uses p:import to import the step stored in the file my-step.xpl and then calls the step by its type my:step.
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:my="my-namespace"
                version="3.0">
  <p:import href="my-step.xpl"/>
  <p:input port="source">
    <p:inline>
      <doc>Hello XPorc!</doc>
    </p:inline>
  </p:input>
  <p:output port="result"/>
  <my:step/>
</p:declare-step>
Output
<?xml version="1.0" encoding="UTF-8"?> <doc xmlns:my="my-namespace">Hello XPorc!</doc>
Options
Options are name-value pairs that pass instructions to steps. Some options are required, while others are optional.
For example, let’s take a closer look at the options of the p:add-attribute step. You must always set attribute-name and attribute-value. The match option specifies where the attribute is inserted. We can omit it because it is optional and defaults to "/*", i.e. an XSLT selection pattern that matches the root element. Required options have no default value, so you must always set them.
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source" primary="true">
    <p:inline>
      <doc>Hello XPorc!</doc>
    </p:inline>
  </p:input>
  <p:output port="result"/>
  <p:add-attribute attribute-name="my-att"
                   attribute-value="my-value"/>
</p:declare-step>
Output
<?xml version="1.0" encoding="UTF-8"?> <doc my-att="my-value">Hello XPorc!</doc>
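The same distinction between required and optional options applies to custom steps. The following sketch assumes a hypothetical step type my:mark with a required option named value and an optional option named match that falls back to "/*". Both the type and the option names are made up for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:my="my-namespace"
                version="3.0"
                type="my:mark">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Required option (hypothetical name): has no default,
       so every caller has to set it. -->
  <p:option name="value" required="true"/>

  <!-- Optional option (hypothetical name): falls back to the
       root element pattern if the caller omits it. -->
  <p:option name="match" select="'/*'"/>

  <!-- The option values are passed on via attribute value templates. -->
  <p:add-attribute match="{$match}"
                   attribute-name="status"
                   attribute-value="{$value}"/>
</p:declare-step>
```

A caller would then have to supply value, for instance with <my:mark value="draft"/>, while match may be left out.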
Ports
As mentioned earlier, steps have input and output ports through which they are connected to the pipeline or to other steps. A step may have primary and secondary input and output ports. Each port is identified by a name and either accepts or produces exactly one document or a sequence of zero or more documents. Additionally, a port can specify the content types (MIME types) of the documents it accepts or produces.
In the example below, the input and output ports of the standard step p:insert are specified in pseudo code. There is a primary input port named source that accepts one document, followed by a secondary input port named insertion that accepts zero or more documents. The result output port produces one document. All ports can handle XML and HTML content.
<p:declare-step type="p:insert">
  <p:input port="source" primary="true"
           content-types="xml html"/>
  <p:input port="insertion" sequence="true"
           content-types="xml html"/>
  <p:output port="result" content-types="xml html"/>
</p:declare-step>
Please note that you can always look up step definitions in the XProc 3.0 specification.
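To illustrate how a secondary input port is connected, here is a sketch of a pipeline that calls p:insert. The primary port source is connected implicitly to the pipeline's own source port, while the insertion port is bound explicitly with p:with-input. The inserted note element and the values chosen for match and position are assumptions for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source">
    <p:inline>
      <doc>Hello XPorc!</doc>
    </p:inline>
  </p:input>
  <p:output port="result"/>

  <!-- The primary port "source" receives the pipeline input implicitly. -->
  <p:insert match="/doc" position="last-child">
    <!-- The secondary port "insertion" has to be connected by name. -->
    <p:with-input port="insertion">
      <p:inline>
        <note>inserted</note>
      </p:inline>
    </p:with-input>
  </p:insert>
</p:declare-step>
```

With these settings, the note element should end up as the last child of doc.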