Rerouting and Interrupting the Data Flow
The data flow in a pipeline does not have to follow a linear order. On the contrary, if steps draw data from other steps in the pipeline, the data flow can branch and merge, or be interrupted at some point and continued later. For this purpose, XProc comes with two very simple but frequently used steps: p:identity and p:sink.
We used p:identity before. The purpose of this step is simply to replicate its input as output. In contrast, p:sink takes its input and discards it; it is one of the steps without an output port. Because they are so simple, these two steps are well suited for controlling the data flow in a pipeline. Let’s have a look at this example:
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source">
    <p:inline>
      <doc>Hello XPorc!</doc>
    </p:inline>
  </p:input>
  <p:output port="result"/>
  <p:identity name="identity"/>
  <p:sink/>
  <p:identity>
    <p:with-input pipe="result@identity"/>
  </p:identity>
</p:declare-step>
Output
<?xml version="1.0" encoding="UTF-8"?>
<doc>Hello XPorc!</doc>
Let’s take a closer look at what is happening:
- The first p:identity takes the input and passes it to the output unchanged. The @name attribute is set so that we can reference the step’s output later.
- The p:sink discards the output of our p:identity.
- After p:sink, there is no output, but our next step expects an input. To avoid an error, the second p:identity draws its input from the first p:identity and continues the data flow.
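Referencing a named step can also be used to branch and merge the data flow. The following is only a minimal sketch, not part of the original example: the step names original and marked, the status attribute, and the merged wrapper element are made up for illustration. It assumes the standard steps p:add-attribute and p:wrap-sequence, and that the pipe attribute accepts a space-separated list of port@step references.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source">
    <p:inline>
      <doc>Hello XPorc!</doc>
    </p:inline>
  </p:input>
  <p:output port="result"/>

  <!-- Branch point: keep the unchanged document under a name (hypothetical name "original") -->
  <p:identity name="original"/>

  <!-- One branch modifies the document; the step is named so it can be referenced too -->
  <p:add-attribute name="marked" attribute-name="status" attribute-value="changed"/>

  <!-- Merge: read both the unchanged and the modified document and wrap them together -->
  <p:wrap-sequence wrapper="merged">
    <p:with-input pipe="result@original result@marked"/>
  </p:wrap-sequence>
</p:declare-step>
```

Under these assumptions, the result should be a single merged element that contains the original doc element followed by the copy carrying the status attribute.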
p:identity and p:sink are ideal for rerouting and stopping the data flow. Often you can omit the p:identity by connecting the input port of another step directly with p:with-input. Nevertheless, keep in mind that your XProc code is easier to read the less often you interrupt the data flow and continue it elsewhere.
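As an illustration of omitting the second p:identity, here is a hedged variant of the example above. It assumes that some other step follows the p:sink; p:add-attribute and the status attribute are chosen purely for illustration and do not come from the original pipeline.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source">
    <p:inline>
      <doc>Hello XPorc!</doc>
    </p:inline>
  </p:input>
  <p:output port="result"/>

  <!-- Named so that later steps can refer to its output -->
  <p:identity name="identity"/>

  <!-- Interrupts the default data flow -->
  <p:sink/>

  <!-- Reads directly from the named step, so no extra p:identity
       is needed to resume the data flow -->
  <p:add-attribute attribute-name="status" attribute-value="rerouted">
    <p:with-input pipe="result@identity"/>
  </p:add-attribute>
</p:declare-step>
```

The p:with-input here binds the primary input port of p:add-attribute to the named step, which is the same connection the second p:identity established in the original example.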