Steps to Manage Sequences
Many XProc steps accept or produce a sequence of documents, but a few are designed specifically for managing sequences:
<p:wrap-sequence/>
As discussed earlier, some XProc steps like p:for-each can produce a sequence of documents. Sometimes it is necessary to make a single document out of a sequence. For this purpose, you can use p:wrap-sequence.
Let’s say we have a directory in which we store the individual chapters of a book as XML documents. We want to load all the chapters in this directory and combine them into a single book element. First we list the contents of the directory with p:directory-list. Then we iterate over this list with p:for-each and use p:load to get the contents of each chapter. Finally, we take this sequence of documents and create a single XML document with p:wrap-sequence.
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                version="3.0">
  <p:output port="result"/>
  <p:option name="chapter-dir" select="'my-dir'"/>
  <p:directory-list path="{$chapter-dir}"/>
  <p:for-each>
    <p:with-input select="//c:file"/>
    <p:load href="{$chapter-dir || '/' || c:file/@name}"/>
  </p:for-each>
  <p:wrap-sequence wrapper="book"/>
</p:declare-step>
Output
<?xml version="1.0" encoding="UTF-8"?>
<book>
  <chapter>
    <title>Chapter I</title>
    <para>Here is some text.</para>
  </chapter>
  <chapter>
    <title>Chapter II</title>
    <para>Another part of text.</para>
  </chapter>
  <chapter>
    <title>Chapter III</title>
    <para>This is actually text.</para>
  </chapter>
</book>
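To try p:wrap-sequence without a chapter directory on disk, a minimal sketch can feed the step inline documents instead. The two chapter elements below are hypothetical stand-ins for the loaded files and are not part of the example above; everything else uses the same step and option.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <!-- Assumption: two inline chapters replace the documents read from disk -->
  <p:input port="source" sequence="true">
    <chapter><title>Chapter I</title></chapter>
    <chapter><title>Chapter II</title></chapter>
  </p:input>
  <p:output port="result"/>
  <!-- Wraps the whole input sequence in a single book element -->
  <p:wrap-sequence wrapper="book"/>
</p:declare-step>
```

Run this way, the result port should carry one document whose book root element contains both chapter elements in input order.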
<p:split-sequence/>
Another useful step for managing sequences is p:split-sequence. It splits a document sequence into two sequences based on an XPath expression passed in the test option. Documents for which the expression is true are written to the matched output port; all others are written to the not-matched port. If the initial-only option is set to true, the step stops testing at the first document that fails the test and sends that document and all remaining ones to the not-matched port, so only the initial run of matching documents reaches the matched port.
One use case for this step is splitting a sequence of XML documents by their validation result. Assuming the validation result is recorded in the valid attribute, we can use the attribute value to separate valid from invalid documents and route them to different output ports.
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source" sequence="true">
    <p:inline>
      <doc valid="false"/>
    </p:inline>
    <p:inline>
      <doc valid="false"/>
    </p:inline>
    <p:inline>
      <doc valid="true"/>
    </p:inline>
  </p:input>
  <p:output port="result" primary="true" sequence="true"/>
  <p:output port="failed" primary="false" sequence="true"
            pipe="not-matched@split-seq"/>
  <p:split-sequence name="split-seq" test="doc/@valid = 'true'"/>
</p:declare-step>
Output: Result
<?xml version="1.0" encoding="UTF-8"?> <doc valid="true"/>
Output: Failed
<?xml version="1.0" encoding="UTF-8"?>
<doc valid="false"/>
<?xml version="1.0" encoding="UTF-8"?>
<doc valid="false"/>
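The effect of initial-only is easiest to see when a failing document sits between two passing ones. The following sketch reuses the pipeline above with a reordered, hypothetical input sequence and initial-only set to true. With this setting, the first document that fails the test and every document after it go to the not-matched port, whether or not the later ones would pass.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <!-- Assumption: illustrative input order true, false, true -->
  <p:input port="source" sequence="true">
    <p:inline>
      <doc valid="true"/>
    </p:inline>
    <p:inline>
      <doc valid="false"/>
    </p:inline>
    <p:inline>
      <doc valid="true"/>
    </p:inline>
  </p:input>
  <p:output port="result" primary="true" sequence="true"/>
  <p:output port="failed" primary="false" sequence="true"
            pipe="not-matched@split-seq"/>
  <!-- Testing stops at the first document that does not match -->
  <p:split-sequence name="split-seq" initial-only="true"
                    test="doc/@valid = 'true'"/>
</p:declare-step>
```

Here only the first document should appear on the result port. The second and third documents should both appear on the failed port, even though the third has valid="true". Without initial-only, the first and third documents would reach result.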
<p:pack/>
p:pack merges two sequences of documents pairwise. The step takes the documents in pairs, one from the source port and one from the alternate port, wraps each pair in a new element named by the wrapper option, and writes the resulting element to the result port as a document.
Let me illustrate this with a simple example. Suppose we want to associate the photos in a collection with captions. We pass the file references to the source port and the caption texts to the captions port. p:pack reads both sequences and wraps each pair of documents, one from each sequence, in a new image element. Please note that we did not declare the connection between the pipeline’s source port and p:pack’s source port because these are primary input ports and are implicitly connected.
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0"
                name="my-pipeline">
  <p:input port="source" primary="true" sequence="true">
    <file href="img_hoover-dam-23023.jpg"/>
    <file href="img_kosciuszko-03121.jpg"/>
  </p:input>
  <p:input port="captions" primary="false" sequence="true">
    <caption>Justin the beaver visiting the Hoover Dam during his vacation in the USA.</caption>
    <caption>Arnold the termite and his colleagues make a selfie before Kościuszko Mound in Poland.</caption>
  </p:input>
  <p:output port="result" sequence="true"/>
  <p:pack wrapper="image">
    <p:with-input port="alternate" pipe="captions@my-pipeline"/>
  </p:pack>
</p:declare-step>
Output
<?xml version="1.0" encoding="UTF-8"?>
<image>
  <file href="img_hoover-dam-23023.jpg"/>
  <caption>Justin the beaver visiting the Hoover Dam during his vacation in the USA.</caption>
</image>
<?xml version="1.0" encoding="UTF-8"?>
<image>
  <file href="img_kosciuszko-03121.jpg"/>
  <caption>Arnold the termite and his colleagues make a selfie before Kościuszko Mound in Poland.</caption>
</image>
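The two sequences do not have to be the same length. According to the step's specification, once one sequence is exhausted p:pack simply wraps each remaining document of the longer sequence on its own. The sketch below shows this with three file references and only two captions; the file names and caption texts are hypothetical placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0"
                name="my-pipeline">
  <!-- Assumption: placeholder file names, one more than there are captions -->
  <p:input port="source" primary="true" sequence="true">
    <file href="a.jpg"/>
    <file href="b.jpg"/>
    <file href="c.jpg"/>
  </p:input>
  <p:input port="captions" primary="false" sequence="true">
    <caption>First caption.</caption>
    <caption>Second caption.</caption>
  </p:input>
  <p:output port="result" sequence="true"/>
  <p:pack wrapper="image">
    <p:with-input port="alternate" pipe="captions@my-pipeline"/>
  </p:pack>
</p:declare-step>
```

The first two image elements should each contain a file and a caption, while the third should contain only the file element for c.jpg. If every photo must have a caption, the sequences need to be checked or padded before they reach p:pack.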