Steps to Manage Sequences

Many XProc steps accept or produce a sequence of documents. Three steps, however, are designed specifically for managing sequences:

<p:wrap-sequence/>
<p:split-sequence/>
<p:pack/>

<p:wrap-sequence/>

As discussed earlier, some XProc steps like p:for-each can produce a sequence of documents. Sometimes it is necessary to make a single document out of a sequence. For this purpose, you can use p:wrap-sequence.

Let’s say we have a directory in which we store the individual chapters of a book as XML documents. We want to load all the chapters in this directory and combine them into a single book element. First we list the contents of the directory with p:directory-list. Then we iterate over this list with p:for-each and use p:load to get the contents of each chapter. Finally, we take this sequence of documents and create a single XML document with p:wrap-sequence.

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" 
  xmlns:c="http://www.w3.org/ns/xproc-step"
  version="3.0">
  
  <p:output port="result"/>
  
  <p:option name="chapter-dir" select="'my-dir'"/>
  
  <p:directory-list path="{$chapter-dir}"/>
  
  <p:for-each>
    <p:with-input select="//c:file"/>
    
    <p:load href="{$chapter-dir || '/' || c:file/@name}"/>
    
  </p:for-each>
  
  <p:wrap-sequence wrapper="book"/>
  
</p:declare-step>

Output

<?xml version="1.0" encoding="UTF-8"?>
<book>
  <chapter>
    <title>Chapter I</title>
    <para>Here is some text.</para>
  </chapter>
  <chapter>
    <title>Chapter II</title>
    <para>Another part of text.</para>
  </chapter>
  <chapter>
    <title>Chapter III</title>
    <para>This is actually text.</para>
  </chapter>
</book>
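The example above wraps the whole sequence in one element. As a minimal sketch of a variation, the pipeline below uses the group-adjacent option of p:wrap-sequence, which evaluates an XPath expression against each document and wraps adjacent documents that yield the same value together. The part attribute and the part wrapper element are made up for this illustration. The inline documents stand in for the loaded chapter files, so the sketch does not depend on a directory.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  
  <!-- Hypothetical input: three chapters, tagged with the part they belong to -->
  <p:input port="source" sequence="true">
    <p:inline>
      <chapter part="1"><title>Chapter I</title></chapter>
    </p:inline>
    <p:inline>
      <chapter part="1"><title>Chapter II</title></chapter>
    </p:inline>
    <p:inline>
      <chapter part="2"><title>Chapter III</title></chapter>
    </p:inline>
  </p:input>
  
  <!-- More than one wrapper may be produced, so the output is a sequence -->
  <p:output port="result" sequence="true"/>
  
  <!-- Adjacent documents with the same @part value share one wrapper -->
  <p:wrap-sequence wrapper="part" group-adjacent="string(/chapter/@part)"/>
  
</p:declare-step>
```

With this input, the step should emit two part documents: the first containing Chapter I and Chapter II, the second containing Chapter III.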

<p:split-sequence/>

Another useful step to manage sequences is p:split-sequence. This step splits a document sequence into two sequences based on an XPath expression passed through the test option. The expression is evaluated once for each document: documents for which it is true are sent to the primary output port matched, all others to the output port not-matched. The initial-only option changes this behavior. If it is set to true, only the initial run of matching documents goes to matched. As soon as one document fails the test, that document and all documents after it are sent to not-matched, even if they would pass the test.

One use case for this step is splitting a sequence of XML documents based on their validation result. Let’s assume the validation result is expressed with the valid attribute. We can then use the attribute value to separate valid from invalid documents and direct them to different output ports.

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  
  <p:input port="source" sequence="true">
    <p:inline>
      <doc valid="false"/>
    </p:inline>
    <p:inline>
      <doc valid="false"/>
    </p:inline>
    <p:inline>
      <doc valid="true"/>
    </p:inline>
  </p:input>
  
  <p:output port="result" primary="true" sequence="true"/>
  
  <p:output port="failed" primary="false" sequence="true" 
            pipe="not-matched@split-seq"/>
  
  <p:split-sequence name="split-seq" test="doc/@valid = 'true'"/>
  
</p:declare-step>

Output: Result

<?xml version="1.0" encoding="UTF-8"?>
<doc valid="true"/>

Output: Failed

<?xml version="1.0" encoding="UTF-8"?>
<doc valid="false"/>

<?xml version="1.0" encoding="UTF-8"?>
<doc valid="false"/>
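The initial-only option is easiest to understand with a sequence in which a valid document follows an invalid one. The following is a minimal sketch under that assumption. The n attribute, the rest port, and the step name split-initial are invented for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  
  <p:input port="source" sequence="true">
    <p:inline>
      <doc n="1" valid="true"/>
    </p:inline>
    <p:inline>
      <doc n="2" valid="false"/>
    </p:inline>
    <p:inline>
      <doc n="3" valid="true"/>
    </p:inline>
  </p:input>
  
  <!-- Receives the matched port of the last step implicitly -->
  <p:output port="result" primary="true" sequence="true"/>
  
  <!-- Hypothetical secondary port for everything that was not matched -->
  <p:output port="rest" primary="false" sequence="true" 
            pipe="not-matched@split-initial"/>
  
  <!-- Stop matching at the first document that fails the test -->
  <p:split-sequence name="split-initial" initial-only="true" 
                    test="doc/@valid = 'true'"/>
  
</p:declare-step>
```

Here only document 1 should appear on the result port. Document 2 fails the test, so it and document 3 are both sent to the rest port, although document 3 is valid. Without initial-only, documents 1 and 3 would be matched.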

<p:pack/>

p:pack merges two sequences of documents pairwise. The step processes each pair of documents sequentially, one from the source port and one from the alternate port, encapsulates them in a new element specified by the wrapper option, and outputs the resulting element to the result port as a document.

Let me illustrate this with a simple example. Suppose we want to associate a photo collection with captions. We pass the file references to the pipeline’s source port and the texts to its captions port. The captions port is then connected explicitly to the alternate port of p:pack. p:pack reads both sequences and wraps each pair of documents in a new image element. Please note that we did not declare the connection between the pipeline’s source port and p:pack’s source port: both are primary input ports and therefore implicitly connected.

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0" 
  name="my-pipeline">
  
  <p:input port="source" primary="true" sequence="true">
    <file href="img_hoover-dam-23023.jpg"/>
    <file href="img_kosciuszko-03121.jpg"/>
  </p:input>
  
  <p:input port="captions" primary="false" sequence="true">
    <caption>Justin the beaver visiting the Hoover 
      Dam during his vacation in the USA.</caption>
    <caption>Arnold the termite and his colleagues 
      make a selfie before Kościuszko Mound in Poland.</caption>
  </p:input>
  
  <p:output port="result" sequence="true"/>
  
  <p:pack wrapper="image">
    <p:with-input port="alternate" pipe="captions@my-pipeline"/>
  </p:pack>
  
</p:declare-step>

Output

<?xml version="1.0" encoding="UTF-8"?>
<image>
  <file href="img_hoover-dam-23023.jpg"/>
  <caption>Justin the beaver visiting the Hoover
    Dam during his vacation in the USA.</caption>
</image>

<?xml version="1.0" encoding="UTF-8"?>
<image>
  <file href="img_kosciuszko-03121.jpg"/>
  <caption>Arnold the termite and his colleagues 
    make a selfie before Kościuszko Mound in Poland.</caption>
</image>
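The example above uses two sequences of equal length. If one sequence is longer than the other, p:pack does not fail but wraps each remaining document of the longer sequence on its own. The following sketch shows this with two files and a single caption. The file names, the caption text, and the pipeline name uneven are placeholders chosen for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0" 
  name="uneven">
  
  <!-- Two documents on the source port -->
  <p:input port="source" primary="true" sequence="true">
    <file href="a.jpg"/>
    <file href="b.jpg"/>
  </p:input>
  
  <!-- Only one document on the captions port -->
  <p:input port="captions" primary="false" sequence="true">
    <caption>Caption for the first image only.</caption>
  </p:input>
  
  <p:output port="result" sequence="true"/>
  
  <p:pack wrapper="image">
    <p:with-input port="alternate" pipe="captions@uneven"/>
  </p:pack>
  
</p:declare-step>
```

The result should again be two image documents. The first contains a file and a caption element, while the second contains only the file element for b.jpg.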
