Automate documentation with XProc and the GitHub Web API

for the Open Source framework transpect

martin.kraetke@le-tex.de / @mkraetke

Background

code documentation before transpect

Image: a medieval-looking picture of a monk writing in a book with otherwise blank pages. One page shows the text 'README'.

only a few READMEs and wiki pages existed
some inline XSLT comments
no common methodology or standards

how the code was organized before transpect

many repositories in our non-public SVN
code was copied from project to project
several languages were used to glue XSLT pipelines together

Towards a modular architecture

Pipelining previously

pipelines glued together with Make, Ruby, and Perl
XSLT micropipelines
depended on preinstalled tools and the operating system

Pipelining with XProc

declarative vocabulary
port connections
interoperability

Encapsulation and modularity

reusable steps with p:declare-step and p:import
Canonical import URIs
XML Catalogs

Declaring canonical import URIs with XML Catalogs

<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  
  <rewriteURI uriStartString="http://transpect.io/docx2tex/" 
              rewritePrefix="../"/>
              
  <nextCatalog catalog="../docx2hub/xmlcatalog/catalog.xml"/>
  <nextCatalog catalog="../xproc-util/xmlcatalog/catalog.xml"/>
  <nextCatalog catalog="../xslt-util/xmlcatalog/catalog.xml"/>
  <nextCatalog catalog="../evolve-hub/xmlcatalog/catalog.xml"/>
  <nextCatalog catalog="../mml2tex/xmlcatalog/catalog.xml"/>
  <nextCatalog catalog="../xml2tex/xmlcatalog/catalog.xml"/>
</catalog>

Import and use steps

<p:import href="http://transpect.io/xml2tex/xpl/xml2tex.xpl"/>
              
<xml2tex:convert name="xml2tex">
  <p:input port="conf">
    <p:pipe port="result" step="load-config"/>
  </p:input>
  <p:with-option name="table-model" select="$table-model"/>
  <p:with-option name="table-grid" select="$table-grid"/>
  <p:with-option name="debug" select="$debug"/>
  <p:with-option name="debug-dir-uri" select="$debug-dir-uri"/>
  <p:with-option name="status-dir-uri" select="$status-dir-uri"/>
  <p:with-option name="fail-on-error" select="$fail-on-error"/>
</xml2tex:convert>

We named it transpect and published it under the FreeBSD license.

Still, only a few guides and some inline documentation existed, and there were no standards.

Establishing standards

Coding style

more details here

Naming conventions

MyProject/
  |--xmlcatalog/
  |  |--catalog.xml
  |--moduleXY
  |--css/
  |  |--myStylesheet.css
  |--xpl/
  |  |--myPipeline.xpl
  |--xsl/
  |  |--myXSLT.xsl

common directory structure of a transpect module

Moving to GitHub: The good parts

SVN is our favorite version control system, but …

… we wanted to lower the barrier for external users to use our code, file bug reports, open pull requests, etc.

The Good, the Bad and GitHub

GitHub downtimes (bad for our CI system)
the SVN adapter sometimes does not seem to work properly
Git submodules (detached HEAD)

Implementation

XProc inline documentation

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
  xmlns:c="http://www.w3.org/ns/xproc-step" version="1.0">
  
  <p:documentation xmlns="http://www.w3.org/1999/xhtml">
    The documentation may include 
    <a href="https://en.wikipedia.org/wiki/HTML">HTML</a> 
    markup as well.
  </p:documentation>
  
  <p:input port="source">
    <p:inline>
      <doc>Hello world!</doc>
    </p:inline>
  </p:input>
  <p:output port="result"/>
  <p:identity/>
  
</p:declare-step>

Accessing the GitHub API

XProc p:http-request and the Calabash transparentJSON extension are used to access the GitHub API
two pipelines: one accesses the organization, the other recursively lists the folders and files in a repository
XProc pipelines, XSLT stylesheets and XML catalogs are included as well
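
A request to the GitHub API with the standard p:http-request step could look roughly like the sketch below. It is not the actual transpect pipeline: the step name list-repos is made up for illustration, the URL is GitHub's public route for listing the repositories of an organization, and the sketch assumes that the transparentJSON extension is enabled in Calabash so that the JSON response arrives as XML. How the extension is switched on depends on the Calabash setup and is left out here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical step: the name "list-repos" is an assumption, not part of transpect. -->
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
  xmlns:c="http://www.w3.org/ns/xproc-step" version="1.0" name="list-repos">

  <!-- The HTTP response. It is only XML if transparentJSON is enabled (assumption). -->
  <p:output port="result"/>

  <p:http-request>
    <p:input port="source">
      <p:inline>
        <!-- c:request describes the HTTP call; the organization name is taken from the slides -->
        <c:request method="GET" href="https://api.github.com/orgs/transpect/repos">
          <c:header name="Accept" value="application/json"/>
        </c:request>
      </p:inline>
    </p:input>
  </p:http-request>

</p:declare-step>
```

The second pipeline, which lists folders and files, would presumably issue similar requests against the contents URL of each repository.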

transparentJSON representation of a GitHub repository

<?xml version="1.0" encoding="UTF-8"?>
<j:item xmlns:j="http://marklogic.com/json" type="object">
  <j:language type="string">XProc</j:language>
  <j:svn_005furl type="string">https://github.com/transpect/cascade</j:svn_005furl>
  <j:forks type="number">0</j:forks>
  <j:ssh_005furl type="string">git@github.com:transpect/cascade.git</j:ssh_005furl>
  <j:full_005fname type="string">transpect/cascade</j:full_005fname>
  <j:clone_005furl type="string">https://github.com/transpect/cascade.git</j:clone_005furl>
  <j:default_005fbranch type="string">master</j:default_005fbranch>
  <j:private type="boolean">false</j:private>
  <j:open_005fissues_005fcount type="number">0</j:open_005fissues_005fcount>
  <j:description type="string">Libraries to implement a transpect cascade configuration</j:description>
  <j:git_005furl type="string">git://github.com/transpect/cascade.git</j:git_005furl>
  <j:has_005fissues type="boolean">true</j:has_005fissues>
  <j:contents_005furl type="string">https://api.github.com/repos/transpect/cascade/contents/{+path}</j:contents_005furl>  
</j:item>
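
Once the response is XML, it can be processed with ordinary XSLT. The stylesheet below is a minimal sketch of how the fields of the repository object shown above might be picked out. The output vocabulary (repository, name, branch, contents-url, description) is invented for this example and is not what transpect produces.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:j="http://marklogic.com/json"
  exclude-result-prefixes="j">

  <!-- One repository object. All output names are assumptions made for this sketch. -->
  <xsl:template match="j:item[@type = 'object']">
    <repository name="{j:full_005fname}" branch="{j:default_005fbranch}">
      <!-- _005f is the escaped underscore of the JSON key, e.g. contents_url.
           Everything from the URI template marker onwards is cut off. -->
      <contents-url>
        <xsl:value-of select="substring-before(j:contents_005furl, '{')"/>
      </contents-url>
      <description>
        <xsl:value-of select="j:description"/>
      </description>
    </repository>
  </xsl:template>

</xsl:stylesheet>
```

For the sample above, this yields a repository element named transpect/cascade on branch master, with the contents URL cut off before the {+path} template.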

Generating the documentation XML

analyze XProc documentation elements, input and output ports, options, and imports, then generate DocBook
XInclude is used to include the generated DocBook in the general DocBook documentation
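
The analysis step can be pictured as a stylesheet that maps a step declaration to a DocBook section. The following is a simplified sketch under assumptions, not the transpect stylesheet: it flattens p:documentation to plain text, ignores imports, and the choice of section and variablelist as target markup is only one possibility.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:p="http://www.w3.org/ns/xproc"
  xmlns="http://docbook.org/ns/docbook"
  exclude-result-prefixes="p">

  <!-- One DocBook section per step declaration -->
  <xsl:template match="/p:declare-step">
    <section version="5.0">
      <title>
        <!-- Prefer the step type, fall back to its name -->
        <xsl:value-of select="(@type, @name, 'unnamed step')[1]"/>
      </title>
      <para>
        <!-- Step documentation, flattened to text (HTML markup is dropped) -->
        <xsl:value-of select="normalize-space(p:documentation[1])"/>
      </para>
      <xsl:if test="p:input | p:output | p:option">
        <variablelist>
          <xsl:apply-templates select="p:input | p:output | p:option"/>
        </variablelist>
      </xsl:if>
    </section>
  </xsl:template>

  <!-- Ports and options become variable list entries -->
  <xsl:template match="p:input | p:output | p:option">
    <varlistentry>
      <term>
        <!-- e.g. "input source" or "option debug" -->
        <xsl:value-of select="local-name(), (@port, @name)[1]"/>
      </term>
      <listitem>
        <para>
          <xsl:value-of select="(p:documentation/normalize-space(.), 'not documented')[1]"/>
        </para>
      </listitem>
    </varlistentry>
  </xsl:template>

</xsl:stylesheet>
```

Keeping each generated section in its own file is what makes the XInclude approach practical: the hand-written DocBook only references the generated parts and never needs to be edited when the code changes.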

HTML representation

HTML template and MaterializeCSS framework
DocBook is converted to HTML and injected into the template
Output HTML files are stored with XProc
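
Storing a result document from within a pipeline is done with the standard p:store step. The step below is a minimal sketch: the step name store-html and the option name out-uri are hypothetical, and the serialization settings are only one reasonable choice.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical step: "store-html" and "out-uri" are assumptions for this sketch. -->
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0" name="store-html">

  <!-- The finished HTML page, i.e. the template with the converted DocBook injected -->
  <p:input port="source"/>

  <!-- Target URI of the file to write -->
  <p:option name="out-uri" required="true"/>

  <p:store method="xhtml" indent="true" omit-xml-declaration="false">
    <p:with-option name="href" select="$out-uri"/>
  </p:store>

</p:declare-step>
```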

Finally

commit HTML and XML (still manually) to update http://transpect.io

Outlook

writing documentation
writing documentation
writing documentation

Outlook

Jenkins integration
use the inline documentation of input and output ports and options
perhaps visualize XProc pipelines


Thank you!
