Documenting python code with reST, docutils, pygments, cog and paver

Producing documentation for python code in a file with reST, docutils, cog and paver

Introduction

Documentation in multiple formats, from documentation source in reStructured Text (reST) plain text markup format, for code in a python project can be produced using the python package Sphinx. Sphinx is especially suited for handling cases where the documentation source spans multiple files. It can be used to produce a table of contents, cross-references, a global index and a module index, and it even provides search functionality. For python code in one source file, documentation can usually be provided using a single documentation source file. Sphinx can be used in such cases, but most of the features provided by Sphinx are not needed.

The following describes a simple recipe for producing html documentation, from plain text documentation source in a single file written in the reST format, for python code in a single file. These ideas closely follow Doug Hellmann’s description of his tool chain for producing documentation for the Python Module Of the Week series. The code in the paver configuration file mentioned below is copied and adapted from the pavement.py file distributed with the Python Module Of the Week source.

The following is assumed about the python source code:

The python source code is in a single file, which can be run to produce some output. For python code that provides functions rather than a script, this can be achieved by including a usage example under the if __name__ == "__main__": construct. Any plots produced by the program are assumed to be saved in image files, and then linked in the documentation source file using the .. image:: reST directive.

We require that the documentation include the following:

  1. Description of the code and its usage.
  2. The entire code file, with syntax highlighting.
  3. Output obtained while running the code.

We use reST for writing the text of the documentation and convert it into html using docutils. The output of code execution is captured using the cog module. The steps for producing the html document are automated using the paver module.

For illustration we use a python file test.py containing a few functions for working with string representation of time and angles and a function that creates a simple plot using matplotlib and saves it into a file. Text in reST format used to produce this article is shown at the end of this article.

Tools needed

  1. Python (2.4 or above should work)
  2. docutils. This package includes the rst2html.py program.
  3. pygments. This package includes the pygmentize program.
  4. paver. This package includes cog. See the cog website for more on cog itself.
  5. sphinxcontrib.paverutils

Manual work flow

Syntax highlighting of source file

With Sphinx we can easily produce syntax highlighted source code for source code in a file that is separate from the documentation source file. With docutils we can include source code from files but cannot get it syntax highlighted. However, docutils provides a way of including raw html from other files. So if we have a chunk of html containing the syntax highlighted python code, it can be included in the main documentation. To produce such a chunk of html we can use the pygmentize program from the Pygments package.

The following command will produce an html file with syntax highlighted code from the python file. The noclasses=True option will cause pygmentize to include the css styling inline rather than in the head of the html. We need this since we will be using this chunk of html inside the main html document and hence do not want elements such as head in this chunk.

pygmentize -O "noclasses=True" -o test.py.html test.py
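
The same kind of html chunk can also be produced from Python through the Pygments API. The following is a minimal sketch, assuming Pygments is installed and a Python 3 interpreter is used; the helper name highlight_source is hypothetical and is not part of the tool chain described in this article.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import PythonLexer


def highlight_source(code):
    """Return an html fragment with inline styles for the given Python code.

    highlight_source is a hypothetical helper name used for illustration.
    """
    # noclasses=True mirrors the -O "noclasses=True" command line option:
    # styles are written into style="..." attributes instead of css classes.
    formatter = HtmlFormatter(noclasses=True)
    return highlight(code, PythonLexer(), formatter)


fragment = highlight_source("import math\nprint(math.pi)\n")
```

The returned fragment carries its styling inline and has no head element, so it can be written to a file such as test.py.html and included as raw html.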

Including the html chunk into the documentation file

The reST directive for including raw html is:

.. raw:: html
   :file: test.py.html

This directive will insert the contents of the file test.py.html without processing it.

The following block of text is the code in the file test.py, with syntax highlighting provided by pygmentize and inserted using the above directive:

"""Some functions for handling string representations of angle and time.

"""
import math
import matplotlib.pyplot as plt
import numpy as np

def babylonian_to_decimal(xyz,delimiter=None):
    """ Decimal value from numbers such as angles and time in "base 60" system.

    Given an angle or time converts it into decimal value. The input value can
    be either a floating point number or a string such as "hh mm ss.ss" or
    "dd mm ss.ss". Delimiters other than " " can be specified using the keyword
    delimiter.

    :param xyz:
        a positive floating point number or delimited string
    :type xyz:
        Float, Integer or String
    :param delimiter:
        delimiter used if xyz is a string. Default value is None, in which case
        " " is used.
    :type delimiter:
        String

    :returns:
        decimal_value : a floating point number 

    >>> babylonian_to_decimal("12:00:00.0",":")
    12.0
    >>> babylonian_to_decimal("12.0")
    12.0
    >>> babylonian_to_decimal(12.0)
    12.0
    >>> babylonian_to_decimal("12 30 00.00")
    12.5
    >>> babylonian_to_decimal("-12 30 00.00")
    -12.5
    >>> babylonian_to_decimal("-00 30 00.00")
    -0.5
    """
    divisors = [1.0,60.0,3600.0]
    xyzlist = str(xyz).split(delimiter)
    sign = -1 if xyzlist[0].find("-") != -1 else 1
    xyzlist = [abs(float(x)) for x in xyzlist]
    decimal_value = 0

    for i,j in zip(xyzlist,divisors):
        decimal_value += i/j

    decimal_value = -decimal_value if sign == -1 else decimal_value
    return decimal_value

def hms_to_decimaldeg(hms,delimiter=None):
    """Return decimal degrees given hours, minutes and seconds.

    :param hms:
        floating point number or delimited string specifying hours,
        minutes and seconds.
    :type hms:
        Float or String
    :param delimiter:
        delimiter used in string, default is whitespace.
    :type delimiter:
        String
       
    :returns:
        floating point decimal degrees.

    >>> hms_to_decimaldeg("12:00:00",delimiter=":")
    180.0
    >>> hms_to_decimaldeg("12.0")
    180.0
    >>> hms_to_decimaldeg(12.0)
    180.0
    >>> hms_to_decimaldeg("12 30 00.00")
    187.5
    >>> hms_to_decimaldeg("-12 30 00.00")
    -187.5
    >>> hms_to_decimaldeg(-12.5)
    -187.5
    >>> hms_to_decimaldeg("-00 30 00")
    -7.5
    """
    decimal_value = babylonian_to_decimal(hms,delimiter=delimiter)
    return decimal_value/24.0*360.0

def decimaldeg_to_hms(deg,delimiter=None,norm=False):
    """Convert degrees into delimited string of hours, minutes and seconds.
    """
    if deg < 0 :
        negative = True
    else:
        negative = False

    deg = abs(float(deg)) * 24.0 / 360.0 # remove negative sign; put back at
                                         # the end
    minutes, hours = math.modf(deg)
    hours = int(hours) # hours is integer valued but type is float
    seconds, minutes = math.modf(minutes*60.0)
    minutes = int(minutes) # minutes is integer valued but type is float
    seconds = seconds*60.0 # number of seconds between 0 and 60

    # Keep seconds and minutes in [0 - 60.0000)
    if abs(seconds - 60.0) < 0.0001:
        seconds = 0.0
        minutes += 1
    if abs(minutes - 60.0) < 0.0001:
        minutes = 0.0
        hours += 1

    if norm:
        hours = (hours % 24) # 52 hours => 4 hours on 24 hour cycle
                             # hours is integer number of hours
                             # so no need to adjust minutes and seconds
    if not delimiter:
        delimiter = " "

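    # Format the sign separately: negating hours loses the sign when
    # hours is 0 (e.g. -0.5 hours would be printed with a "+").
    sign = "-" if negative else "+"

    hmsstring = "%s%02d %02d %07.4f" % (sign,hours,abs(minutes),abs(seconds))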
    hmsstring = hmsstring.replace(" ",delimiter)

    return hmsstring

def test_plot():
    """Creates a simple sin(x) plot and saves it into test.png"""
    x = (np.arange(1,1000.1,0.1)/1000.0)*2*np.pi
    y = np.sin(x)
    plt.plot(x,y,".")
    plt.savefig("test.png")
    plt.close()

if __name__ == '__main__':
    print decimaldeg_to_hms(23.5678)
    print decimaldeg_to_hms(-23.5678)
    print hms_to_decimaldeg("+01 34 16.2720")
    print hms_to_decimaldeg("-01 34 16.2720")
    test_plot()
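
As a worked example of the arithmetic used by babylonian_to_decimal and hms_to_decimaldeg, the following sketch converts a "hh mm ss.ss" string to decimal hours and then to degrees. It is written for Python 3 and is independent of test.py; the name sexagesimal_to_decimal is hypothetical.

```python
def sexagesimal_to_decimal(text, delimiter=None):
    """Convert "hh mm ss.ss" (or "dd mm ss.ss") to a decimal number."""
    fields = text.split(delimiter)
    # The sign is carried by the first field only, e.g. "-00 30 00".
    sign = -1.0 if fields[0].strip().startswith("-") else 1.0
    divisors = (1.0, 60.0, 3600.0)
    total = sum(abs(float(f)) / d for f, d in zip(fields, divisors))
    return sign * total


hours = sexagesimal_to_decimal("12 30 00.00")   # 12 + 30/60 + 0/3600
degrees = hours * 360.0 / 24.0                  # 24 hours span 360 degrees
```

Here hours is 12.5 and degrees is 187.5, matching the doctest values shown in the listing above.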

Running code and inserting output into the documentation file

cog can be used on its own as described at the cog website. In this case the code to be executed is provided within the cog specification and the result is inserted right where the cog specification is provided. Here we want to execute code in a separate file and then insert the results into the documentation source. For this we use cog in conjunction with paver.

To do this we place the following cog specification in the main documentation file, at the point where the result should be inserted. The leading “..” ensures that docutils, used to convert the documentation file into html, treats these lines as comments and hence ignores them. When cog is run on the file, the space between the line containing the three closing braces and the line containing “end” between three opening and closing braces is replaced with the output from the program.

The cog specification is shown below.

.. {{{cog
.. cog.out(run_script(cog.inFile, 'test.py', ignore_error=True))
.. }}}
.. {{{end}}}
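
To make the replacement rule concrete, here is a simplified illustration of how the text between the closing-brace line and the end line gets swapped for fresh output. This is a sketch for Python 3 under stated assumptions, not cog's implementation: it handles one well-formed block, does not execute the code in the specification, and the function name replace_generated_output is hypothetical.

```python
BEGIN_SPEC = "{{{cog"
END_SPEC = "}}}"
END_OUTPUT = "{{{end}}}"


def replace_generated_output(lines, new_output):
    """Replace the text between the END_SPEC line and the END_OUTPUT line.

    Simplified illustration only: it handles a single, well-formed block
    and does not execute the code inside the specification.
    """
    result = []
    state = "text"  # one of "text", "spec", "output"
    for line in lines:
        stripped = line.strip()
        if state == "text":
            result.append(line)
            if stripped.endswith(BEGIN_SPEC):
                state = "spec"
        elif state == "spec":
            result.append(line)
            if stripped.endswith(END_SPEC):
                # Old output follows; drop it and write the new output.
                result.extend(new_output)
                state = "output"
        else:  # state == "output"
            if stripped.endswith(END_OUTPUT):
                result.append(line)
                state = "text"
    return result


source = [
    ".. {{{cog",
    ".. cog.out('hello')",
    ".. }}}",
    "stale output",
    ".. {{{end}}}",
]
updated = replace_generated_output(source, ["", "hello", ""])
```

The marker lines survive the rewrite, which is why the documentation source can be processed repeatedly without accumulating old output.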

Note

To show the cog markup used, it is placed in a separate file, cog_spec.txt, and then the reST directive:

.. include:: cog_spec.txt
   :literal:

is used in the documentation source file to insert its contents into the final html file. This step is needed since cog is run on the reST file used to produce this document and cog will try to process this markup if it is directly included in the reST file.

Output obtained from running cog on test.py:



$ python test.py
+01 34 16.2720
-01 34 16.2720
23.5678
-23.5678
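
The block above consists of the command line followed by the program output, indented as a reST literal block. A minimal standard-library sketch of that idea follows, assuming Python 3. The name run_and_format is hypothetical, and this is not the actual implementation of run_script.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path


def run_and_format(script_path):
    """Run a Python script and return its output as a reST literal block.

    Hypothetical stand-in for illustration; not the implementation of
    run_script.
    """
    completed = subprocess.run(
        [sys.executable, str(script_path)],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    shown = "$ python %s\n%s" % (Path(script_path).name, completed.stdout)
    # A "::" paragraph followed by an indented block is a reST literal block.
    return "\n::\n\n" + textwrap.indent(shown, "    ") + "\n"


# Demonstrate with a throwaway script written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "hello.py"
    script.write_text("print('+01 34 16.2720')\n")
    block = run_and_format(script)
```

Passing such a string to cog.out from within the cog specification is what places the output between the marker lines.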


Plots can be saved to a file and then included in the documentation file using the following directive:

.. image:: filename.png

In the test.py file used in this example, the plot is saved in an image from within the test_plot function. To include the image in the html document we use the following reST directive:

.. image:: test.png
   :scale: 50 %
   :alt: A Simple Plot
   :align: center

The plot produced by the function test_plot in test.py is shown below:

A simple plot

Producing html version of the documentation

We run the program rst2html.py, that comes with docutils, to convert our documentation file into html.

rst2html.py --no-xml-declaration index.rst index.rst.html

Tip

If we want to syntax highlight any python code present inside our reST source file, we can use the rst2html-pygments.py program. The code to be syntax highlighted must be included under the .. code-block:: python directive.

For more about rst2html-pygments.py program refer to this url: http://docutils.sourceforge.net/sandbox/code-block-directive/data/

Automating work flow with paver

We provide paver tasks for each of the steps above. We also create a task for running all three steps in one go. The paver configuration file pavement.py can be edited to change the names of the python code file and the documentation text. Here these two are named test.py and index.rst respectively.

To generate the html chunk of syntax highlighted code we can issue the command:

paver py_html

To run the code and capture the output we run:

paver cog

Finally, to generate the html version of the documentation we run:

paver rst_html

The three steps can be executed with one command:

paver build

Paver configuration file

The contents of the pavement.py file are listed below. The blog task, which the build task also runs, cleans the html so that it can be easily posted to wordpress.com.

# -*- coding: utf-8 -*-
# References
# ----------
#
# .. _Doug Hellmann's description of his tool chain:
#    http://blog.doughellmann.com/2009/02/writing-technical-documentation-with.html
# .. _Python Module Of the Week: http://www.doughellmann.com/PyMOTW/
#
import os  # used by the cog task below (os.path.isdir)

import paver
import paver.doctools
import paver.misctasks
from paver.path import path
from paver.easy import *
import paver.setuputils
paver.setuputils.install_distutils_tasks()
from sphinxcontrib import paverutils

options (
    cog = Bunch(
        beginspec='{{{cog',
        endspec='}}}',
        endoutput='{{{end}}}',
        basedir='.',
    ),
    py_html = Bunch(
        infile="test.py",
    ),
    rst_html = Bunch(
        infile="index.rst",
    ),
    blog = Bunch(
        infile="index.rst.html",
    ),
    build = Bunch(
        ),
)

from paver.doctools import Includer, _cogsh

def _runcog(options, files, uncog=False):
    """Common functions for the cog and runcog tasks."""

    from paver.cog import Cog
    options.order('cog', add_rest=True)
    c = Cog()
    if uncog:
        c.options.bNoGenerate = True
    c.options.bReplace = True
    c.options.bDeleteCode = options.get("delete_code", False)

    c.sBeginSpec = options.get('beginspec', '[[[cog')
    c.sEndSpec = options.get('endspec', ']]]')
    c.sEndOutput = options.get('endoutput', '[[[end]]]')

    basedir = options.get('basedir', None)
    basedir = path(basedir)

    if not files:
        pattern = options.get("pattern", "*.rst")
        if pattern:
            files = basedir.walkfiles(pattern)
        else:
            files = basedir.walkfiles()
    for f in files:
        dry("cog %s" % f, c.processOneFile, f)
#

@task
@consume_args
def cog(options):
    options.order('cog', add_rest=True)
    # Figure out if we were given a filename or
    # directory, and scan the directory for files
    # if we need to.
    files_to_cog = getattr(options, 'args', [])
    if files_to_cog and os.path.isdir(files_to_cog[0]):
        dir_to_scan = path(files_to_cog[0])
        files_to_cog = dir_to_scan.walkfiles(options.get("pattern", "*.rst"))
    _runcog(options, files_to_cog)
    return

@task
def py_html(options):
    sh('pygmentize -O "noclasses=True" -o %s %s' % \
       (options.py_html.infile+".html", options.py_html.infile ))

@task
def rst_html(options):
    sh('rst2html.py --no-xml-declaration %s %s' % \
       (options.rst_html.infile, options.rst_html.infile+".html"))

def clean_blog_html(body):
    # Clean up the HTML
    from BeautifulSoup import BeautifulSoup as BSoup
    soup = BSoup(body)
    # Replace newlines in a paragraph and then add a newline at the end.
    for i in soup('p'):
      i.replaceWith(i.renderContents().replace("\n"," ")+"\n")

    s = soup.body.renderContents()

    return s

@task
def blog(options):
    infile = open(options.blog.infile,"r")
    body = "".join(infile.readlines())
    infile.close()
    body = clean_blog_html(body)
    names = options.blog.infile.split(".")
    outfile = open(".".join(names[:-1])+".blog."+names[-1],"w")
    outfile.writelines(body)
    outfile.close()

@task
def build(options):
    cog(options.cog)
    py_html(options.py_html)
    rst_html(options.rst_html)
    blog(options.blog)
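
As we read the blog task above, it replaces each html paragraph with its contents on a single line and writes the result to a file with ".blog" inserted before the final extension. The following standard-library sketch, for Python 3, approximates both steps. The function names are hypothetical, and the regular expression only handles simple, non-nested paragraphs, whereas pavement.py uses BeautifulSoup.

```python
import re


def blog_output_name(infile):
    """Insert ".blog" before the final extension, as the blog task does."""
    names = infile.split(".")
    return ".".join(names[:-1]) + ".blog." + names[-1]


def flatten_paragraphs(html):
    """Replace each <p>...</p> with its contents joined onto one line.

    Regex-based approximation for simple, non-nested paragraphs.
    """
    def _flatten(match):
        # Keep only the paragraph contents and end them with a newline.
        return match.group(1).replace("\n", " ") + "\n"

    return re.sub(r"<p\b[^>]*>(.*?)</p>", _flatten, html, flags=re.DOTALL)


name = blog_output_name("index.rst.html")
cleaned = flatten_paragraphs("<p>first\nsecond</p>\n<pre>a\nb</pre>\n")
```

Content outside paragraphs, such as the pre block holding the highlighted code, is left untouched, so line breaks in code listings are preserved.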

Summary

We need to produce documentation in html format, for python code in a single file, from a plain text documentation source written in reStructured Text format.

We convert the python source into a chunk of html with the code syntax highlighted using the program pygmentize that comes with the pygments package. This chunk of html is referred to in the documentation source using the .. raw:: html directive.

We then run cog, from within paver, on the documentation source in reStructured Text format to execute the python code and insert the text output from it into the documentation source file. The python code also saves a plot into an image file, which we refer to in the documentation source file using the .. image:: directive.

We then run the rst2html.py program on the documentation source file to produce the final html file.

Concluding remarks

We can easily include more tasks in the pavement.py file and incorporate more involved conversions. Here we aim for a simple conversion. For more involved documentation, we can always use the facilities provided by Sphinx.

reStructured Text source used in producing this html file

The reST file is shown below. Note that this file has the output of the cog run, shown in the block of text:

::

   $ python test.py
   +01 34 16.2720
   -01 34 16.2720
   23.5678
   -23.5678

inside the cog specification, in the section titled “Running code and inserting output into the documentation file”. While writing the reST text, we obviously do not write this block of text.

====================================================================================
Producing documentation for python code in a file with reST, docutils, cog and paver
====================================================================================

.. _Sphinx: http://sphinx.pocoo.org/
.. _reST: http://docutils.sourceforge.net/rst.html
.. _reStructured Text: http://docutils.sourceforge.net/rst.html
.. _docutils: http://docutils.sourceforge.net/
.. _cog: http://nedbatchelder.com/code/cog/
.. _paver: http://www.blueskyonmars.com/projects/paver/
.. _Doug Hellmann's descripition of his tool chain:
   http://blog.doughellmann.com/2009/02/writing-technical-documentation-with.html
.. _Python Module Of the Week: http://www.doughellmann.com/PyMOTW/
.. _matplotlib: http://matplotlib.sf.net/
.. _rst2html-highlight: http://docutils.sourceforge.net/sandbox/code-block-directive/
.. _Pygments: http://pygments.org/
.. _Python: http://www.python.org/
.. _sphinxcontrib.paverutils: http://www.doughellmann.com/projects/sphinxcontrib.paverutils/

.. contents::

Introduction
============

Documentation in multiple formats, from documentation source in
`reStructured Text`_ (reST) plain text markup format, for code in a
python project can be produced using the python package
`Sphinx`_. *Sphinx* is especially suited for handling cases where the
documentation source spans multiple files. It can be used to produce
table of contents, cross-references, global index, module index and
even provides search functionality. For python code in one source
file, documentation can usually be provided using a single
documentation source file.  *Sphinx* can be used in such cases, but
most of the features provided by *Sphinx* are not needed.

The following describes **a simple recipe for producing html
documentation, from plain text documentation source in a single file
written in the** `reST`_ **format, for python code in a single
file**. These ideas closely follow `Doug Hellmann's descripition of his
tool chain`_ for producing documentation for the `Python Module Of the
Week`_ series. The code in the `paver configuration file mentioned
below`_ is copied and adapted from the *pavement.py* file
distributed with the *Python Module Of the Week* source.

The following is assumed about the python source code:

The python source code is in a single file, which can be run to produce
some output. For python code that provides functions rather than a
script, this can be achieved by including a usage example under the
*if __name__ == "__main__":* construct. Any plots produced by the program
are assumed to be saved in image files, and then linked in the
documentation source file using the *.. image:: reST directive*.

We require that the documentation include the following:

#. Description of the code and its usage.
#. The entire code file, with syntax highlighting.
#. Output obtained while running the code.

We use `reST`_ for writing the text of the documentation and convert
it into html using `docutils`_. The output of code execution is
captured using the `cog`_ module. The steps for producing the html
document are automated using the `paver`_ module.

For illustration we use a python file `test.py` containing a few
functions for working with string representation of time and angles
and a function that creates a simple plot using `matplotlib`_ and
saves it into a file. Text in *reST* format used to produce this article
is `shown at the end of this article`_.

Tools needed
============

#. Python_ (2.4 or above should work)

#. `docutils`_

   This package includes the rst2html.py program.

#. `pygments`_

   This package includes the pygmentize program.

#. `paver`_

   This package includes cog. See `cog website`__ for more on *cog*
   itself.

#. `sphinxcontrib.paverutils`_

__ cog_

Manual work flow
================

Syntax highlighting of source file
----------------------------------

With `Sphinx` we can easily produce syntax highlighted source code for
source code in a file that is separate from the documentation source
file. With `docutils` we can include source code from files but cannot
get them syntax highlighted. But `docutils` provides a way of
including *raw html* from other files. So if we have a chunk of html
of the syntax highlighted python code, then it can be included into
the main documentation. To produce such a chunk of html we can use the
`pygmentize` program from the `Pygments`_ package.

The following command will produce an html file with syntax
highlighted code from the python file. The ``noclasses=True`` option
will cause pygmentize to include the css styling inline rather than in
the head of the html. We need this since we will be using this chunk
of html inside the main html document and hence do not want elements
such as ``head`` in this chunk.

::

    pygmentize -O "noclasses=True" -o test.py.html test.py

Including the html chunk into the documentation file
----------------------------------------------------

The reST directive for including raw html is::

    .. raw:: html
       :file: test.py.html

This directive will insert the contents of the file *test.py.html*
without processing it.

The following block of text is the code in the file *test.py*, with
syntax highlighting provided by *pygmentize* and inserted using the
above directive:

.. raw:: html
   :file: test.py.html

Running code and inserting output into the documentation file
-------------------------------------------------------------

*cog* can be used on its own as described at the `cog website`__. In
this case the code to be executed is provided within the *cog*
specification and the result is inserted right where the *cog*
specification is provided. Here we want to execute code in a separate
file and then insert the results into the documentation source. For
this we use *cog* in conjunction with *paver*.

__ cog_

To do this we specify the following *cog* specification in the main
documentation file, at the point where the result should be
inserted. The leading ".." is to ensure that `docutils`, used to
convert the documentation file into html, will treat it as comments
and hence ignore it. When *cog* is run on the file the space between
the line containing the three closing braces and the line containing
"end" between three opening and closing braces will be replaced with
the output from the program.

The *cog* specification is shown below.

.. include:: cog_spec.txt
   :literal:

.. note::

   To show the *cog* markup used, it is included in a separate file,
   *cog_spec.txt* and then the *reST* directive::

      .. include:: cog_spec.txt
         :literal:

   is used in the documentation source file to insert its contents
   into the final html file. This step is needed since *cog* is run on
   the *reST* file used to produce this document and *cog* will try to
   process this markup if it is directly included in the *reST* file.

Output obtained from running *cog* on *test.py*:

.. {{{cog
.. cog.out(run_script(cog.inFile, 'test.py', ignore_error=True))
.. }}}

::

        $ python test.py
        +01 34 16.2720
        -01 34 16.2720
        23.5678
        -23.5678

.. {{{end}}}

Plots can be saved to a file and then included in the documentation
file using the following directive::

     .. image:: filename.png

In the *test.py* file used in this example, the plot is saved in an
image from within the *test_plot* function. To include the image in
the html document we use the following *reST* directive::

   .. image:: test.png
      :scale: 50 %
      :alt: A Simple Plot
      :align: center

The plot produced by the function *test_plot* in *test.py* is
shown below:

.. image:: test.png
   :scale: 50 %
   :alt: A Simple Plot
   :align: center

Producing html version of the documentation
-------------------------------------------

We run the program **rst2html.py**, that comes with *docutils*, to
convert our documentation file into html.

::

   rst2html.py --no-xml-declaration index.rst index.rst.html

.. tip::

   If we want to syntax highlight any python code present inside our
*reST* source file then we can use the **rst2html-pygments.py**
   program. The code to be syntax highlighted must be included under
   the `.. code-block:: python` directive.

   For more about rst2html-pygments.py program refer to this url:
   http://docutils.sourceforge.net/sandbox/code-block-directive/data/

Automating work flow with paver
===============================

We provide *paver tasks* for each of the steps above. We also create a
task for running all three steps in one go. The *paver*
configuration file *pavement.py* can be edited to change the names of
the python code file and the documentation text. Here these two are
named *test.py* and *index.rst* respectively.

To generate the html chunk of syntax highlighted code we can issue the
command::

    paver py_html

To run the code and capture the output we run::

    paver cog

Finally, to generate the html version of the documentation we run::

    paver rst_html

The three steps can be executed with one command::

    paver build

Paver configuration file
------------------------
.. _wordpress.com: http://www.wordpress.com

.. _paver configuration file mentioned below:

The contents of the `pavement.py` file are listed below. The *blog task*
cleans the html so that it can be easily posted to `wordpress.com`_.

.. raw:: html
   :file: pavement.py.html

Summary
=======

We need to produce documentation in *html* format, for python code in
a single file, from a plain text documentation source written in
*reStructured Text* format.

We convert the python source into a chunk of html with the code syntax
highlighted using the program *pygmentize* that comes with the
*pygments* package. This chunk of html is referred to in the
documentation source using the *.. raw:: html* directive.

We then run *cog*, from within *paver*, on the documentation source in
*reStructured Text* format to execute the python code and insert the
text output from it into the documentation source file. The python
code also saves a plot into an image file, which we refer to in the
documentation source file using the *.. image::* directive.

We then run the *rst2html.py* program on the documentation source file to
produce the final html file.

Concluding remarks
==================

We can easily include more tasks in the `pavement.py` file and
incorporate more involved conversions. Here we aim for a simple
conversion. For more involved documentation, we can always use the
facilities provided by `Sphinx`.

*reStructured Text* source used in producing this html file
===========================================================

The *reST* file is shown below. Note that this file has the output of
the *cog* run, shown in the block of text::

     ::

        $ python test.py
        +01 34 16.2720
        -01 34 16.2720
        23.5678
        -23.5678

inside the *cog* specification, in the section titled "Running code
and inserting output into the documentation file". While writing the
*reST* text, we obviously do not write this block of text.

.. _shown at the end of this article:

.. include:: index.rst
   :literal:

2 Responses to Documenting python code with reST, docutils, pygments, cog and paver

  1. Günter Milde says:

    Since Release 0.9 (2012-05-02), Docutils supports syntax highlight of code with the
    reStructuredText “code” role and directive and the “code” option of the “include” directive.
    Syntax highlighting is done by Pygments.
    See http://docutils.sourceforge.net/docs/ref/rst/directives.html#code
