Documenting Your Python Project: Part 1

by Tarek Ziadé | May 2009 | Open Source

This two-part article series by Tarek Ziadé is about documentation: it gives tips on technical writing and on how Python projects should be documented. In this first part, you will learn the seven golden rules of technical writing and work through a reStructuredText primer. In the next part, you will learn how to build the documentation.

Documenting Your Project

Documentation is work that is often neglected by developers and sometimes by managers. This is often due to a lack of time towards the end of development cycles, and the fact that people think they are bad at writing. Some of them are bad, but the majority of them are able to produce fine documentation.

In any case, the result is disorganized documentation made of documents written in a rush. Developers hate doing this kind of work most of the time. Things get even worse when existing documents need to be updated. Many projects out there provide only poor, out-of-date documentation because the manager does not know how to deal with it.

But setting up a documentation process at the beginning of the project and treating documents as if they were modules of code makes documenting easier. Writing can even be fun when a few rules are followed.

This article provides a few tips to start documenting your project through:

  • The seven rules of technical writing that summarize the best practices
  • A reStructuredText primer, which is a plain text markup syntax used in most Python projects
  • A guide for building good project documentation

The Seven Rules of Technical Writing

Writing good documentation is easier in many respects than writing code. Most developers think it is very hard, but by following a simple set of rules it becomes really easy.

We are not talking here about writing a book of poems but a comprehensive piece of text that can be used to understand a design, an API, or anything that makes up the code base.

Every developer is able to produce such material, and this section provides seven rules that can be applied in all cases.

  • Write in two steps: Focus on ideas, and then on reviewing and shaping your text.
  • Target the readership: Who is going to read it?
  • Use a simple style: Keep it straight and simple. Use good grammar.
  • Limit the scope of the information: Introduce one concept at a time.
  • Use realistic code examples: Foos and bars should be dropped.
  • Use a light but sufficient approach: You are not writing a book!
  • Use templates: Help readers build habits.

These rules are mostly inspired by and adapted from Agile Documenting, a book by Andreas Rüping that focuses on producing the best documentation in software projects.

Write in Two Steps

Peter Elbow, in Writing with Power, explains that it is almost impossible for any human being to produce a perfect text in one shot. The problem is that many developers write documentation and try to come up with a perfect text directly. The only way they succeed in this exercise is by stopping after every two sentences to read them back and make corrections. This means that they are focusing on both the content and the style of the text at the same time.

This is too hard for the brain and the result is often not as good as it could be. A lot of time and energy is spent in polishing the style and shape of the text, before its meaning is completely thought through.

Another approach is to drop the style and organization of the text and focus on its content. All ideas are laid down on paper, no matter how they are written. The developer writes in a continuous stream and does not pause for grammatical mistakes, or for anything that is not about the content. For instance, it does not matter if the sentences are barely understandable, as long as the ideas are written down. The developer just writes down what needs to be said, with a rough organization.

By doing this, the developer focuses on what needs to be said, and will probably get more content down than initially expected.

Another side effect of free writing is that ideas not directly related to the topic will easily come to mind. A good practice is to write them down on a second sheet of paper or screen when they appear, so they are not lost, and then get back to the main writing.

The second step consists of reading back the whole text and polishing it so that it is comprehensible to everyone. Polishing a text means enhancing its style, correcting its faults, reorganizing it a bit, and removing any redundant information it has.

When the time dedicated to writing documentation is limited, a good practice is to cut this time into two equal parts: one for writing the content, and one for cleaning and organizing the text.

Focus on the content, and then on style and cleanliness.

Target the Readership

When starting a text, there is a simple question the writer should consider: Who is going to read it?

This is not always obvious, as a technical text explains how a piece of software works, and is often written for every person who might get and use the code. The reader can be a manager who is looking for an appropriate technical solution to a problem, or a developer who needs to implement a feature with it. A designer might also read it to know if the package fits his or her needs from an architectural point of view.

Let's apply a simple rule: Each text should have only one kind of reader.

This philosophy makes the writing easier. The writer knows precisely what kind of reader he or she is dealing with, and can provide concise and precise documentation that is not vaguely intended for all kinds of readers.

A good practice is to provide a small introductory text that explains in one sentence what the documentation is about, and guides the reader to the appropriate part:

Atomisator is a product that fetches RSS feeds and saves them in a
database, with a filtering process.
If you are a developer, you might want to look at the API description
(api.txt)
If you are a manager, you can read the features list and the FAQ
(features.txt)
If you are a designer, you can read the architecture and
infrastructure notes (arch.txt)

By taking care of directing your readers in this way, you will probably produce better documentation.

Know your readership before you start to write.

Use a Simple Style

Seth Godin is one of the best-selling writers on marketing topics. You might want to read Unleashing the Ideavirus, which is available for free on the Internet (http://en.wikipedia.org/wiki/Unleashing_the_Ideavirus).

Lately, he made an analysis on his blog to try to understand why his books sold so well. He made a list of all best sellers in the marketing area and compared the average number of words per sentence in each one of them.

He realized that his books had the lowest number of words per sentence (thirteen words). This simple fact, Seth explained, proved that readers prefer short and simple sentences, rather than long and stylish ones.

By keeping sentences short and simple, your writing will take less brain power to extract, process, and understand. Technical documentation aims to provide readers with a guide to the software. It is not a fiction story, and should be closer to your microwave's user manual than to the latest Stephen King novel.

A few tips to keep in mind are:

  • Use simple sentences; they should not be longer than two lines.
  • Each paragraph should be composed of three or four sentences, at the most, that express one main idea. Let your text breathe.
  • Don't repeat yourself too much: Avoid journalistic styles where ideas are repeated again and again to make sure they are understood.
  • Don't use several tenses. Present tense is enough most of the time.
  • Do not make jokes in the text if you are not a really fine writer. Being funny in a technical book is really hard, and few writers master it. If you really want to distill some humor, keep it in code examples and you will be fine.
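
The sentence-length tip can be checked mechanically. The following is a rough sketch, assuming that sentences end with ., ! or ? followed by whitespace (abbreviations will confuse it); the function name and the sample text are illustrative and not part of the original article.

```python
import re


def average_sentence_length(text):
    """Return the mean number of words per sentence in ``text``.

    Sentences are split naively after '.', '!' or '?' followed by
    whitespace, so the result is only an approximation.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return 0.0
    word_counts = [len(sentence.split()) for sentence in sentences]
    return sum(word_counts) / len(word_counts)


sample = (
    "Atomisator fetches RSS feeds. "
    "It filters the entries and saves them in a database."
)
print(average_sentence_length(sample))
```

A document whose average is far above the thirteen words mentioned earlier is probably worth a second polishing pass.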

You are not writing fiction, so keep the style as simple as possible.

Expert Python Programming Best practices for designing, coding, and distributing your Python software
Published: September 2008

Limit the Scope of the Information

There's a simple sign of bad documentation in software: You are looking for some information that you know is present somewhere, but you cannot find it. After spending some time reading the table of contents, you start to grep the files, trying several word combinations, but still cannot find what you are looking for.

This happens when writers are not organizing their texts in topics. They might provide tons of information, but it is just gathered in a monolithic or non-logical way. For instance, if a reader is looking for a big picture of your application, he or she should not have to read the API documentation: that is a low-level matter.

To avoid this effect, paragraphs should be gathered under a meaningful title for a given section, and the global document title should synthesize the content in a short phrase.

A table of contents can then be built from all the section titles.

A simple practice to compose your titles is to ask yourself: What phrase would I type in Google to find this section?

Use Realistic Code Examples

Foo and bar are bad citizens. When a reader tries to understand how a piece of code works with a usage example, having an unrealistic example will make it harder to understand.

Why not use a real-world example? A common practice is to make sure that each code example can be copied and pasted into a real program.

An example of bad usage is:

We have a parse function:

>>> from atomisator.parser import parse

Let's use it:

>>> stuff = parse('some-feed.xml')
>>> next(stuff)
{'title': 'foo', 'content': 'blabla'}

A better example shows the parser returning the content of a real feed through the parse function, available as a top-level function:

>>> from atomisator.parser import parse

Let's use it:

>>> my_feed = parse('http://tarekziade.wordpress.com/feed')
>>> next(my_feed)
{'title': 'eight tips to start with python',
'content': 'The first tip is..., ...'}

This slight difference might sound like overkill, but in fact it makes your documentation a lot more useful. A reader can copy those lines into a shell, and understand that parse takes a URL as a parameter and returns an iterator over blog entries.

Code examples should be directly reusable in real programs.
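
One way to keep examples reusable is to execute them. The sketch below uses the standard doctest module to run the >>> examples found in a piece of documentation text. The word_count function and the usage.txt name are hypothetical stand-ins, chosen so the example runs without the Atomisator package or network access.

```python
import doctest


def word_count(title):
    """Return the number of words in a feed entry title."""
    return len(title.split())


# Documentation text as a reader would see it (a hypothetical usage.txt).
USAGE_DOC = """
We have a word_count function. Let's use it:

>>> word_count('eight tips to start with python')
6
"""

# Extract the ``>>>`` examples from the text and execute them.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    USAGE_DOC, {"word_count": word_count}, "usage.txt", "usage.txt", 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, "failed out of", results.attempted)
```

If an example in the text drifts away from what the code really returns, the runner reports it as a failure, which is a cheap signal that the documentation needs an update.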

Use a Light but Sufficient Approach

In most agile methodologies, documentation is not a first-class citizen. Making software that works matters more than detailed documentation. So a good practice, as Scott Ambler explains in his book Agile Modeling: Effective Practices for Extreme Programming and the Unified Process, is to define the real documentation needs, rather than creating an exhaustive set of documents.

For instance, a single document that explains how Atomisator works is sufficient for administrators. They have no other need than to know how to configure and run the tool. This document limits its scope to answering one question: How do I run Atomisator on my server?

Besides readership and scope, limiting the size of each section written for the software to a few pages is a good idea. By making each section four pages long at the most, the writer will have to synthesize his or her thought. If it needs more, it probably means that the software is too complex to explain or use.

Use Templates

Every page on Wikipedia is similar. There are boxes on the right side that are used to summarize dates or facts. At the beginning of the document is a table of contents with links that refer to anchors in the same text. There is always a reference section at the end.

Users get used to it. For instance, they know they can have a quick look at the table of contents, and if they do not find the info they are looking for, they will go directly to the reference section to see if they can find another website on the topic. This works for any page on Wikipedia. You learn the Wikipedia way to be more efficient.

So using templates forces a common pattern for documents, and therefore makes people more efficient in using them. They get used to the structure and know how to read it quickly.

Providing a template for each kind of document also provides a quick start for writers.
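
As a minimal sketch of what such a template could look like, the following uses string.Template from the standard library to generate a document skeleton. The section names and the make_skeleton helper are assumptions made for illustration; they are not the skeletons the article builds with Paster.

```python
from string import Template

# Hypothetical skeleton for a "how to run it" document; the section
# names are illustrative only.
USAGE_TEMPLATE = Template("""\
$rule
$title
$rule

Audience
========
$audience

Question answered
=================
$question
""")


def make_skeleton(title, audience, question):
    """Fill the template, sizing the title rule to the title length."""
    return USAGE_TEMPLATE.substitute(
        title=title,
        rule="=" * len(title),
        audience=audience,
        question=question,
    )


skeleton = make_skeleton(
    "Running Atomisator",
    "Administrators",
    "How do I run Atomisator on my server?",
)
print(skeleton)
```

Every document produced this way starts with the same sections in the same order, which is what lets readers build habits.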

In this article, we will see the various kinds of documents a piece of software can have, and use Paster to provide skeletons for them. But the first thing to do is to describe the markup syntax that should be used in Python documentation.

A reStructuredText Primer

reStructuredText is also called reST (http://docutils.sourceforge.net/rst.html). It is a plain text markup language widely used in the Python community to document packages. The great thing about reST is that the text is still readable since the markup syntax does not obfuscate the text like LaTeX would.

Here's a sample of such a document:

=====
Title
=====

Section 1
=========

This *word* has emphasis.

Section 2
=========

Subsection
::::::::::

Text.

reST comes in docutils, a package that provides a suite of scripts to transform a reST file to various formats, such as HTML, LaTeX, XML, or even S5, Eric Meyer's slide show system (http://meyerweb.com/eric/tools/s5).

Writers can focus on the content and then decide how to render it, depending on the needs. For instance, Python itself is documented in reST, which is then rendered in HTML to build http://docs.python.org, and in various other formats.

The minimum elements one should know to start writing reST are:

  • Section structure
  • Lists
  • Inline markup
  • Literal block
  • Links

This section is a really fast overview of the syntax. A quick reference is available for more information at: http://docutils.sourceforge.net/docs/user/rst/quickref.html, which is a good place to start working with reST.

To install reStructuredText, install docutils:

$ pip install docutils

You will get a set of scripts whose names start with rst2, which render reST in various formats.

For instance, the rst2html script produces HTML output from a reST file:

$ more text.txt
Title
=====
content.
$ rst2html.py text.txt > text.html
$ more text.html
<?xml version="1.0" encoding="utf-8" ?>
...
<html ...>
<head>
...
</head>
<body>
<div class="document" id="title">
<h1 class="title">Title</h1>
<p>content.</p>
</div>
</body>
</html>
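
The same conversion can also be driven from Python code. The following is a minimal sketch, assuming the third-party docutils package is installed; render_html_body and SAMPLE are illustrative names that do not appear in the article.

```python
def render_html_body(source):
    """Render a reST string and return the HTML body as text.

    Requires the third-party docutils package.
    """
    # Imported here so that the dependency is only needed when rendering.
    from docutils.core import publish_parts

    # writer_name="html" selects the HTML writer; recent docutils
    # releases also accept the same value through a ``writer`` argument.
    parts = publish_parts(source=source, writer_name="html")
    return parts["html_body"]


SAMPLE = "Title\n=====\n\ncontent.\n"
```

Calling print(render_html_body(SAMPLE)) should display markup that contains the title and the paragraph, much like the body of the text.html file above.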

Section Structure

The document's title and its sections are underlined using non-alphanumeric characters. They can be overlined and underlined, and a common practice is to use this double markup for the title, and keep a simple underline for sections.

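The characters most commonly used to underline a section title are =, -, _, :, #, +, and ^. reST does not give them a fixed precedence: the first style encountered in the document becomes the top level, the second one the next level, and so on.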

When a character is used for a section, it is associated with its level and it has to be used consistently throughout the document.

For example:

=====
Title
=====

Section 1
=========

xxx

Subsection A
------------

xxx

Subsection B
------------

xxx

Section 2
=========

xxx

Subsection C
------------

xxx

[Illustration: HTML rendering of the sample document]

The HTML output of this file will look like the illustration shown above.
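
In reST, the line of characters under (and optionally over) a title must be at least as long as the title itself, which is easy to get wrong by hand. The helper below is a small sketch that builds a correctly sized heading; the heading function is an illustrative name, not part of docutils.

```python
def heading(title, char="=", overline=False):
    """Return ``title`` adorned as a reST section heading.

    The adornment is made exactly as long as the title, which satisfies
    the rule that it must be at least that long.
    """
    rule = char * len(title)
    lines = [title, rule]
    if overline:
        # Document titles commonly get the same rule above the text.
        lines.insert(0, rule)
    return "\n".join(lines)


print(heading("Title", overline=True))
print()
print(heading("Subsection A", char="-"))
```

Using one character per level, as in heading(name, char="-") for every subsection, keeps the section hierarchy consistent across the document.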

Lists

reST provides bullet, enumerated, and definition lists, with an auto-enumeration feature:

Bullet list:

- one
- two
- three

Enumerated list:

1. one
2. two
#. auto-enumerated

Definition list:

one
    one is a number.

two
    two is also a number.
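
Definition lists depend on indentation: each definition has to be indented under its term. As a small sketch, a definition list can be generated from a Python mapping; definition_list is an illustrative helper, not a docutils function.

```python
def definition_list(terms, indent=4):
    """Render a mapping of term -> definition as a reST definition list."""
    pad = " " * indent
    entries = []
    for term, definition in terms.items():
        # The term sits on its own line; the definition is indented below.
        entries.append(f"{term}\n{pad}{definition}")
    # A blank line between entries keeps the source readable.
    return "\n\n".join(entries)


numbers = {"one": "one is a number.", "two": "two is also a number."}
print(definition_list(numbers))
```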

Inline Markup

Text can be styled using an inline markup:

  • *emphasis*: Italics
  • **strong emphasis**: Boldface
  • ``inline literal``: Inline preformatted text
  • `a text with a link`_: This will be replaced by a hyperlink as long as its target is provided in the document (see the Links section).

Literal Block

When you need to present some code examples, a literal block can be used. Two colons are used to mark the block, which is an indented paragraph:

This is a code example

::

    >>> 1 + 1
    2

Let's continue our text

Don't forget to add a blank line after :: and after the block, otherwise it will not be rendered.

Notice that the colon characters can be put in a text line. In that case, they will be replaced by a single colon in the various rendering formats:

This is a code example::

    >>> 1 + 1
    2

Let's continue our text

If you don't want to keep a single colon, you can insert a space between example and ::. In that case, :: will be interpreted and totally removed.
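
The blank lines and the indentation are what make a literal block work, so they are worth generating rather than typing. The following sketch wraps a snippet in a literal block using the trailing :: form; literal_block is an illustrative name.

```python
import textwrap


def literal_block(intro, code, indent=4):
    """Return ``intro`` followed by ``code`` as a reST literal block."""
    body = textwrap.indent(code.strip("\n"), " " * indent)
    # "::" at the end of the intro line is rendered as a single colon.
    # The blank line after it and the indentation delimit the block.
    return f"{intro}::\n\n{body}\n"


block = literal_block("This is a code example", ">>> 1 + 1\n2")
print(block)
print("Let's continue our text")
```

Because the returned text ends with a newline, printing it leaves the blank line that reST expects before the next paragraph.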

Links

A text can be changed into an external link with a special line starting with two dots, as long as it is provided in the document:

Try `Plone CMS`_, it is great! It is based on Zope_.

.. _`Plone CMS`: http://plone.org
.. _Zope: http://zope.org

A usual practice is to group the external links at the end of the document. When the text to be linked contains spaces, it has to be surrounded with ` characters.
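
Grouping the targets at the end is easier when they are collected while the text is written. The class below is a sketch of that idea. LinkCollector is a hypothetical helper, and it applies a deliberately conservative rule: any link text that is not purely alphanumeric gets surrounded with ` characters.

```python
class LinkCollector:
    """Build reST references and gather their targets for later output."""

    def __init__(self):
        self._targets = {}

    @staticmethod
    def _name(text):
        # Backquotes are always safe; they are skipped only for plain
        # alphanumeric words such as Zope.
        return text if text.isalnum() else f"`{text}`"

    def ref(self, text, url):
        """Record ``url`` for ``text`` and return the inline reference."""
        self._targets[text] = url
        return f"{self._name(text)}_"

    def targets(self):
        """Return the target lines to append at the end of the document."""
        return "\n".join(
            f".. _{self._name(text)}: {url}"
            for text, url in self._targets.items()
        )


links = LinkCollector()
sentence = "Try {}, it is great! It is based on {}.".format(
    links.ref("Plone CMS", "http://plone.org"),
    links.ref("Zope", "http://zope.org"),
)
print(sentence)
print()
print(links.targets())
```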

Internal links can also be used by adding a marker in the text:

This is a code example

.. _example:

::

    >>> 1 + 1
    2

Let's continue our text, or maybe go back to
the example_.

Sections are also targets that can be used:

=====
Title
=====

Section 1
=========

xxx

Subsection A
------------

xxx

Subsection B
------------

-> go back to `Subsection A`_

Section 2
=========

xxx

Summary

In this article, we learned about the seven rules of technical writing and went through a reStructuredText primer. In the next part, we will look at how to build the documentation.


About the Author

Tarek Ziadé

Tarek Ziadé is CTO at Ingeniweb in Paris, working on Python, Zope, and Plone technology and on Quality Assurance. He has been involved for 5 years in the Zope community and has contributed to the Zope code itself.

Tarek has also created Afpy, the French Python User Group, and has written two books in French about Python. He has given numerous talks and tutorials at French and international events like Solutions Linux, Pycon, OSCON, and EuroPython.
