Basic Doctest in Python

January 2010

Doctest will be the mainstay of your testing toolkit. You'll be using it for tests, of course, but also for things that you may not think of as tests right now. For example, program specifications and API documentation both benefit from being written as doctests and checked alongside your other tests.

Like program source code, doctest tests are written in plain text. Doctest extracts the tests and ignores the rest of the text, which means that the tests can be embedded in human-readable explanations or discussions. This is the feature that makes doctest so suitable for non-classical uses such as program specifications.

Time for action – creating and running your first doctest

We'll create a simple doctest to demonstrate the fundamentals of using doctest.

  1. Open a new text file in your editor, and name it test.txt.
  2. Insert the following text into the file:
      This is a simple doctest that checks some of Python's arithmetic
      >>> 2 + 2
      4
      >>> 3 * 3
      10
  3. We can now run the doctest. The details of how we do that depend on which version of Python we're using. At the command prompt, change to the directory where you saved test.txt.
  4. If you are using Python 2.6 or higher, type:
      $ python -m doctest test.txt
  5. If you are using Python 2.5 or lower, the above command may seem to work, but it won't produce the expected result. This is because Python 2.6 is the first version in which doctest looks for test file names on the command line when you invoke it this way.
  6. If you're using an older version of Python, you can run your doctest by typing:
      $ python -c "__import__('doctest').testfile('test.txt')"
  7. When the test is run, doctest prints a failure report to your terminal.

What just happened?

You wrote a doctest file that describes a couple of arithmetic operations, and executed it to check whether Python behaved as the tests said it should. You ran the tests by telling Python to execute doctest on the files that contained the tests.

In this case, Python's behavior differed from the tests because, according to the tests, three times three equals ten! Python disagrees. Since doctest expected one thing and Python did another, doctest presented you with a nice little error report showing where to find the failing test, and how the actual result differed from the expected one. At the bottom of the report is a summary showing how many tests failed in each file tested, which is helpful when you have more than one file containing tests.

Remember, doctest files are for computer and human consumption. Try to write the test code in a way that human readers can easily understand, and add in plenty of plain language commentary.

The syntax of doctests

You might have guessed from looking at the previous example: doctest recognizes tests by looking for sections of text that look like they've been copied and pasted from a Python interactive session. Anything that can be expressed in Python is valid within a doctest.

Lines that start with a >>> prompt are sent to a Python interpreter. Lines that start with a ... prompt are sent as continuations of the code from the previous line, allowing you to embed complex block statements into your doctests. Finally, any lines that don't start with >>> or ..., up to the next blank line or >>> prompt, represent the output expected from the statement. The output appears as it would in an interactive Python session: anything printed to the console, followed by the representation of the statement's value if it isn't None. If there are no output lines, doctest takes that to mean the statement is expected to have no visible result on the console.
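To see those three elements working together, here is a minimal sketch (the sample test text and the temporary file name are invented for illustration) that writes a tiny doctest to a file and checks it with doctest.testfile, the same machinery the command line invokes:

```python
import doctest
import os
import tempfile

# A hypothetical doctest showing all three syntax elements:
# a >>> prompt, a ... continuation line, and an expected-output line.
sample = """\
>>> total = 0
>>> for number in [1, 2, 3]:
...     total += number
>>> total
6
"""

# Write the text to a temporary file, then let doctest check it.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write(sample)
    path = handle.name

result = doctest.testfile(path, module_relative=False)
os.unlink(path)
print(result)  # TestResults(failed=0, attempted=3)
```

Note that the for loop and its continuation line count as a single example, so doctest reports three examples attempted, not four.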

Doctest ignores anything in the file that isn't part of a test, which means that you can place explanatory text, HTML, line-art diagrams, or whatever else strikes your fancy in between your tests. We took advantage of that in the previous doctest, to add an explanatory sentence before the test itself.

Time for action – writing a more complex test

We'll write another test (you can add it to test.txt if you like) which shows off most of the details of doctest syntax.

  1. Insert the following text into your doctest file (test.txt), separated from the existing tests by at least one blank line:
    Now we're going to take some more of doctest's syntax for a spin.
    >>> import sys
    >>> def test_write():
    ...     sys.stdout.write("Hello\n")
    ...     return True
    >>> test_write()
    Hello
    True

    Think about it for a moment: What does this do? Do you expect the test to pass, or to fail?

  2. Run doctest on the test file, just as we discussed before. Because we added the new tests to the same file containing the tests from before, we still see the notification that three times three does not equal ten. Now, though, we also see that five tests were run, which means our new tests ran and succeeded.

What just happened?

As far as doctest is concerned, we added three tests to the file.

  • The first one says that when we import sys, nothing visible should happen.
  • The second test says that when we define the test_write function, nothing visible should happen.
  • The third test says that when we call the test_write function, Hello and True should appear on the console, in that order, on separate lines.

Since all three of these tests pass, doctest doesn't bother to say much about them. All it did was increase the number of tests reported at the bottom from two to five.

Expecting exceptions

That's all well and good for testing that things work as expected, but it is just as important to make sure that things fail when they're supposed to fail. Put another way: sometimes your code is supposed to raise an exception, and you need to be able to write tests that check that behavior as well.

Fortunately, doctest follows nearly the same principle in dealing with exceptions as it does with everything else: it looks for text that looks like a Python interactive session. That means it looks for text that resembles a Python exception report and traceback, and matches it against any exception that gets raised.

Doctest does handle exceptions a little differently from other tools. It doesn't just match the text precisely and report a failure if it doesn't match. Exception tracebacks tend to contain many details that are not relevant to the test, but which can change unexpectedly. Doctest deals with this by ignoring the traceback entirely: it's only concerned with the first line—Traceback (most recent call last)—which tells it that you expect an exception, and the part after the traceback, which tells it which exception you expect. Doctest only reports a failure if one of these parts does not match.

That's helpful for a second reason as well: manually figuring out what the traceback would look like while you're writing your tests would require a significant amount of effort, and would gain you nothing. It's better to simply omit them.
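As an illustration of how little of the traceback doctest actually needs, here is a sketch (the sample test is invented) in which the traceback body is omitted entirely, leaving only the header line and the exception line:

```python
import doctest
import os
import tempfile

# A hypothetical test expecting an exception. No traceback body is
# given; doctest checks only the "Traceback (most recent call last):"
# header and the final exception line.
sample = """\
>>> raise ValueError('item is missing')
Traceback (most recent call last):
ValueError: item is missing
"""

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write(sample)
    path = handle.name

result = doctest.testfile(path, module_relative=False)
os.unlink(path)
print(result)  # TestResults(failed=0, attempted=1)
```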

Time for action – expecting an exception

This is yet another test that you can add to test.txt, this time testing some code that ought to raise an exception.

  1. Insert the following text into your doctest file:
      Here we use doctest's exception syntax to check that Python is
      correctly enforcing its grammar.
      >>> def faulty():
      ...     yield 5
      ...     return 7
      Traceback (most recent call last):
      SyntaxError: 'return' with argument inside generator (<doctest test.txt[5]>, line 3)
  2. The test is supposed to raise an exception, so it will fail if it doesn't raise the exception, or if it raises the wrong exception. Make sure you have your mind wrapped around that: if the test code executes successfully, the test fails, because it expected an exception.
  3. Run the tests using doctest and check the report it prints.

What just happened?

Since Python doesn't allow a function to contain both yield statements and return statements with values, having the test define such a function caused an exception. In this case, the exception was a SyntaxError with the expected value. As a result, doctest considered it a match with the expected output, and thus the test passed. When dealing with exceptions, it is often desirable to be able to use a wildcard matching mechanism. Doctest provides this facility through its ellipsis directive, which we'll discuss later.

Expecting blank lines in the output

Doctest uses the first blank line to identify the end of the expected output. So what do you do when the expected output actually contains a blank line?

Doctest handles this situation by matching a line that contains only the text <BLANKLINE> in the expected output, with a real blank line in the actual output.
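Here is a sketch of <BLANKLINE> in use (the sample test is invented for illustration): without the marker, doctest would treat the blank line in the output as the end of the expected text.

```python
import doctest
import os
import tempfile

# A hypothetical test whose expected output contains a blank line.
# The <BLANKLINE> marker stands in for it. Note the doubled
# backslashes: the test file itself must contain \n, not newlines.
sample = """\
>>> print("first line\\n\\nthird line")
first line
<BLANKLINE>
third line
"""

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write(sample)
    path = handle.name

result = doctest.testfile(path, module_relative=False)
os.unlink(path)
print(result)  # TestResults(failed=0, attempted=1)
```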

Using directives to control doctest

Sometimes, the default behavior of doctest makes writing a particular test inconvenient. That's where doctest directives come to our rescue. Directives are specially formatted comments that you place after the source code of a test, which tell doctest to alter its default behavior in some way.

A directive comment begins with # doctest:, after which comes a comma-separated list of options that either enable or disable various behaviors. To enable a behavior, write a + (plus symbol) followed by the behavior name. To disable a behavior, write a - (minus symbol) followed by the behavior name.

Ignoring part of the result

It's fairly common that only part of the output of a test is actually relevant to determining whether the test passes. By using the +ELLIPSIS directive, you can make doctest treat the text ... (called an ellipsis) in the expected output as a wildcard, which will match any text in the output.

When you use an ellipsis, doctest will scan ahead until it finds text matching whatever comes after the ellipsis in the expected output, and continue matching from there. This can lead to surprising results such as an ellipsis matching against a 0-length section of the actual output, or against multiple lines. For this reason, it needs to be used thoughtfully.

Time for action – using ellipsis in tests

We'll use the ellipsis in a few different tests, to get a better feel for what it does and how to use it.

  1. Insert the following text into your doctest file:
      Next up, we're exploring the ellipsis.
      >>> sys.modules # doctest: +ELLIPSIS
      {...'sys': <module 'sys' (built-in)>...}
      >>> 'This is an expression that evaluates to a string'
      ... # doctest: +ELLIPSIS
      'This is ... a string'
      >>> 'This is also a string' # doctest: +ELLIPSIS
      'This is ... a string'
      >>> import datetime
      >>> datetime.datetime.now().isoformat() # doctest: +ELLIPSIS
      '...-...-...T...:...:...'

  2. Run the tests using doctest and examine the results.
  3. None of these tests would pass without the ellipsis. Think about that, and then try making some changes and see if they produce the results you expect.

What just happened?

We just saw how to enable ellipsis matching. In addition, we saw a couple of variations on where the doctest directive comment can be placed, including on a block continuation line by itself.

We got a chance to play with the ellipsis a little bit, and hopefully saw why it should be used carefully. Look at that last test. Can you imagine any output that wasn't an ISO-formatted time stamp, but that it would match anyway?

Ignoring whitespace

Sometimes, whitespace (spaces, tabs, newlines, and their ilk) is more trouble than it's worth. Maybe you want to be able to break a single line of expected output across several lines in your test file, or maybe you're testing a system that uses lots of whitespace but doesn't convey any useful information with it.

Doctest gives you a way to "normalize" whitespace, turning any sequence of whitespace characters, in both the expected output and in the actual output, into a single space. It then checks whether these normalized versions match.

Time for action – normalizing whitespace

We'll write a couple of tests that demonstrate how whitespace normalization works.

  1. Insert the following text into your doctest file:
      Next, a demonstration of whitespace normalization.
      >>> [1, 2, 3, 4, 5, 6, 7, 8, 9]
      ... # doctest: +NORMALIZE_WHITESPACE
      [1, 2, 3,
      4, 5, 6,
      7, 8, 9]
      >>> sys.stdout.write("This text\n contains weird     spacing.")
      ... # doctest: +NORMALIZE_WHITESPACE
      This text contains weird spacing.

  2. Run the tests using doctest and examine the results.
  3. Notice how one of the tests inserts extra whitespace in the expected output, while the other one ignores extra whitespace in the actual output. When you use +NORMALIZE_WHITESPACE, you gain a lot of flexibility with regard to how things are formatted in the text file.

Skipping an example entirely

On some occasions, doctest would recognize some text as an example to be checked, when in truth you want it to be simply text. This situation is rarer than it might at first seem, because usually there's no harm in letting doctest check everything it can. In fact, it is usually helpful to have doctest check everything it can. For those times when you want to limit what doctest checks, though, there's the +SKIP directive.

Time for action – skipping tests

This is an example of how to skip a test:

  1. Insert the following text into your doctest file:
      Now we're telling doctest to skip a test
      >>> 'This test would fail.' # doctest: +SKIP
      If it were allowed to run.
  2. Run the tests using doctest and check the report it prints.
  3. Notice that the test didn't fail, and that the number of tests that were run did not change.

What just happened?

The skip directive transformed what would have been a test into plain text (as far as doctest is concerned). Doctest never ran the test, and in fact never counted it as a test at all.

There are several situations where skipping a test might be a good idea. Sometimes, you have a test which doesn't pass (which you know doesn't pass), but which simply isn't something that should be addressed at the moment. Using the skip directive lets you ignore the test for a while. Sometimes, you have a section of human readable text that looks like a test to the doctest parser, even though it's really only for human consumption. The skip directive can be used to mark that code as not for actual testing.

Other doctest directives

There are a number of other directives that can be issued to adjust the behavior of doctest. They are fully documented in the doctest section of the Python documentation, but here is a quick overview:

  • +DONT_ACCEPT_TRUE_FOR_1, which makes doctest treat True and 1 as different values, instead of treating them as matching as it normally does.
  • +DONT_ACCEPT_BLANKLINE, which makes doctest forget about the special meaning of <BLANKLINE>.
  • +IGNORE_EXCEPTION_DETAIL, which makes doctest treat exceptions as matches if the exception type is the same, regardless of whether the rest of the exception matches.
  • +REPORT_UDIFF, which makes doctest use unified diff format when it displays a failed test. This is useful if you are used to reading the unified diff format, which is by far the most common diff format within the open source community.
  • +REPORT_CDIFF, which makes doctest use context diff format when it displays a failed test. This is useful if you are used to reading the context diff format.
  • +REPORT_NDIFF, which makes doctest use ndiff format when it displays a failed test. This is useful if you are used to reading the ndiff format.
  • +REPORT_ONLY_FIRST_FAILURE, which makes doctest skip printing failure reports for any test that fails after a failure report has already been printed. The tests are still executed, and doctest still keeps track of whether they failed or not. Only the report is changed by this flag.
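As a quick sketch of one of these, here is +IGNORE_EXCEPTION_DETAIL in action (the sample test is invented, and its expected message is deliberately wrong):

```python
import doctest
import os
import tempfile

# A hypothetical test using +IGNORE_EXCEPTION_DETAIL: the expected
# message doesn't match the real one, but the exception type does,
# so the test still passes.
sample = """\
>>> raise KeyError('the real message') # doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
KeyError: 'a different message'
"""

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write(sample)
    path = handle.name

result = doctest.testfile(path, module_relative=False)
os.unlink(path)
print(result)  # TestResults(failed=0, attempted=1)
```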

Execution scope

When doctest is running the tests from text files, all the tests from the same file are run in the same execution scope. That means that if you import a module or bind a variable in one test, that module or variable is still available in later tests. We took advantage of this fact several times in the tests written so far in this article: the sys module was only imported once, for example, although it was used in several tests.

That behavior is not necessarily beneficial, because tests need to be isolated from each other. We don't want them to contaminate each other, because if a test depends on something that another test does, or if it fails because of something that another test does, those two tests are in some sense turned into one test that covers a larger section of your code. You don't want that to happen, because knowing which test has failed doesn't give you as much information about what went wrong and where it happened.

So, how can we give each test its own execution scope? There are a few ways to do it. One would be to simply place each test in its own file, along with whatever explanatory text that is needed. This works beautifully, but running the tests can be a pain unless you have a tool to find and run all of them. We'll talk about one such tool (called nose) later.

Another way to give each test its own execution scope, is to define each test within a function, as follows:

>>> def test1():
...     import frob
...     return frob.hash('qux')
>>> test1()

By doing that, the only thing that ends up in the shared scope is the test function (named test1 here). The frob module, and any other names bound inside the function, are isolated.

The third way is to exercise caution with the names you create, and be sure to set them to known values at the beginning of each test section. In many ways this is the easiest approach, but it's also the one that places the most burden on you, because you have to keep track of what's in the scope.
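Here is a sketch of that third approach (the sample tests are invented): each section re-binds the name it uses before relying on it, so neither section depends on the other.

```python
import doctest
import os
import tempfile

# Two hypothetical test sections sharing one file and one scope.
# Each re-binds `value` at the start, so removing or reordering
# a section can't break the other.
sample = """\
The first section binds its own value.
>>> value = 10
>>> value * 2
20

The second section re-binds value instead of relying on the test above.
>>> value = 3
>>> value * 2
6
"""

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write(sample)
    path = handle.name

result = doctest.testfile(path, module_relative=False)
os.unlink(path)
print(result)  # TestResults(failed=0, attempted=4)
```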

Why does doctest behave this way, instead of isolating tests from each other? Doctest files are intended not just for computers to read, but also for humans. They often form a sort of narrative, flowing from one thing to the next. It would break the narrative to be constantly repeating what came before. In other words, this approach is a compromise between being a document and being a test framework, a middle ground that works for both humans and computers.

The other framework that we study in depth (called unittest) works at a more formal level, and enforces the separation between tests.


Summary

We learned the syntax of doctest, and went through several examples describing how to use it.

Specifically, we covered doctest's default syntax, and the directives that alter it.


You've been reading an excerpt of:

Python Testing: Beginner's Guide
