October 2009

# Plotting dates

Sooner or later, we all have had the need to plot some information over time, be it for the bank account balance each month, the total web site accesses for each day of the year, or one of many other reasons.

Matplotlib has a plotting function ad hoc for dates, plot_date() that considers data on X, Y, or both axes, as dates, labeling the axis accordingly.

As usual, we now present an example, and we will discuss it later:

In [1]: import matplotlib as mpl
In [2]: import matplotlib.pyplot as plt
In [3]: import numpy as np
In [4]: import datetime as dt
In [5]: dates = [dt.datetime.today() + dt.timedelta(days=i)
...: for i in range(10)]
In [6]: values = np.random.rand(len(dates))
In [7]: plt.plot_date(mpl.dates.date2num(dates), values, linestyle='-
');
In [8]: plt.show()

First, a note about linestyle keyword argument: without it, there's no line connecting the markers that are displayed alone.

We  created  the dates array using timedelta(), a datetime function that helps us define a date interval—10 days in this case. Note how we had to convert our date values using the date2num() function. This is because Matplotlib represents dates as float values corresponding to the number of days since 0001-01-01 UTC.

Also note how the X-axis labels, the ones that have data values, are badly rendered.

Matplotlib provides ways to address the previous two points—date formatting and conversion, and axes formatting.

## Date formatting

Commonly, in Python programs, dates are represented as datetime objects, so we have to first convert other data values into datetime objects, sometimes by using the dateutil companion module, for example:

import datetime
date = datetime.datetime(2009, 03, 28, 11, 34, 59, 12345)

or

import dateutil.parser
datestrings = ['2008-07-18 14:36:53.494013','2008-07-20
14:37:01.508990', '2008-07-28 14:49:26.183256']
dates = [dateutil.parser.parse(s) for s in datestrings]

Once we have the datetime objects, in order to let Matplotlib use them, we have to convert them into floating point numbers that represent the number of days since 0001-01-01 00:00:00 UTC.

To do that, Matplotlib itself provides several helper functions contained in the matplotlib.dates module:

• date2num():  This function converts one or a sequence of datetime objects to float values representing days since 0001-01-01 00:00:00 UTC (the fractional parts represent hours, minutes, and seconds)
• num2date():  This function converts one or a sequence of float values representing days since 0001-01-01 00:00:00 UTC to datetime objects (or a sequence, if the input is a sequence)
• drange(dstart, dend, delta): This function returns a date range (a sequence) of float values in Matplotlib date format; dstart and dend are datetime objects while delta is a datetime.timedelta instance

Usually, what we will end up doing is converting a sequence of datetime objects into a Matplotlib representation, such as:

dates = list of datetime objects
mpl_dates = matplotlib.dates.date2num(dates)

drange() can be useful in situations like this one:

import matplotlib as mpl
from matplotlib import dates
import datetime as dt
date1 = dt.datetime(2008, 9, 23)
date2 = dt.datetime(2009, 4, 12)
delta = dt.timedelta(days=10)
dates = mpl.dates.drange(date1, date2, delta)

where dates will be a sequence of floats starting from date1 and ending at date2 with a delta timestamp between each item of the list.

## Axes formatting with axes tick locators and formatters

As we have already seen, the X labels on the first image are not that nice looking. We would expect Matplotlib to allow a better way to label the axis, and indeed, there is.

The solution is to change the two parts that form the axis   ticks—locators and formatters. Locators control the tick's position, while formatters control the formatting of labels. Both have a major and minor mode: the major locator and formatter are active by default and are the ones we commonly see, while minor mode can be turned on by passing a relative locator or formatter function (because minors are turned off by default by assigning NullLocator and NullFormatter to them).

While this is a general tuning operation and can be applied to all Matplotlib plots, there are some specific locators and formatters for date plotting, provided by matplotlib.dates:

• MinuteLocator, HourLocator,DayLocator, WeekdayLocator,MonthLocator, YearLocator are all the  locators available that place a tick at the time specified by the name, for example, DayLocator will draw a tick at each day. Of course, a minimum knowledge of the date interval that we are about to draw is needed to select the best locator.
• DateFormatter is the tick formatter that uses strftime() to format strings.

The default locator and formatter are matplotlib.ticker.AutoDateLocator and matplotlib.ticker.AutoDateFormatter, respectively. Both are set by the plot_date() function when called. So, if you wish to set a different locator and/or formatter, then we suggest to do that after the plot_date() call in order to avoid the plot_date() function resetting them to the default values.

Let's group all this up in an example:

In [1]: import matplotlib as mpl
In [2]: import matplotlib.pyplot as plt
In [3]: import numpy as np
In [4]: import datetime as dt
In [5]: fig = plt.figure()
In [6]: ax2 = fig.add_subplot(212)
In [7]: date2_1 = dt.datetime(2008, 9, 23)
In [8]: date2_2 = dt.datetime(2008, 10, 3)
In [9]: delta2 = dt.timedelta(days=1)
In [10]: dates2 = mpl.dates.drange(date2_1, date2_2, delta2)
In [11]: y2 = np.random.rand(len(dates2))
In [12]: ax2.plot_date(dates2, y2, linestyle='-');
In [13]: dateFmt = mpl.dates.DateFormatter('%Y-%m-%d')
In [14]: ax2.xaxis.set_major_formatter(dateFmt)
In [15]: daysLoc = mpl.dates.DayLocator()
In [16]: hoursLoc = mpl.dates.HourLocator(interval=6)
In [17]: ax2.xaxis.set_major_locator(daysLoc)
In [18]: ax2.xaxis.set_minor_locator(hoursLoc)
In [19]: fig.autofmt_xdate(bottom=0.18) # adjust for date labels
display
In [21]: ax1 = fig.add_subplot(211)
In [22]: date1_1 = dt.datetime(2008, 9, 23)
In [23]: date1_2 = dt.datetime(2009, 2, 16)
In [24]: delta1 = dt.timedelta(days=10)
In [25]: dates1 = mpl.dates.drange(date1_1, date1_2, delta1)
In [26]: y1 = np.random.rand(len(dates1))
In [27]: ax1.plot_date(dates1, y1, linestyle='-');
In [28]: monthsLoc = mpl.dates.MonthLocator()
In [29]: weeksLoc = mpl.dates.WeekdayLocator()
In [30]: ax1.xaxis.set_major_locator(monthsLoc)
In [31]: ax1.xaxis.set_minor_locator(weeksLoc)
In [32]: monthsFmt = mpl.dates.DateFormatter('%b')
In [33]: ax1.xaxis.set_major_formatter(monthsFmt)
In [34]: plt.show()

The result of executing the previous code snippet is as shown:

We drew the subplots in reverse order to avoid some minor overlapping problems.

fig.autofmt_xdate() is used to nicely format date tick labels. In particular, this function rotates the labels (by using rotation keyword argument, with a default value of 30°) and gives them  more room (by using bottom keyword argument, with a default value of 0.2).

We can achieve the same result, at least for the additional spacing, with:

fig = plt.figure()

This can also be done by creating the Axes instance directly with:

ax = fig.add_axes([left, bottom, width, height])

while specifying the explicit dimensions.

The subplots_adjust() function allows us to control the spacing around the subplots by using the following keyword arguments:

• bottom, top, left, right: Controls the spacing at the bottom, top, left, and right of the subplot(s)
•

• wspace, hspace: Controls the horizontal and vertical spacing between subplots

We can also control the spacing by using these parameters in the Matplotlib configuration file:

figure.subplot.<position> = <value>

## Custom formatters and locators

Even if it's not strictly related to date plotting, tick formatters allow for custom formatters too:

...
import matplotlib.ticker as ticker
...
def format_func(x, pos):
return <a transformation on x>
...
formatter = ticker.FuncFormatter(format_func)
ax.xaxis.set_major_formatter(formatter)
...

The  function format_func will be called for each label to draw, passing its value and position on the axis. With those two arguments, we can apply a transformation (for example, divide x by 10) and then return a value that will be used to actually draw the tick label.

Here's a general note on NullLocator: it can be used to remove axis ticks by simply issuing:

ax.xaxis.set_major_locator(matplotlib.ticker.NullLocator())

# Text properties, fonts, and LaTeX

Matplotlib has excellent text support, including mathematical expressions, TrueType font support for raster and vector outputs, newline separated text with arbitrary rotations, and Unicode.

We have total control over every text property (font size, font weight, text location, color, and so on) with sensible defaults set in the rc configuration file. Specifically for those interested in mathematical or scientific figures, Matplotlib implements a large number of TeX math symbols and commands to support mathematical expressions anywhere in the figure.

We already saw some text functions, but the following list contains all the functions which can be used to insert text with the pyplot interface, presented along with the corresponding API method and a description:

 Pyplot function API method Description text() mpl.axes.Axes.text() Adds text at an arbitrary location to the Axes xlabel() mpl.axes.Axes.set_xlabel() Adds an axis label to the X-axis ylabel() mpl.axes.Axes.set_ylabel() Adds an axis label to the Y-axis title() mpl.axes.Axes.set_title() Adds a title to the Axes figtext() mpl.figure.Figure.text() Adds text at an arbitrary location to the Figure suptitle() mpl.figure.Figure.suptitle() Adds a centered title to the Figure annotate() mpl.axes.Axes.annotate() Adds an annotation with an optional arrow to the Axes

All of these commands return a matplotlib.text.Text instance. We can customize the text properties by passing keyword arguments to the functions or by using matplotlib.artist.setp():

t = plt.xlabel('some text', fontsize=16, color='green')

We can do it as:

t = plt.xlabel('some text')
plt.setp(t, fontsize=16, color='green')

Handling objects allows for several new possibilities; such as setting the same property to all the objects in a specific group. Matplotlib has several convenience functions to return the objects of a plot. Let's take the example of the tick labels:

ax.get_xticklabels()

This line of code returns a sequence of object instances (the labels for the X-axis ticks) that we can tune:

for t in ax.get_xticklabels():
t.set_fontsize(5.)

or else, still using setp():

setp(ax.get_xticklabels(), fontsize=5.)

It can take a sequence of objects, and apply the same property to all of them.

To recap, all of the properties such as color, fontsize, position, rotation, and so on, can be set either:

• At function call using keyword arguments
• Using setp() referencing the Text instance
• Using the modification function

## Fonts

Where there is text, there are also fonts to draw it. Matplotlib allows for several font customizations.

The most complete documentation on this is currently available in the Matplotlib configuration file, /etc/matplotlibrc. We are now reporting that information here.

There are six font properties available for modification.

 Property name Values and description font.family It has five values: serif (example, Times) sans-serif (example, Helvetica) cursive (example, Zapf-Chancery) fantasy (example, Western) monospace (example, Courier) Each of these font families has a default list of font names in decreasing order of priority associated with them (next table). In addition to these generic font names, font.family may also be an explicit name of a font available on the system. font.style Three values: normal (or roman), italic, or oblique. The oblique style will be used for italic, if it is not present. font.variant Two values: normal or small-caps. For TrueType fonts, which are scalable fonts, small-caps is equivalent to using a font size of smaller, or about 83% of the current font size. font.weight Effectively has 13 values-normal, bold, bolder, lighter, 100, 200, 300, ..., 900. normal is the same as 400, and bold is 700. bolder and lighter are relative values with respect to the current weight. font.stretch 11 values-ultra-condensed, extra-condensed, condensed, semi-condensed, normal, semi-expanded, expanded, extra-expanded, ultra-expanded, wider, and narrower. This property is not currently implemented. It works if the font supports it, but only few do. font.size The default font size for text, given in points. 12pt is the standard value.

The list of font names, selected by font.family, in the priority search order is:

 Property name Font list font.serif Bitstream Vera Serif, New Century Schoolbook, Century Schoolbook L, Utopia, ITC Bookman, Bookman, Nimbus Roman No9 L, Times New Roman, Times, Palatino, Charter, serif font.sans-serif Bitstream Vera Sans, Lucida Grande, Verdana, Geneva, Lucid, Arial, Helvetica, Avant Garde, sans-serif font.cursive Apple Chancery, Textile, Zapf Chancery, Sand, cursive font.fantasy Comic Sans MS, Chicago, Charcoal, Impact, Western, fantasy font.monospace Bitstream Vera Sans Mono, Andale Mono, Nimbus Mono L, Courier New, Courier, Fixed, Terminal, monospace

The first valid and available (that is, installed) font in each family is the one that will be loaded. If the fonts are not specified, the Bitstream Vera Sans fonts are used by default.

As usual, we can set these values in the configuration file or in the code accessing the rcParams dictionary provided by Matplotlib.

# Using Latex formatting

If you have ever used LaTeX, you know how powerful it can be at rendering mathematical expressions. Given its root in the scientific field, Matplotlib allows us to embed TeX text in its plots. There are two ways available

• Mathtext
• Using an external TeX renderer

## Mathtext

Matplotlib includes an internal engine to render TeX expression, mathtext. The mathtext module provides TeX style mathematical expressions using FreeType 2 and the default font from TeX, Computer Modern.

As Matplotlib ships with everything it needs to make mathtext work, there is no requirement to install a TeX system (or any other external program) on the computer for it to be used.

The markup character used to signal the start and the end of a mathtext string is \$;encapsulating a string inside a pair of \$ characters will trigger the mathtext engine to render it as a TeX mathematical expression.

We should use raw strings (preceding the quotes with an r character) and surround the mathtext with dollar signs (\$), as in TeX. The use of raw strings is important so that backslashes (used for TeX symbols escaping) are not mangled by the Python interpreter.

Matplotlib accepts TeX equations in any text expressions, so regular text and mathtext can be interleaved within the same string.

An example of the kind of text we can generate is:

In [1]: import matplotlib.pyplot as plt
In [2]: fig = plt.figure()
In [3]: ax= fig.add_subplot(111)
In [4]: ax.set_xlim([1, 6]);
In [5]: ax.set_ylim([1, 9]);
In [6]: ax.text(2, 8, r"\$ mu alpha tau pi lambda omega tau
lambda iota beta \$");
In [7]: ax.text(2, 6, r"\$ lim_{x rightarrow 0} frac{1}{x} \$");
In [8]: ax.text(2, 4, r"\$ a leq b leq c Rightarrow a
leq c\$");
In [9]: ax.text(2, 2, r"\$ sum_{i=1}^{infty} x_i^2\$");
In [10]: ax.text(4, 8, r"\$ sin(0) = cos(frac{pi}{2})\$");
In [11]: ax.text(4, 6, r"\$ sqrt[3]{x} = sqrt{y}\$");
In [12]: ax.text(4, 4, r"\$ neg (a wedge b) Leftrightarrow neg a
vee neg b\$");
In [13]: ax.text(4, 2, r"\$ int_a^b f(x)dx\$");
In [14]: plt.show()

The preceding code snippet results in the following:

The escape sequence is almost the same as that of LaTeX; consult the Matplotlib and/or LaTeX online documentation to see the full list.

## External TeX renderer

Matplotlib also allows to manage all the text layout using an external LaTeX engine. This is limited to Agg, PS, and PDF backends and is commonly needed when we want to create graphs to be embedded into LaTeX documents, where rendering uniformity is really pleasant.

To activate an external TeX  rendering engine for text strings, we need to set this parameter in the configuration file:

text.usetex : True

or use the rcParams dictionary:

rcParams['text.usetex'] = True

This mode requires LaTeX, dvipng, and Ghostscript to be correctly installed and working. Also note that usually external TeX management is slower than Matplotlib's mathtext and that all the texts in the figure are drawn using the external renderer, not only the mathematical ones.

There are several optimizations and configurations that you will need to do when dealing with TeX, postscripts and so, we invite you to consult an official documentation for additional information.

When the previous example is executed and rendered using an external LaTeX engine, the result is:

Also, look at how the tick label's text is rendered in the same font as the text in the figure, as in this real world example:

In [1]: import matplotlib as mpl
In [2]: import matplotlib.pyplot as plt
In [3]: mpl.rcParams['text.usetex'] = True
In [4]: import numpy as np
In [5]: x = np.arange(0., 5., .01)
In [6]: y = [np.sin(2*np.pi*xx) * np.exp(-xx) for xx in x]
In [7]: plt.plot(x, y, label=r'\$sin(2pi x)exp(-x)\$');
In [8]: plt.plot(x, np.exp(-x), label=r'\$exp(-x)\$');
In [9]: plt.plot(x, -np.exp(-x), label=r'\$-exp(-x)\$');
In [10]: plt.title(r'\$sin(2pi x)exp(-x)\$ with the two asymptotes
\$pmexp(-x)\$');
In [11]: plt.legend();
In [12]: plt.show()

The preceding code snippet results in a sinusoidal line contained in two asymptotes:

# Contour plots and image plotting

We will now discuss the features Matplotlib provides to create contour plots and display images.

## Contour plots

Contour lines (also known as level lines or isolines) for a function of two variables are curves where the function has constant values. Mathematically speaking, it's a graph image that shows:

f(x, y) = L

with L constant. Contour lines often have specific names beginning with iso- (from Greek, meaning equal) according to the nature of the variables being mapped.

There are a lot of applications of contour lines in several fields such as meteorology (for temperature, pressure, rain precipitation, wind speed), geography, oceanography, cartography (elevation and depth), magnetism, engineering, social sciences, and so on.

The absolutely most common examples of contour lines are those seen in weather forecasts, where lines of isobars (where the atmospheric pressure is constant) are drawn over the terrain maps. In particular, those are contour maps because contour lines are drawn above a map in order to add specific information to it.

The density of the lines indicates the slope of the function. The gradient of the function is always perpendicular to the contour lines, and when the lines are close together, the length of the gradient is large and the variation is steep.

Here is a contour plot from a random number matrix:

In [1]: import matplotlib.pyplot as plt
In [2]: import numpy as np
In [3]: matr = np.random.rand(21, 31)
In [4]: cs = plt.contour(matr)
In [5]: plt.show()

where the contour lines are colored from blue to red in a scale from the lower to the higher values.

The contour() function draws contour lines, taking a 2D array as input (a list of list notations). In this case, it's a matrix of 21x31 random elements. The number of level lines to draw is chosen automatically, but we can also specify it as an additional parameter, N:

contour(matrix, N)

The previous line of code tells us to draw N automatically chosen level lines.

There is also a similar function that draws a filled contours plot, contourf()

In [6]: csf = plt.contourf(matr)
In [7]: plt.colorbar();
In [8]: plt.show()

contourf() fills the spaces between the contours lines with the same color progression used in the contour() plot: dark blue is used for low value areas, while red is used for high value areas, fading in between for the intermediate values.

Contour colors can be changed using a colormap, a set of colors used as a lookup table by Matplotlib when it needs to select more colors, specified using the cmap keyword argument.

We also added a colorbar() call to draw a color bar next to the plot to identify the ranges the colors are assigned to. In this case, there are a few bins where the values can fall because rand() NumPy function returns values between 0 and 1.

Labeling the level lines is important in order to provide information about what levels were chosen for display; clabel() does this by taking as input a contour instance, as returned by a previous contour() call:

In [1]: import matplotlib.pyplot as plt
In [2]: import numpy as np
In [3]: x = np.arange(-2, 2, 0.01)
In [4]: y = np.arange(-2, 2, 0.01)
In [5]: X, Y = np.meshgrid(x, y)
In [6]: ellipses = X*X/9 + Y*Y/4 - 1
In [7]: cs = plt.contour(ellipses)
In [8]: plt.clabel(cs);
In [9]: plt.show()

Here, we draw several ellipses and then call clabel() to display the selected levels. We used the NumPy meshgrid() function to get the coordinate matrices, X and Y, from the two coordinate vectors, x and y. The output of this code is shown in the following image:

# Image plotting

Matplotlib also has basic image plotting capabilities provided by the functions: imread() and imshow().

imread() reads an image from a file and converts it into a NumPy array; imshow() takes an array as input and displays it on the screen:

import matplotlib.pyplot as plt
plt.imshow(f)

Matplotlib can only read PNG files natively, but if the Python Imaging Library (usually known as PIL) is installed, then this library will be used to read the image and return an array (if possible).

Note that when working with images, the origin is in the upper-left corner. This can be changed using the origin keyword argument, origin='lower' (which is the only other acceptable value, in addition to the default 'upper'), which will set the origin on the lower-left corner. We can also set it as a configuration parameter, and the key name is image.origin.

Just note that once the image is an array, we can do all the transformations we like. imshow() can plot any 2D sets of data and not just the ones read from image files. For example, let's take the ellipses code we used for contour plot and see what imshow() draws.

In [1]: import matplotlib.pyplot as plt
In [2]: import numpy as np
In [3]: x = np.arange(-2, 2, 0.01)
In [4]: y = np.arange(-2, 2, 0.01)
In [5]: X, Y = np.meshgrid(x, y)
In [6]: ellipses = X*X/9 + Y*Y/4 - 1
In [7]: plt.imshow(ellipses);
In [8]: plt.colorbar();
In [9]: plt.show()

This example creates a full spectrum of colors starting from deep blue of the image center, slightly turning into green, yellow, and red near the image corners:

# Summary

We've come a long way even in this article, so let's recap the arguments we touched:

• Object-oriented interfaces and the relationship with pyplot and pylab
• How to draw subplots and multiple figures
• How to manipulate axes so that they can be shared between subplots or can be shared between two plots
• Logarithmic scaled axes
• How to plot dates, and tune tick formatters and locators
• Text properties, fonts, and LaTeX typewriting both with the internal engine mathtext and with an external renderer
• Contour plots and image plotting

With the information that we have gathered so far, we are ready to extract Matplotlib from a pure script or interactive usage inside the Python interpreter and learn how we can embed this library in a GUI Python application.

[ 1 | 2 ]

If you have read this article you may be interested to view :

You've been reading an excerpt of: