Mastering Matplotlib

Chapter 1. Getting Up to Speed

Over the past 12 years of its existence, matplotlib has made its way into the classrooms, labs, and hearts of the scientific computing world. With Python's rise in popularity for serious professional and academic work, matplotlib has taken a respected seat beside long-standing giants such as Mathematica by Wolfram Research and MathWorks' MATLAB products. As such, we feel that the time is ripe for an advanced text on matplotlib that guides its more sophisticated users into new territory by not only allowing them to become experts in their own right, but also providing a clear path that will help them apply their new knowledge in a number of environments.

As part of a master class series by Packt Publishing, this book focuses almost entirely on a select few of the most requested advanced topics in the world of matplotlib, which include everything from matplotlib internals to high-performance computing environments. To best support this, we want to make sure that our readers have a chance to prepare for the material of this book, so we will start off gently.

The topics covered in this chapter include the following:

A brief historical overview of matplotlib
What's new in matplotlib
What distinguishes a beginner, an intermediate, and an advanced matplotlib user
The software dependencies for many of the book's examples
An overview of Python 3
An overview of the coding style used in this book
References for installation-related instructions
A refresher on IPython Notebooks
A teaser of a complicated plot in matplotlib
Additional resources to obtain advanced beginner and intermediate matplotlib knowledge

Prerequisites for this book

This book assumes that you have previous experience with matplotlib and that it has been installed on your preferred development platform. If you need a refresher on the steps to accomplish that, the first chapter of Sandro Tosi's excellent book, Matplotlib for Python Developers, provides instructions to install matplotlib and its dependencies.

In addition to matplotlib, you will need a recent installation of IPython to run many of the examples and exercises provided. For help in getting started with IPython, there are many great resources available on the project's site. Cyrille Rossant's Learning IPython for Interactive Computing and Data Visualization, Packt Publishing, is a great resource as well.

In the course of this book, we will install, configure, and use additional open source libraries and frameworks. We will cover the setup of these as we get to them, but all the programs in this book will require you to have the following installed on your machine:

Git
GNU make
GNU Compiler Collection (gcc)

Your operating system's package manager should have a package that installs common developer tools. Install it as well, since it may provide most of the tools listed above automatically.

All the examples in this book will be implemented using a recent release of Python, version 3.4.2. Many of the examples will not work with the older versions of Python, so please note this carefully. In particular, the setup of virtual environments uses a feature that is new in Python 3.4.2, and some examples use the new type annotations. At the time of writing this book, the latest version of Ubuntu ships with Python 3.4.2.

Although matplotlib, NumPy, IPython, and the other libraries will be installed for you by the setup scripts provided in the code repositories for each chapter, for the sake of clarity we mention the versions used for some of these here:

matplotlib 1.4.3
NumPy 1.9.2
SciPy 0.15.1
IPython 3.1.0 (also known as Jupyter)

Python 3

On this note, it's probably good to discuss Python 3 briefly as there has been continued debate on the choice between the two most recent versions of the programming language (the other being the 2.7.x series). Python 3 represents a massive community-wide effort to adopt better coding practices as well as improvements in the maintenance of long-lived libraries, frameworks, and applications. The primary impetus and on-going strength of this effort, though, is a general overhaul of the mechanisms underlying Python itself. This will ultimately allow the Python programming language greater maintainability and longevity in the coming years, not to mention better support for the ongoing performance enhancements.

In case you are new to Python 3, the following table, which compares some of the major syntactical differences between Python 2 and Python 3, has been provided:

Syntactical Differences	Python 2	Python 3
Division with floats	`x = 15 / 3.0`	`x = 15 / 3`
Division with truncation	`x = 15 / 4`	`x = 15 // 4`
Longs	`y = long(x * 10)`	`y = int(x * 10)`
Not equal	`x <> y`	`x != y`
The unicode function	`u = unicode(s)`	`u = str(s)`
Raw unicode	`u = ur"\t\s"`	`u = r"\t\s"`
Printing	`print x, y, z`	`print(x, y, z)`
Raw user input	`y = raw_input(x)`	`y = input(x)`
User input	`y = input(x)`	`y = eval(input(x))`
Formatting	`"%d %s" % (n, s)`	`"{} {}".format(n,s)`
Representation	`` `x` ``	`repr(x)`
Function application	`apply(fn, args)`	`fn(*args)`
Filter	`itertools.ifilter`	`filter`
Map	`itertools.imap`	`map`
Zip	`itertools.izip`	`zip`
Range	`xrange`	`range`
Reduce	`reduce`	`functools.reduce`
Iteration	`iterator.next()`	`next(iterator)`
The execute code	`exec code`	`exec(code)`
The execute file	`execfile(file)`	`exec(fh.read())`
Exceptions	`try:` `...` `except val, err:` `...`	`try:` `...` `except val as err:` `...`
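As a quick check of the Python 3 column, the following sketch exercises several rows of the table in one place. The variable names are illustrative and do not come from the book's code repository.

```python
# Python 3 counterparts of several Python 2 idioms from the table above.
from functools import reduce  # reduce is no longer a builtin in Python 3

true_quotient = 15 / 4         # "/" is true division and yields a float
floor_quotient = 15 // 4       # "//" performs floor division
formatted = "{} {}".format(3, "plots")
representation = repr("x")     # replaces the Python 2 backtick syntax

# map and filter return iterators rather than lists in Python 3.
squares = map(lambda n: n * n, range(5))
first_square = next(squares)   # replaces iterator.next()
even_squares = list(filter(lambda n: n % 2 == 0, squares))  # consumes the rest

total = reduce(lambda a, b: a + b, range(5))

try:
    int("not a number")
except ValueError as err:      # "as" replaces the Python 2 comma form
    message = str(err)

print(true_quotient, floor_quotient, formatted, representation)
print(first_square, even_squares, total)
```

Because `squares` is an iterator, the call to `next` removes its first element before `filter` sees the remainder, which is a common surprise for code ported from Python 2 lists.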

Using IPython Notebooks with matplotlib

Python virtual environments are the recommended way of working with Python projects. They keep your system Python and default libraries safe from disruption. We will continue this tradition in this book, but you are welcome to transcend tradition and utilize the matplotlib library and the provided code in whatever way you see fit.

Using the native venv Python environment management package, each project may define its own versions of dependent libraries, including those of matplotlib and IPython. The sample code for this book does just that—listing the dependencies in one or more requirements.txt files.
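The book's repositories drive this setup through make. As a rough illustration of what a venv-based setup involves, here is a minimal sketch using only the standard library. The directory names are hypothetical, and the two pinned versions are simply the ones listed earlier in this chapter.

```python
import tempfile
import venv
from pathlib import Path

# Hypothetical project location; a real project would more likely use ./.venv
project_dir = Path(tempfile.mkdtemp(prefix="mmpl-demo-"))
env_dir = project_dir / ".venv"

# with_pip=False keeps this sketch quick and offline; a working project
# environment would normally be created with pip available.
venv.create(env_dir, with_pip=False)

# Record pinned dependencies next to the environment, one per line.
requirements = project_dir / "requirements.txt"
requirements.write_text("matplotlib==1.4.3\nnumpy==1.9.2\n")

# Every environment created by venv contains a pyvenv.cfg marker file.
config_file = env_dir / "pyvenv.cfg"
print(config_file.exists())
```

With pip installed in the environment, the pinned packages would then be installed by running that environment's pip with `-r requirements.txt`.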

With the addition of the nbagg IPython Notebook backend to matplotlib in version 1.4, users can now work with plots in a browser very much like they've been able to do in the GTK and Qt apps on the desktop. We will take full advantage of this new feature.

In the IPython examples of this book, most of the notebooks will start off with the following:

In [1]: import matplotlib
        matplotlib.use('nbagg')
In [2]: %matplotlib inline
In [3]: import matplotlib.pyplot as plt

This configures our notebooks to use matplotlib in the way that we need. The example in the following section starts off with just those commands.

Tip

Downloading the example code

Each chapter in Mastering matplotlib provides instructions on obtaining the example code and notebook from GitHub. A master list has been provided at https://github.com/masteringmatplotlib/notebooks. You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

A final note about IPython: the project has recently changed its name to Jupyter in an effort to embrace the language-agnostic growth that the project and community have experienced, as well as the architectural changes that will make adding new language backends much easier. The user experience will not change (except for the better), but you will notice a different name and logo when you open the chapter notebooks for this book.

Setting up the interactive backend

As mentioned above, our notebooks will all start with the following, as does this preview notebook:

In [1]: import matplotlib
        matplotlib.use('nbagg')
        %matplotlib inline
In [2]: import matplotlib.pyplot as plt
        import seaborn as sns
        import numpy as np
        from scipy import stats
        import pandas as pd

These commands do the following:

Set up the interactive backend for plotting
Allow us to display plots inline, as opposed to doing so in a pop-up window
Provide the standard alias to the matplotlib.pyplot subpackage and import other packages that we will need
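The same backend-selection call also works outside a notebook. The sketch below is an assumption-laden variant for a plain script rather than the book's notebook setup: it swaps in the non-interactive Agg backend, which renders to an in-memory buffer and needs no browser or display.

```python
import matplotlib

# Select the backend before pyplot creates any figures.
# "Agg" is used here instead of "nbagg" so the sketch runs without a notebook.
matplotlib.use("Agg")

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# get_backend() reports which backend is currently active.
active_backend = matplotlib.get_backend()
print(active_backend)

plt.close(fig)
```

Checking `matplotlib.get_backend()` is a convenient way to confirm which backend a notebook or script ended up with.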

Joint plots with Seaborn

Our first preview example will take a look at the Seaborn package, an open source third-party library for data visualization and attractive statistical graphs. Seaborn depends upon not only matplotlib, but also NumPy and SciPy (among others). These were already installed for you when you ran make (pulled from the requirements.txt file).

We'll cover Seaborn palettes in more detail later in the book, so the following command is just a sample. Let's use a predefined palette with a moderate color saturation level:

In [3]: sns.set_palette("BuPu_d", desat=0.6)
        sns.set_context("notebook", font_scale=2.0)

Next, we'll generate two sets of random data (with a random seed of our choosing), one for the x axis and the other for the y axis. We're then going to plot the overlap of these distributions in a hex plot. The commands are as follows:

In [4]: np.random.seed(42424242)
In [5]: x = stats.gamma(5).rvs(420)
        y = stats.gamma(13).rvs(420)
In [6]: with sns.axes_style("white"):
            sns.jointplot(x, y, kind="hex", size=16);

The generated graph is as follows:
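For readers who want to see the hex-binning idea without Seaborn, a roughly comparable view can be sketched with NumPy and matplotlib alone. This is not the book's code: it uses NumPy's generator API instead of `scipy.stats`, so the samples differ from those above even with the same seed, and it draws only the central hex panel without the marginal histograms.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Two gamma-distributed samples with shape parameters 5 and 13 (scale 1).
rng = np.random.default_rng(42424242)
x = rng.gamma(shape=5, size=420)
y = rng.gamma(shape=13, size=420)

fig, ax = plt.subplots(figsize=(8, 8))
# hexbin groups the points into hexagonal cells and colors each cell by count.
hexes = ax.hexbin(x, y, gridsize=20, cmap="BuPu")
fig.colorbar(hexes, ax=ax, label="points per hexagon")
ax.set_xlabel("x ~ gamma(5)")
ax.set_ylabel("y ~ gamma(13)")

# The per-hexagon counts are available from the returned collection.
counts = hexes.get_array()
print(int(counts.max()))

plt.close(fig)
```

A coarser `gridsize` gives larger hexagons and smoother-looking density, at the cost of detail.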

Scatter plot matrix graphs with Pandas

In the second preview, we will use Pandas to graph a matrix of scatter plots whose diagonal will be the statistical graphs representing the kernel density estimation. We're going to go easy on the details for now; this is just to whet your appetite for more!

Pandas is a statistical data analysis library for Python that provides high-performance data structures, allowing one to carry out an entire scientific computing workflow in Python (as opposed to having to switch to something like R or Fortran for parts of it).

Let's take the seven columns (inclusive) from the baseball.csv data file between Runs (r) and Stolen Bases (sb) for players between the years of 1871 and 2007 and look at them at the same time in one graph:

In [7]: baseball = pd.read_csv("../data/baseball.csv")
In [8]: plt.style.use('../styles/custom.mplstyle')
        data = pd.scatter_matrix(
             baseball.loc[:,'r':'sb'],
             figsize=(16,10))

The generated graph is as follows:

Command 8 will take a few seconds longer than our previous plot since it's crunching a lot of data.
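The column selection in command 8 relies on label-based slicing, which includes both endpoints. The sketch below illustrates this on a small synthetic frame. The data and most column names are made up rather than taken from baseball.csv, and it imports `scatter_matrix` from `pandas.plotting`, which is where recent pandas releases provide it.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this inside a notebook
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Synthetic stand-in for the baseball data; the values are random counts.
rng = np.random.default_rng(0)
frame = pd.DataFrame(
    rng.poisson(lam=20, size=(200, 5)),
    columns=["year", "r", "h", "sb", "cs"],
)

# Label slices with .loc include both endpoints: "r", "h", and "sb".
subset = frame.loc[:, "r":"sb"]

# One scatter plot per column pair; the diagonal shows each column alone.
# diagonal="kde" draws kernel density estimates instead and requires SciPy.
axes = scatter_matrix(subset, figsize=(8, 8), diagonal="hist")
print(list(subset.columns), axes.shape)
```

The returned array holds one Axes object per cell, so individual panels can be restyled after the call.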

For now, the plot may look like something only a sabermetrician could read, but by the end of this book, complex graph matrices will be only one of many advanced topics in matplotlib that will have you reaching for new heights.

One last teaser before we close out the chapter—you may have noticed that the plots for the baseball data took a while to generate. Imagine doing 1,000 of these. Or 1,000,000. Traditionally, that's a showstopper for matplotlib projects, but in the latter half of this book, we will cover material that will not only show you how to overcome that limit, but also offer you several options to make it happen.

It's going to be a wild ride.

What you will learn

Analyze the matplotlib code base and its internals

Rerender visualized data on the fly based on changes in the user interface

Take advantage of sophisticated third-party libraries to plot complex data relationships

Create custom styles for use in specialized publications, presentations, or online media

Generate consolidated master plots comprising many subplots for dashboard-like results

Deploy matplotlib in cloud environments

Utilize matplotlib in big data projects

Customer reviews

Justin Marley Nov 24, 2017

Very good book with a lot of material including code, graphs and hardware/cloud discussion to take the reader to the next level.

Amazon Verified review

Loris Aug 11, 2015

I bought this book with the goal of improving my basic knowledge of matplotlib: while the creation of basic plots is straightforward, I found it difficult to modify the default behaviors of the library, and I faced problems in plotting large amounts of data. I can safely say that this book helped me to solve both issues. As the author mentions in the book, prior knowledge of matplotlib will definitely make the reading more enjoyable.

The book begins by giving a historical overview of matplotlib and by introducing two popular projects, seaborn and pandas. The chapters that follow describe the matplotlib internal architecture and its API. Next, the author illustrates how events are handled in matplotlib and how to create interactive plots. The fifth chapter is dedicated to high-level plotting and shows how to create plots with third-party libraries, such as NetworkX, pandas, and seaborn, which wrap matplotlib functionality. The chapter also briefly introduces Bokeh, a library that offers a series of improvements over matplotlib and focuses its attention on the web browser. In this chapter the author did a very good job of showing how a good visualization of the data is crucial in data analysis. The next chapter covers the customization (and the configuration) of matplotlib. Here the author shows how to create complex layouts where different plots are combined in the same figure. In the eighth chapter the author explains how to plot huge amounts of data by illustrating different strategies that range from using tools such as NumPy's memmap function and PyTables, to decimating data (removal of a fraction of the data). The last chapter shows how it is possible to improve the performance of matplotlib by using a clustered environment.

Last but not least, the author provided a GitHub repository with the example code and notebooks of each chapter of the book.

Ro'eh Nävee May 12, 2017

'Mastering matplotlib' is a textbook for someone acquainted with using the Python library named 'matplotlib.' Also the IPython/Jupyter interpreter, since the 'nbagg' IPython Notebook backend to matplotlib will also be used. The 'nbagg' backend is for working with plots in a Web browser. Other Python libraries were SciPy, NumPy, Pandas and Seaborn.

Most Jupyter notebooks in the tutorials start with:

import matplotlib
matplotlib.use('nbagg')
%matplotlib inline
import matplotlib.pyplot as plt

One convenience to appreciate is how the author listed dependent libraries featured in the textbook, and even stored them in a text file included within the accompanying code, which can be downloaded for free from Packt Publishing. The generous author even states: "...you are welcome to...utilize the matplotlib library and the provided code in whatever way you see fit." Free practical, usable code. Once the free code has been downloaded, the github.com repo masteringmatplotlib/notebooks is optional.

G. A. Patino Aug 12, 2015

This is the definitive book to learn the most advanced aspects of matplotlib. Even though the book covers installation and gives the fundamentals about basic plots, it is best suited for intermediate users in both Python and matplotlib. In fact, the authors are very class-oriented, making a fair familiarity with object-oriented programming something of a prerequisite. However, if you have those prerequisites the book is a worthy investment as you will be able to take full advantage of matplotlib.

Even though the topics covered are advanced, from the matplotlib architecture to deploying it in Docker and implementing it in parallel computing, they are presented in a clear and concise way. Yet the breadth of applications covered is quite comprehensive, and the authors are able to articulate all the different chapters so that the learning feels like a natural progression instead of trying to cram very disparate subjects. The code is elegant and relatively short, facilitating its reading, and the authors' explanations for it are very easy to follow. In particular, the chapter on big data visualization definitely goes beyond what is presented in other books that also cover the same topic, and the implementation explanations are much better. The fact that the book is under 300 pages long is a huge plus. The only chapter I felt wasn't as easy to follow is the one on GUI deployment.

One aspect I really enjoyed about the book is the multiple explanations about the different approaches to creating figures with matplotlib. When you are learning Python and matplotlib you see some books that use pyplot, while others use pylab. Or some like the ax. syntax while others stick to the plt. one. This is the first book in which I see a presentation of all those possibilities, along with their advantages and disadvantages. By the same token, if you are confused as to when to use Seaborn vs yhat ggplot, what's the point of NetworkX, what is ModGrapher, etc., you will find all those explanations here, along with suggestions for their appropriate application.

NRK Aug 14, 2018

I was hoping for a systematic understanding of the package. Didn't get it.
