You're reading from Matplotlib for Python Developers - Second Edition

Product type: Book
Published in: Apr 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781788625173
Edition: 2nd Edition
Authors (3):
Aldrin Yim

Aldrin Yim is a PhD candidate and Markey Scholar in the Computation and System Biology program at Washington University, School of Medicine. His research focuses on applying big data analytics and machine learning approaches in studying neurological diseases and cancer. He is also the founding CEO of Codex Genetics Limited, which provides precision medicine solutions to patients and hospitals in Asia.

Claire Chung

Claire Chung is pursuing her PhD degree as a Bioinformatician at the Chinese University of Hong Kong. She enjoys using Python daily for work and life hacks. While passionate about science, her challenge-loving character motivates her to go beyond data analytics. She has participated in web development projects, as well as developed skills in graphic design and multilingual translation. She led the Campus Network Support Team in college, and shared her experience in data visualization at PyCon HK 2017.

Allen Yu

Allen Yu, PhD, is a Chevening Scholar, 2017-18, and an MSc student in computer science at the University of Oxford. He holds a PhD degree in Biochemistry from the Chinese University of Hong Kong, and he has used Python and Matplotlib extensively during his 10 years of bioinformatics experience.

What is Matplotlib?


Matplotlib is a Python package for data visualization. It allows easy creation of various plots, including line, scatter, bar, box, and radial plots, with high flexibility for refined styling and customized annotation. The versatile artist module allows developers to define virtually any kind of visualization. For regular usage, Matplotlib offers a simple, MATLAB-style interface, the pyplot module, for easy plotting.

Besides generating static graphics, Matplotlib also supports interactive interfaces, which not only help in creating a wide variety of plots but are also useful for building web-based applications.

Matplotlib integrates readily into popular development environments, such as Jupyter Notebook, and it serves as the foundation for many more advanced data visualization packages.

Merits of Matplotlib

Creating data visualizations with code has many advantages, chiefly that the visualization becomes a streamlined part of the result generation pipeline. Let's look at some of the key advantages of the Matplotlib library.

Easy to use

The Matplotlib plotting library is easy to use in several ways:

  • Firstly, its modular, object-oriented structure simplifies the plotting process. More often than not, we only need to call import matplotlib.pyplot as plt to import the plotting API, and can then create and customize many basic plots.
  • Matplotlib is highly integrated with two common data analytics packages, pandas and NumPy. For example, we can simply call .plot() on a pandas DataFrame, as in df.plot(), to create a simple plot, and then customize its styling with Matplotlib syntax.
  • For styling, Matplotlib offers functions to alter the appearance of each feature, and ready-made default style sheets are also available to avoid these extra steps when refined aesthetics are not required.
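The three points above can be sketched together in a few lines. This is only a minimal illustration: the DataFrame contents, the column names, and the choice of the "ggplot" style sheet are assumptions made for the example, not something prescribed by the book.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data: two random-walk series indexed by day number.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "sensor_a": rng.normal(0, 1, 30).cumsum(),
        "sensor_b": rng.normal(0, 1, 30).cumsum(),
    },
    index=pd.RangeIndex(1, 31, name="day"),
)

# Apply a ready-made style sheet only inside this block.
with plt.style.context("ggplot"):
    ax = df.plot()                    # pandas delegates the drawing to Matplotlib
    ax.set_title("Cumulative drift")  # so ordinary Matplotlib calls still apply
    ax.set_ylabel("value")
```

In an interactive session, calling plt.show() afterwards would display the figure; with the non-interactive backend used here, ax.figure.savefig(...) would write it to a file instead.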

Diverse plot types

Often in data analytics, we need sophisticated plots to express our data. Matplotlib offers numerous plotting APIs natively, and is also the basis for a collection of third-party packages that provide additional functionality, including:

  • Seaborn: Provides simple plotting APIs, including some advanced plot types, with aesthetically appealing default styling
  • HoloViews: Creates interactive plots with metadata annotation from bundled data 
  • Basemap/GeoPandas/Cartopy: Maps data values to colors on geographical maps

We will learn some of the applications of these third-party packages in later chapters on advanced plotting.

Hackable to the core (only when you want)

When we want to go beyond the default settings to ensure that the resultant figure meets our specific purpose, we can customize the appearance and behaviors of each plot feature:

  • Per-element styling is possible
  • The ability to plot data values as colors and draw any shape of patches allows the creation of almost any kind of visualization
  • The same techniques are useful in customizing plots created by extensions such as Seaborn
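As a rough sketch of the first two points, the snippet below draws a row of circle patches, mapping a data value to each face color and giving every patch its own line width. The data values, the "viridis" colormap, and the layout are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Circle

values = np.array([0.1, 0.4, 0.7, 1.0])  # hypothetical data values in [0, 1]
cmap = plt.get_cmap("viridis")           # maps a value in [0, 1] to an RGBA color

fig, ax = plt.subplots()
for i, v in enumerate(values):
    # Each patch is styled individually: face color from the data value,
    # a fixed edge color, and a line width that grows with the index.
    patch = Circle((i, 0), radius=0.4, facecolor=cmap(v),
                   edgecolor="black", linewidth=1 + i)
    ax.add_patch(patch)

# Patches do not trigger autoscaling here, so set the view limits explicitly.
ax.set_xlim(-1, len(values))
ax.set_ylim(-1, 1)
ax.set_aspect("equal")
```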

Open source and community support

As Matplotlib is open source, developers and data analysts can use it for free. Users are also free to improve and contribute to the Matplotlib library. As part of the open source experience, users can get prompt online support from members of the global community on various platforms and forums.

What's new in Matplotlib 2.x?

Matplotlib has supported Python 3 since version 1.2, released in 2012. The Matplotlib 2.0 release introduced a number of changes and upgrades to improve data visualization projects. Let us look at some of the key improvements and upgrades.

Improved functionality and performance

Matplotlib 2.0 presents new features that improve the user experience, with gains in speed, output quality, and resource usage.

Improved color conversion API and RGBA support

The alpha channel that specifies the transparency level is fully supported in Matplotlib 2.0.
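A minimal sketch of the color conversion API, using the to_rgba function from matplotlib.colors. The particular colors are arbitrary examples.

```python
from matplotlib import colors

# A named color plus an explicit alpha value gives an RGBA tuple of floats.
rgba_named = colors.to_rgba("red", alpha=0.5)  # (1.0, 0.0, 0.0, 0.5)

# An 8-digit hex string carries its own alpha channel in the last two digits.
rgba_hex = colors.to_rgba("#0000ff80")

print(rgba_named)
print(rgba_hex)
```

The resulting tuples can be passed wherever Matplotlib accepts a color, for example as the color argument of a plotting call.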

Improved image support

Matplotlib 2.0 now resamples images using less memory and fewer data type conversions.

Faster text rendering

According to the community developers, text rendering by the Agg backend is now about 20% faster.

Change in the default animation codec

The highly efficient H.264 codec now replaces MPEG-4 as the default for generating video output from animated plots. Thanks to its higher compression rate and smaller output files, H.264 allows longer recordings with less data traffic and shorter loading times. Real-time playback of H.264 videos is also reported to be better than that of videos encoded in MPEG-4.
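The codec default lives in Matplotlib's runtime configuration under the "animation.codec" key. The sketch below only inspects and temporarily overrides that setting. It does not encode a video, since that would require an external encoder such as ffmpeg to be installed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# The codec that animation writers pick up when none is given explicitly.
default_codec = plt.rcParams["animation.codec"]
print("default animation codec:", default_codec)

# Temporarily fall back to the older codec, for example for a legacy player.
with plt.rc_context({"animation.codec": "mpeg4"}):
    overridden_codec = plt.rcParams["animation.codec"]

print("inside rc_context:", overridden_codec)
print("restored afterwards:", plt.rcParams["animation.codec"])
```

When saving an animation, a codec can also be chosen per call through the codec argument of the animation's save() method.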

Changes in default styles

There are a number of style changes for improved visualization, such as more intuitive colors by default. We will discuss these in more detail in the chapter on figure aesthetics.
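One way to see the change in defaults is to compare the built-in default settings with the bundled "classic" style sheet, which reproduces the pre-2.0 look. The sketch below compares only the default colormap; other settings can be inspected in the same way.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Built-in default colormap in Matplotlib 2.x and later.
default_cmap = matplotlib.rcParamsDefault["image.cmap"]  # 'viridis'

# The "classic" style sheet restores the pre-2.0 defaults while it is active.
with plt.style.context("classic"):
    classic_cmap = plt.rcParams["image.cmap"]            # 'jet'

print(default_cmap, classic_cmap)
```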

For details on all Matplotlib updates, you may visit http://matplotlib.org/devdocs/users/whats_new.html.

Matplotlib website and online documentation

As developers, you probably recognize the importance of reading documentation and manuals to get acquainted with syntax and functionality. We would like to reiterate the importance of reading the library documentation and encourage you to do the same. You can find the documentation here: https://matplotlib.org. On the official Matplotlib website, you will find the documentation for each function, news of the latest releases and ongoing development, and a list of third-party packages, as well as tutorials and galleries of example plots.

However, learning to build advanced and sophisticated plots from scratch by reading the documentation alone means a much steeper learning curve and a lot more time spent, especially as the documentation is regularly updated. This book aims to provide the reader with a guided roadmap to accelerate the learning process, save time and effort, and put theory into practice. The online manuals can serve as the atlases you turn to whenever you want to explore further.

The Matplotlib source code is available on GitHub at https://github.com/matplotlib/matplotlib. We encourage our readers to fork it and add their ideas!

Output formats and backends 

Matplotlib lets users export plots as static figures. Plots can also be piped to interactive backends, which make them responsive.
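As a minimal sketch of backend selection, the snippet below picks the non-interactive Agg backend explicitly and renders a figure to an in-memory PNG. The choice of Agg and the plotted data are assumptions for the example. An interactive backend (for instance "TkAgg", if the corresponding GUI toolkit is installed) would be selected in the same way.

```python
import io

import matplotlib
matplotlib.use("Agg")  # choose the backend before creating any figures
import matplotlib.pyplot as plt

backend = matplotlib.get_backend()
print("active backend:", backend)

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# A non-interactive backend renders straight to a file or file-like object.
buf = io.BytesIO()
fig.savefig(buf, format="png")
png_bytes = buf.getvalue()
```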

Static output formats

Static images are the most commonly used output format for reporting and presentation purposes, and for our own quick inspection of data. Static images can be classified into two distinct categories.

Raster images

Raster is the classic image type, covering a wide variety of file formats, including PNG, JPG, and BMP. Each raster image can be seen as a dense array of color values. For raster images, resolution matters.

The amount of image detail kept depends on the resolution, measured in dots per inch (DPI). The higher the DPI value (that is, the more pixels kept per inch), the clearer the resulting image is, even when stretched to a larger size. Of course, the file size and the computational resources needed for rendering increase accordingly.
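The relationship between figure size, DPI, and pixel dimensions can be checked with a short experiment. The sketch below saves the same 4 x 3 inch figure at two DPI values and reads back the pixel dimensions. The figure size and DPI values are arbitrary choices for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 3))  # figure size in inches
ax.plot([0, 1, 2], [0, 1, 4])

sizes = {}
for dpi in (50, 200):
    buf = io.BytesIO()
    fig.savefig(buf, format="png", dpi=dpi)
    n_bytes = buf.getbuffer().nbytes
    buf.seek(0)
    # imread returns an array of shape (height, width, channels).
    height, width = plt.imread(buf, format="png").shape[:2]
    sizes[dpi] = (width, height, n_bytes)

for dpi, (width, height, n_bytes) in sizes.items():
    print(f"dpi={dpi}: {width} x {height} pixels, {n_bytes} bytes")
```

With default save settings, the pixel dimensions are simply the size in inches multiplied by the DPI, so the 200 DPI image has four times as many pixels along each side as the 50 DPI one.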

Vector images

For vector images, instead of a matrix of discrete color dots, information is saved as paths, which are lines joining points. Vector images therefore scale without losing any detail. Matplotlib can output the following vector formats:

  • SVG
  • PDF
  • PS
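Saving to any of these formats uses the same savefig call as for raster images, with only the format changed. The sketch below writes one figure to all three formats in memory. The plotted data is a placeholder.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

outputs = {}
for fmt in ("svg", "pdf", "ps"):
    buf = io.BytesIO()
    fig.savefig(buf, format=fmt)  # the format argument selects the output type
    outputs[fmt] = buf.getvalue()

for fmt, data in outputs.items():
    print(fmt, len(data), "bytes")
```

When saving to a file path instead of a buffer, the format can also be inferred from the file extension, as in fig.savefig("plot.svg").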