Mastering pandas for Finance

By Michael Heydt

About this book

This book will teach you to use Python and the Python Data Analysis Library (pandas) to solve real-world financial problems.

Starting with a focus on pandas data structures, you will learn to load and manipulate time-series financial data and then calculate common financial measures, leading into more advanced derivations using fixed and moving windows. You will then correlate time-series data with both index and social data to build simple trading algorithms. From there, you will learn about more complex trading algorithms and implement them using open source backtesting tools. Then, you will examine the calculation of the value of options and Value at Risk, which leads into the modeling of portfolios and the calculation of optimal portfolios based upon risk. All concepts are demonstrated through progressive examples using interactive Python and IPython Notebook.

By the end of the book, you will be familiar with applying pandas to many financial problems, giving you the knowledge needed to leverage pandas in the real world of finance.

Publication date:
May 2015


Chapter 1. Getting Started with pandas Using Wakari

In Mastering pandas for Finance, we will examine the use of pandas to manage financial data and perform various financial analyses. The focus is on financial processes that can be facilitated using the capabilities provided within pandas, along with an occasional quantitative financial technique. I assume that you have basic knowledge of Python programming and have used IPython and IPython Notebooks. Knowledge of pandas is preferred, but we will cover enough of pandas for any reader to understand the techniques being used. We will occasionally and briefly touch upon areas of quantitative finance, mostly for informational purposes, with implementations provided in the code of the text.

During this voyage of discovery, we will begin with an overview/review of the concepts and data structures in pandas that are important to financial analysis. We will then move into various concepts, techniques, tools, and examples of specific financial analysis problems as solved with Python, pandas, and several other Python libraries and tools, including Wakari, matplotlib, SciPy, Quandl, Zipline, and Mibian. The topics are varied in nature, ranging from the analysis of historical stock data and correlating search data with trends in stock prices to algorithmic trading and backtesting, options modeling and pricing, and portfolio and risk analysis.

In this first chapter, we will walk through creating an account and environment in Wakari and installing the code samples into that environment. I have chosen Wakari as the basis for a pandas-based financial environment because it is relatively painless to get up and running with all of the tools we will utilize. In addition, the samples provided in the code bundle of this book are in the IPython Notebook format, which is simple to use within Wakari.

The use of Wakari, however, does not prevent you from using your own Python environment. The examples in the text will run in any Python environment and were originally built using Anaconda and IPython Notebooks with all of the mentioned tools installed within the environment. In case you don't want to use Wakari, all the code examples in the text are presented in IPython format and will run in a properly configured IPython environment.

So, let's get started. In this chapter, we will cover the following topics:

  • What is Wakari?

  • Creating a Wakari account

  • Updating the default Wakari environment to run all our examples

  • Installing and running the code samples in Wakari


What is Wakari?

Wakari is a collaborative data analytics platform that allows you to explore data and create analytic scripts in collaboration with IPython Notebooks. It is an offering of Continuum Analytics, the creators of the Anaconda Python distribution, which is generally considered to be one of the best Python distributions. Wakari is offered either as a paid solution that you can run in your enterprise, or as a web- or cloud-based solution offered on a freemium basis. The following screenshot shows Wakari as an offering of Continuum Analytics:

The approach in this text will be to guide you in using the cloud-based Wakari solution. This environment provides an effective quick start to learning pandas and performing all the data analysis in this text, while requiring very little effort in managing a local Python installation.


Creating a Wakari cloud account

The cloud-based offering for Wakari is available online. For convenience, from this point on, I will refer to it simply as Wakari, but always know that I am referring to the cloud-based solution.

Wakari is a freemium service that allows you to run web-based Python distributions. Specifics on the free part of the freemium service can be found on the site, but all of the examples in this text can be run for free in the Wakari environment (at least at the time of writing this book). Wakari presents a very low barrier to learning all of the concepts in this text as well as many others.

The guidance in this chapter will take you through creating and setting up an online Python environment that can run all of the examples in this book. To start, open your browser and navigate to the Wakari site. This will display the following page:

Sign up for a new account, and upon successful registration for the service, you will be presented with the following web interface to manage IPython Notebooks:

IPython Notebooks are a default feature in Wakari for the purpose of developing Python applications. All the examples in this book were developed as IPython Notebooks, although the code can be run sequentially in IPython or even Python. An advantage of IPython Notebooks is the ability to intermix markdown with Python code within a semi-dynamic web page, which allows easy reuse of code, and perhaps more importantly, publishing of code on the Web.

As a matter of fact, all the code files for this book are also available on Wakari.

At the time of writing this book, the default Python environment provided by Wakari is Python 2.7.9, and more specifically, Anaconda 1.9.1 (all version numbers are at the time of writing, so when you read this, they may be newer). This is, in general, a good environment for what we want to accomplish in this book, although a few packages need updating and several others need to be installed. In Wakari, pandas is currently at 0.16.0, which is satisfactory for our needs.
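Because these version numbers drift over time, it can help to check what a given environment actually provides. The following is only a minimal sketch and is not part of the book's code bundle: the helper names `version_tuple`, `meets_baseline`, and `report` are hypothetical, and the sketch assumes a current Python 3 interpreter rather than the Python 2.7 environment described here. The baseline of 0.16.0 is the pandas version quoted above.

```python
import re
import sys


def version_tuple(version):
    """Turn a dotted version string such as '0.16.0' into a tuple of ints."""
    fields = []
    # Look at the first three dot-separated pieces and keep their leading
    # digits, so a suffix such as 'rc1' is ignored.
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        fields.append(int(match.group()))
    return tuple(fields)


def meets_baseline(installed, baseline):
    """Return True when the installed version is at least the baseline."""
    return version_tuple(installed) >= version_tuple(baseline)


def report():
    """Print the interpreter version and, if available, the pandas version."""
    print("Python:", sys.version.split()[0])
    try:
        import pandas as pd
    except ImportError:
        print("pandas is not installed in this environment")
        return
    # 0.16.0 is the pandas version the text reports for Wakari.
    if meets_baseline(pd.__version__, "0.16.0"):
        print("pandas:", pd.__version__, "(at or above 0.16.0)")
    else:
        print("pandas:", pd.__version__, "(older than 0.16.0)")


if __name__ == "__main__":
    report()
```

The comparison works on tuples of integers rather than raw strings, since a plain string comparison would rank "0.9.0" above "0.16.0".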

The specific packages that either need updating or installing are as follows:

  • matplotlib

  • Zipline

  • Quandl

  • html5lib

  • Mibian

  • tzlocal

We will go over each of these briefly and also see how to install/update each. In general, the update/install process will be performed using a shell within Wakari. One of the spectacular features of Wakari is the ability to run both interactive IPython sessions and operating system shells directly in the browser.

From a new environment within Wakari, you can open terminals using the Terminals tab. Click on the Terminals tab, and you will see the following screenshot, which represents a default IPython shell for your account (currently referred to as np18py27-1.9):

You can perform any Python programming within this web-based interface, including all of the examples in this book. However, the default Wakari environment needs a few updates and first-time installs to run all of the examples in the text.
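To see which of the packages listed earlier are still missing from an environment, a check along the following lines may be useful. This is a sketch under stated assumptions, not the book's own code: the function name `missing_modules` is hypothetical, the import names are assumed to match the package names in lowercase (an import name can differ from the install name, and the capitalization of the Quandl module may vary between releases), and a Python 3 interpreter is assumed.

```python
import importlib.util

# Import names are assumptions based on the package list in the text.
REQUIRED_MODULES = [
    "matplotlib",
    "zipline",
    "quandl",
    "html5lib",
    "mibian",
    "tzlocal",
]


def missing_modules(names):
    """Return the entries of `names` that cannot be located for import."""
    # find_spec returns None for a top-level module that is not installed;
    # it locates the module without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Still to install:", ", ".join(missing))
    else:
        print("All required modules were found")
```

Locating the modules instead of importing them keeps the check fast and avoids side effects from packages that do work at import time.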

We can perform updates to the environment by opening a shell. This can be performed by selecting Shell from the drop-down menu, along with np18py27-1.9, and pressing the +Tab button. After that, you will be presented with the following screenshot:

We are now in an OS shell that provides you with many options, including updating your Python environment, which we will now perform.

Updating existing packages

We need to update one package in the default Wakari environment: matplotlib. This is the graphics package we will use at various points in this book. For most purposes, the version in Wakari (1.3.1) is satisfactory, but the candlestick charts that we will create require an update to matplotlib from 1.3.1 to a higher version. This is performed with the conda package manager using the conda update matplotlib command. When issuing this, you will see something similar to the following in the terminal tab in your web browser:

Installing new packages

The remainder of the packages need to be installed. All these package installations follow the same process, although there are slightly different commands, which alternate between using pip and the conda package manager for installation.

For time zone operations, tzlocal is used and is installed using pip. The installation is performed as shown here:

The samples do not use html5lib directly, but other libraries do use it indirectly. We will use these libraries to read and parse data. We need to update this using conda, as shown here:

Quandl is a provider of data that you can integrate into your applications via download or the API. The Python API that we will use to access S&P 500 data is free and can be installed using conda, as shown here:

Zipline is a backtesting/trading simulator that we will use. It is produced by Quantopian, a website that focuses on algorithmic trading and uses Zipline as one of its underlying technologies. Although installed using conda, Zipline requires the use of a different channel. Notice the slight variation in the use of conda to specify the Quantopian channel in the following screenshot:

The final package we need to install is Mibian, a small library that computes Black-Scholes and its derivatives. This is installed using pip, as shown here:

We are now ready to run any of the sample Notebooks.
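The update and installation steps described in this section can be collected into one small script. The following is a sketch only: `build_command`, `run_steps`, and `STEPS` are hypothetical names, the package and channel spellings are assumptions taken from the names used in the text, and by default the script prints the commands without running them.

```python
import subprocess


def build_command(tool, package, action="install", channel=None):
    """Build the argument list for a pip or conda command."""
    command = [tool, action]
    if channel is not None:
        # conda's -c option selects a non-default channel.
        command += ["-c", channel]
    command.append(package)
    return command


# (tool, package, action, channel) in the order the text describes them.
# Package and channel spellings are assumptions, not verified names.
STEPS = [
    ("conda", "matplotlib", "update", None),
    ("pip", "tzlocal", "install", None),
    ("conda", "html5lib", "install", None),
    ("conda", "quandl", "install", None),
    ("conda", "zipline", "install", "Quantopian"),
    ("pip", "mibian", "install", None),
]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    commands = []
    for tool, package, action, channel in steps:
        command = build_command(tool, package, action, channel)
        print(" ".join(command))
        if not dry_run:
            # Raises CalledProcessError if the command exits with an error.
            subprocess.check_call(command)
        commands.append(command)
    return commands


if __name__ == "__main__":
    run_steps(STEPS)  # dry run: prints the commands without executing them
```

Keeping the dry run as the default makes it possible to review the exact commands before anything in the environment is changed; passing `dry_run=False` would execute them in order and stop at the first failure.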


Downloading the example code

You can download the example code files for all the Packt Publishing books you have purchased from your account on the Packt website. If you purchased this book elsewhere, you can register with Packt to have the files e-mailed directly to you.


Installing the samples in Wakari

To install the examples in Wakari, download the code bundle and unzip the files to a local directory. You will see a set of files as shown here:

To upload the files to Wakari, click on the upload files icon and drag the files into the Drag & Drop Here section of the web page:

Once dropped, click on the Upload Files button, and you will see the following files in your Wakari directory:

At this point, you should be able to open and run any of the Notebooks and even examine the data in the browser. As an example, the following screenshot demonstrates the Notebook for Chapter 2, Introducing the Series and DataFrame, opened in Wakari:



Summary

This chapter was a brief introduction to this book. You learned how to set up a Python environment in Wakari to be able to run the code samples provided throughout the text. This included instructions on how to update the default Python environment to support the packages that are required for all of the examples in the remainder of the text.

In the next chapter, we will dive into using pandas and its core data structures, Series and DataFrame. These will be core to representing data in later chapters, where we primarily use pandas DataFrame objects to represent financial data, which we apply to various financial analyses.

About the Author

  • Michael Heydt

    Michael Heydt is an independent consultant, programmer, educator, and trainer. He has a passion for learning and sharing his knowledge of new technologies. Michael has worked in multiple industry verticals, including media, finance, energy, and healthcare. Over the last decade, he worked extensively with web, cloud, and mobile technologies and managed user experiences, interface design, and data visualization for major consulting firms and their clients. Michael's current company, Seamless Thingies, focuses on IoT development and connecting everything with everything.

    Michael is the author of numerous articles, papers, and books, such as D3.js By Example, Instant Lucene.NET, Learning pandas, and Mastering pandas for Finance, all by Packt. Michael is also a frequent speaker at .NET user groups and various mobile, cloud, and IoT conferences and delivers webinars on advanced technologies.


Latest Reviews

(2 reviews total)
The book uses Wakari in its entirety. This book is completely useless to anyone now, as we cannot follow the training: Wakari no longer exists!
Definitely a welcome book in my arsenal of texts for Python in the finance domain. The book's examples do a good job of illustrating how to apply pandas--and the examples given are practical, which makes the book even more useful.
