In Mastering pandas for Finance, we will examine how to use pandas to manage financial data and perform financial analyses, focusing on financial processes that pandas facilitates, along with an occasional quantitative finance technique. I assume that you have basic knowledge of Python programming and have used IPython and IPython Notebooks. Prior knowledge of pandas is helpful, but we will cover enough of pandas for any reader to understand the techniques being used. We will occasionally and briefly touch upon areas of quantitative finance, mostly for informational purposes, with implementations provided in the code of the text.
During this voyage of discovery, we will begin with an overview/review of concepts and data structures in pandas that are of importance to financial analysis. We will then move into various concepts, techniques, tools, and examples of specific financial analysis problems as solved with Python, pandas, and several other Python libraries and tools, including Wakari, matplotlib, SciPy, Quandl, Zipline, and Mibian. The topics vary widely: analysis of historical stock data, correlating search data with trends in stock prices, algorithmic trading and backtesting, options modeling and pricing, and portfolio and risk analysis.
In this first chapter, we will walk through creating an account and environment in Wakari.io and installing the code samples into that environment. I have chosen Wakari.io as a basis for a pandas-based financial environment because it is relatively painless to get up and running with all of the tools we will utilize, and also the samples provided in the code bundle of this book are in the IPython Notebook format, which is simple to use within Wakari.io.
The use of Wakari, however, does not prevent you from using your own Python environment. The examples in the text were originally built using the Anaconda distribution and the IPython Notebook format, with all of the mentioned tools installed, and they will run in any properly configured Python environment. If you prefer not to use Wakari, all the code examples in the text are presented in IPython form and will run in a properly configured IPython environment.
So, let's get started. In this chapter, we will cover the following topics:
What is Wakari.io?
Creating a Wakari account
Updating the default Wakari environment to run all our examples
Installing and running the code samples in Wakari
Wakari (http://continuum.io/wakari) is a collaborative data analytics platform that allows you to explore data and create analytic scripts collaboratively using IPython Notebooks. It is an offering of Continuum Analytics, the creators of the Anaconda Python distribution, which is generally considered to be one of the best Python distributions. Wakari is offered either as a paid solution that you can run within your enterprise or as a cloud-based web service available on a freemium basis. The following screenshot shows Wakari as an offering of Continuum Analytics:
The approach in this text will be to guide you in using the cloud-based Wakari solution. This environment provides an effective quick start to learning pandas and performing all the data analysis in this text, with minimal effort spent on managing a local Python installation.
The cloud-based offering for Wakari is available at https://wakari.io. For convenience, from this point on, I will refer to Wakari.io as Wakari, but always know that I am referring to the cloud-based solution.
Wakari is a freemium service that allows you to run web-based Python distributions. Specifics on the free part of the freemium service can be found on the site, but all of the examples in this text can be run for free in the Wakari environment (at least at the time of writing this book). Wakari presents a very low barrier to learning all of the concepts in this text as well as many others.
The guidance in this chapter will take you through creating and setting up an online Python environment, which can run all of the examples in this book. To start, open your browser and enter https://wakari.io in the address bar. This will display the following page:
IPython Notebooks are a default feature in Wakari for the purpose of developing Python applications. All the examples in this book were developed as IPython Notebooks, although the code can be run sequentially in IPython or even Python. An advantage of IPython Notebooks is the ability to intermix markdown with Python code within a semi-dynamic web page, which allows easy reuse of code, and perhaps more importantly, publishing of code on the Web.
As a matter of fact, you can find all the code files for this book on Wakari at https://wakari.io/sharing/bundle/Pandas4Finance/MasteringPandas4Finance_Index.
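To make the idea of intermixing markdown with Python code more concrete, here is a minimal sketch of what a notebook file can look like on disk. It assumes version 4 of the notebook JSON format (the one used by current Jupyter releases); the notebooks in the book's bundle may use an older layout, and the cell contents below are invented purely for illustration.

```python
import json

# A notebook is a JSON document whose "cells" list mixes markdown and code.
# Layout assumed here: notebook format version 4 (an assumption, see above).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Daily returns\n", "Percentage change in price."],
        },
        {
            "cell_type": "code",
            "execution_count": None,  # not yet executed
            "metadata": {},
            "outputs": [],
            "source": [
                "prices = [100.0, 102.0, 101.0]\n",
                "returns = [b / a - 1 for a, b in zip(prices, prices[1:])]\n",
                "returns",
            ],
        },
    ],
}

# Serialize and read back, as a notebook server would when saving and loading.
text = json.dumps(notebook, indent=1)
cell_types = [cell["cell_type"] for cell in json.loads(text)["cells"]]
print(cell_types)
```

Because the whole document is plain JSON, it can be shared, versioned, and published on the Web like any other text file.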
At the time of writing this book, the default Python environment provided by Wakari is Python 2.7.9, and more specifically, Anaconda 1.9.1 (all version numbers are at the time of writing, so when you read this, they may be newer). This is, in general, a good environment for what we want to accomplish in this book, although a few packages need updating and several others need to be installed. In Wakari, pandas is currently at 0.16.0, which is satisfactory for our needs.
The specific packages that either need updating or installing, each covered in the remainder of this section, are as follows:
matplotlib
tzlocal
html5lib
Quandl
Zipline
We will go over each of these briefly and also see how to install/update each. In general, the update/install process will be performed using a shell within Wakari. One of the most useful features of Wakari is the ability to run both interactive IPython sessions and operating system shells directly in the browser.
From a new environment within Wakari, you can open terminals using the Terminals tab. Click on the Terminals tab, and you will see the following screenshot, which represents a default IPython shell for your account (currently referred to as
You can perform any Python programming within this web-based interface, including all of the examples in this book. However, the default Wakari environment needs a few updates and first-time installs to run all of the examples in the text.
We can perform updates to the environment by opening a shell. This can be performed by selecting Shell from the drop-down menu, along with np18py27-1.9, and pressing the +Tab button. After that, you will be presented with the following screenshot:
We need to update one package in the default Wakari environment: matplotlib. This is the graphics package we will use at various points in this book. For most purposes, the version in Wakari (1.3.1) is satisfactory, but the candlestick charts that we will create require an update to matplotlib from 1.3.1 to a higher version. This is performed with the conda package manager using the conda update matplotlib command. When issuing this, you will see something similar to the following in the terminal tab in your web browser:
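Whether the update is needed can also be checked from Python. The sketch below compares the installed matplotlib version against the 1.3.1 baseline mentioned in the text. The helper names `version_tuple` and `needs_update` are hypothetical, and the code assumes Python 3.8 or later for `importlib.metadata`, not the Python 2.7 environment described in this chapter.

```python
import re
from importlib import metadata  # assumption: Python 3.8+

BASELINE = (1, 3, 1)  # the Wakari default named in the text


def version_tuple(text):
    """Turn '1.4.3' or '1.5.0rc1' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def needs_update(installed, baseline=BASELINE):
    """True when the installed version is not newer than the baseline."""
    return version_tuple(installed) <= baseline


try:
    current = metadata.version("matplotlib")
except metadata.PackageNotFoundError:
    current = None

if current is None:
    print("matplotlib is not installed")
elif needs_update(current):
    print("matplotlib", current, "should be updated")
else:
    print("matplotlib", current, "is newer than 1.3.1")
```

The comparison uses only the numeric components, which is enough for a rough check but ignores pre-release suffixes.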
The remainder of the packages need to be installed. All these package installations follow the same process, although there are slightly different commands, which alternate between using pip and the conda package manager for installation.
For time zone operations, tzlocal is used and is installed using pip. The installation is performed as shown here:
The samples do not use html5lib directly, but other libraries do use it indirectly. We will use these libraries to read and parse data. We need to update this using conda, as shown here:
Quandl (https://www.quandl.com/) is a provider of data that you can integrate into your applications via download or an API. The Python API that we will use to access S&P 500 data is free and can be installed using conda, as shown here:
Zipline is a backtesting/trading simulator that we will use. It is produced by Quantopian (https://www.quantopian.com/), a website that focuses on algorithmic trading and uses Zipline as one of its underlying technologies. Although Zipline is installed using conda, it requires the use of a different channel. Notice the slight variation in the use of conda to specify the Quantopian channel in the following screenshot:
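As a rough summary of the steps above, the following sketch builds (but does not run) one command per package. The choice of tool for each package and the Quantopian channel come from the text. The exact subcommands, the use of conda's `-c` option for the channel, and the lowercase package name `quandl` are assumptions based on common conda and pip usage, so treat the output as a starting point and not as a record of the commands that were actually typed.

```python
# Each entry: tool and package as described in the text.
# "update" and "channel" are optional; "quandl" as a package name is an assumption.
INSTALL_PLAN = [
    {"tool": "conda", "package": "matplotlib", "update": True},
    {"tool": "pip", "package": "tzlocal"},
    {"tool": "conda", "package": "html5lib"},
    {"tool": "conda", "package": "quandl"},
    {"tool": "conda", "package": "zipline", "channel": "Quantopian"},
]


def build_command(tool, package, channel=None, update=False):
    """Return the command as a list of arguments; nothing is executed."""
    if tool == "pip":
        return ["pip", "install", package]
    command = ["conda", "update" if update else "install"]
    if channel:
        command += ["-c", channel]  # -c selects a non-default conda channel
    command.append(package)
    return command


commands = [build_command(**step) for step in INSTALL_PLAN]
for command in commands:
    print(" ".join(command))
```

Keeping the plan as data makes it easy to review each command in a shell before running it by hand.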
We are now ready to run any of the sample Notebooks.
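Before opening the Notebooks, it can be useful to confirm that the packages are visible to Python. The following sketch checks whether each one can be located without actually importing it. The import names are assumptions (the Quandl package, for instance, has been importable under different capitalizations over time), and `importlib.util.find_spec` assumes Python 3.4 or later, not the Python 2.7 environment described above.

```python
from importlib.util import find_spec  # assumption: Python 3.4+

# Label -> candidate import names (assumed; adjust to your environment).
CANDIDATE_IMPORTS = {
    "pandas": ["pandas"],
    "matplotlib": ["matplotlib"],
    "tzlocal": ["tzlocal"],
    "html5lib": ["html5lib"],
    "Quandl": ["Quandl", "quandl"],
    "Zipline": ["zipline"],
}


def is_importable(names):
    """True if any of the candidate top-level modules can be located."""
    return any(find_spec(name) is not None for name in names)


status = {label: is_importable(names) for label, names in CANDIDATE_IMPORTS.items()}
for label, found in status.items():
    print("{:<12}{}".format(label, "found" if found else "MISSING"))
```

A package reported as MISSING points back to the corresponding install step earlier in this section.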
Downloading the example code
You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
To upload the files to Wakari, click on the upload files icon and drag the files into the Drag & Drop Here section of the web page:
At this point, you should be able to open and run any of the Notebooks and even examine the data in the browser. As an example, the following screenshot demonstrates the Notebook for Chapter 2, Introducing the Series and DataFrame, opened in Wakari:
This chapter was a brief introduction to this book. You learned how to set up a Python environment in Wakari.io to be able to run the code samples provided throughout the text. This included instructions on how to update the default Wakari.io Python environment with the packages required for all of the examples in the remainder of the text.
In the next chapter, we will dive into using pandas and its core data structures, Series and DataFrame. These will be core to representing data in later chapters, where we primarily use pandas Series and DataFrame objects to represent financial data, which we apply to various financial analyses.