You're reading from Hands-On Data Analysis with Pandas - Second Edition

Product type: Book

Published in: Apr 2021

Reading level: Intermediate

Publisher: Packt

ISBN-13: 9781800563452

Edition: 2nd Edition

Languages: Python

Tools: Pandas

Concepts: Databases

Author: Stefanie Molin

Chapter 7: Financial Analysis – Bitcoin and the Stock Market

It's time to switch gears and work on an application. In this chapter, we will explore a financial application by performing an analysis of bitcoin and the stock market. This chapter builds upon everything we have learned so far: we will extract data from the Internet; perform some exploratory data analysis; create visualizations with pandas, seaborn, and matplotlib; calculate important metrics for analyzing the performance of financial instruments using pandas; and get a taste of building some models. Note that we are not trying to learn financial analysis here, but rather to walk through an introduction to how the skills we have learned in this book can be applied to financial analysis.

This chapter is also a departure from the standard workflow in this book. Up until this point, we have been working with Python as more of a functional programming language. However, Python also supports object-oriented...

Chapter materials

For this chapter, we will be creating our own package for stock analysis. This makes it easy for us to distribute our code and for others to use it. The final product of this package is on GitHub at https://github.com/stefmolin/stock-analysis/tree/2nd_edition. Python's package manager, pip, is capable of installing packages from GitHub and also building them locally; this leaves us with two options for how to proceed:

Install from GitHub if we don't plan on editing the source code for our own use.
Fork and clone the repository and then install it on our machine in order to modify the code.

If we wish to install from GitHub directly, we don't need to do anything here since this was installed when we set up our environment back in Chapter 1, Introduction to Data Analysis; however, for reference, we would do the following to install packages from GitHub:

(book_env) $ pip3 install \
git...
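
Before moving on, it can help to confirm that the package is actually available in the active environment. The following is a minimal sketch using only the standard library; the helper name is_installed is hypothetical and is not part of the stock_analysis package.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found on the import path."""
    # find_spec() returns None when no matching package or module exists.
    return importlib.util.find_spec(package_name) is not None


# Prints True if the stock_analysis package was installed during setup.
print(is_installed("stock_analysis"))
```

If this prints False, installing from GitHub or from a local clone as described above should make the package importable.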

Building a Python package

Building packages is considered good coding practice since it encourages modular, reusable code. Modular code is written in many smaller pieces that can be used widely without needing to know the underlying implementation details of everything involved in a task. For example, when we use matplotlib to plot something, we don't need to know exactly what the code inside the functions we call is doing; it suffices to know what the input and output will be to build on top of it.

Package structure

A module is a single file of Python code that can be imported; window_calc.py from Chapter 4, Aggregating Pandas DataFrames, and viz.py from Chapter 6, Plotting with Seaborn and Customization Techniques, were both modules. A package is a collection of modules organized into directories. Packages can also be imported, but when we import a package we have access to certain modules inside, so we don't have to import each one...
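
To make the distinction between a module and a package concrete, here is a small, self-contained sketch that builds a throwaway package in a temporary directory and imports it. The names demo_pkg, calc, and pct_change are made up for illustration and have nothing to do with the stock_analysis package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package layout, created in a temporary directory:
#   demo_pkg/
#       __init__.py
#       calc.py
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()

# A module: a single importable file of Python code.
(pkg_dir / "calc.py").write_text(
    "def pct_change(old, new):\n"
    "    return (new - old) / old\n"
)

# __init__.py marks the directory as a package and decides what is
# exposed when the package itself is imported.
(pkg_dir / "__init__.py").write_text("from .calc import pct_change\n")

# Make the temporary directory importable, then import the package.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo_pkg = importlib.import_module("demo_pkg")

# pct_change is reachable from the package without importing calc directly.
print(demo_pkg.pct_change(100, 110))
```

Because __init__.py re-exports pct_change, users of the package do not need to know which module it lives in.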

Collecting financial data

Back in Chapter 2, Working with Pandas DataFrames, and Chapter 3, Data Wrangling with Pandas, we worked with APIs to gather data; however, there are other ways to collect data from the Internet. We can use web scraping to extract data from the HTML page itself, which pandas offers with the pd.read_html() function—it returns a dataframe for each of the HTML tables it finds on the page. For economic and financial data, an alternative is the pandas_datareader package, which the StockReader class in the stock_analysis package uses to collect financial data.

Important note

In case anything has changed with the data sources that are used in this chapter or you encounter errors when using the StockReader class to collect data, the CSV files in the data/ folder can be read in as a replacement in order to follow along with the text; for example:

pd.read_csv('data/bitcoin.csv', index_col='date',  parse_dates=True...
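
As a rough illustration of what the index_col and parse_dates arguments do, the sketch below reads a tiny in-memory CSV instead of a file. The column names and values are invented for the example; the actual files in the data/ folder may be laid out differently.

```python
import io

import pandas as pd

# Made-up data standing in for a backup CSV file.
csv_text = """date,open,high,low,close,volume
2019-01-01,100.0,105.0,99.0,104.0,1000
2019-01-02,104.0,106.0,101.0,102.0,1200
"""

# index_col='date' makes the date column the index;
# parse_dates=True asks pandas to parse that index as datetimes.
df = pd.read_csv(io.StringIO(csv_text), index_col='date', parse_dates=True)

print(type(df.index).__name__)
print(df['close'])
```

With a datetime index in place, time-based operations such as resampling and date slicing become available on the dataframe.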

Exploratory data analysis

Now that we have our data, we want to get familiar with it. As we saw in Chapter 5, Visualizing Data with Pandas and Matplotlib, and Chapter 6, Plotting with Seaborn and Customization Techniques, creating good visualizations requires knowledge of matplotlib and, depending on the data format and the end goal for the visualization, seaborn. Just as we did with the StockReader class, we want to make it easier to visualize both individual assets and groups of assets, so rather than expecting users of our package (and, perhaps, our collaborators) to be proficient with matplotlib and seaborn, we will create wrappers around this functionality. This means that users only need to know the stock_analysis package to visualize their financial data. In addition, we are able to set a standard for how the visualizations look and avoid copying and pasting large amounts of code for each new analysis we want to conduct, which brings consistency...

Technical analysis of financial instruments

With technical analysis of assets, metrics (such as cumulative returns and volatility) are calculated to compare various assets to each other. As with the previous two sections in this chapter, we will be writing a module with classes to help us. We will need the StockAnalyzer class for technical analysis of a single asset and the AssetGroupAnalyzer class for technical analysis of a group of assets. These classes are in the stock_analysis/stock_analyzer.py file.
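
To give a feel for the kind of metrics involved, here is a minimal sketch of cumulative returns and volatility computed directly with pandas. It uses common textbook definitions and toy prices; it is not the StockAnalyzer implementation, and the 252-trading-day annualization factor is an assumption.

```python
import math

import pandas as pd

# Toy closing prices; these values are made up for illustration.
close = pd.Series(
    [100.0, 110.0, 99.0, 108.9],
    index=pd.date_range('2020-01-01', periods=4, freq='D'),
    name='close',
)

# Daily percentage change; the first entry is NaN (no prior day).
daily_returns = close.pct_change()

# Cumulative return: growth of one unit invested on the first day.
cumulative_returns = (1 + daily_returns).cumprod()

# Volatility as the sample standard deviation of daily returns,
# annualized with a 252-trading-day convention (an assumption).
daily_std = daily_returns.std()
annualized_volatility = daily_std * math.sqrt(252)

print(cumulative_returns.iloc[-1], annualized_volatility)
```

The last cumulative return equals the final price divided by the first, which is a quick sanity check on the calculation.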

As with the other modules, we will start with our docstring and imports:

"""Classes for technical analysis of assets."""
import math
from .utils import validate_df

The StockAnalyzer class

For analyzing individual assets, we will build the StockAnalyzer class, which calculates metrics for a given asset. The following UML diagram shows all the metrics that it provides:

Figure 7.19 – Structure of the StockAnalyzer...

Modeling performance using historical data

The goal of this section is to give us a taste of how to build some models; as such, the following examples are not meant to be the best possible models, but rather simple and relatively quick implementations for learning purposes. Once again, the stock_analysis package has a class for this section's task: StockModeler.

Important note

To fully understand the statistical elements of this section and modeling in general, we need a solid understanding of statistics; however, the purpose of this discussion is to show how modeling techniques can be applied to financial data without dwelling on the underlying mathematics.
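
As a small taste of the idea, the sketch below fits a linear regression of a series on its own previous value. It uses numpy.polyfit as a stand-in rather than the StockModeler class or statsmodels, and the data is a toy series constructed so that the fit is exact.

```python
import numpy as np
import pandas as pd

# Toy series constructed so that each value is 2 * previous + 1.
close = pd.Series([1.0, 3.0, 7.0, 15.0, 31.0], name='close')

# Pair each value with the previous one, dropping the first row (no lag).
data = pd.DataFrame({'lag_1': close.shift(1), 'close': close}).dropna()

# Ordinary least squares fit of close = slope * lag_1 + intercept.
slope, intercept = np.polyfit(data['lag_1'], data['close'], deg=1)

# One-step-ahead prediction from the last observed value.
next_value = slope * close.iloc[-1] + intercept

print(round(slope, 6), round(intercept, 6), round(next_value, 6))
```

Real price data will not follow such a clean relationship, so a fit like this should be judged on held-out data rather than on how well it matches the series it was trained on.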

The StockModeler class

The StockModeler class will make it easier for us to build and evaluate some simple financial models without needing to interact directly with the statsmodels package. In addition, we will reduce the number of steps that are needed to generate a model with the methods we create. The following...

Summary

In this chapter, we saw how building Python packages for our analysis applications can make it very easy for others to carry out their own analyses and reproduce ours, as well as for us to create repeatable workflows for future analyses.

The stock_analysis package we created in this chapter contained classes for gathering stock data from the Internet (StockReader); visualizing individual assets or groups of them (Visualizer family); calculating metrics for single assets or groups of them for comparisons (StockAnalyzer and AssetGroupAnalyzer, respectively); and time series modeling with decomposition, ARIMA, and linear regression (StockModeler). We also got our first look at using the statsmodels package in the StockModeler class. This chapter showed us how the pandas, matplotlib, seaborn, and numpy functionality that we've covered so far in this book has come together and how these libraries can work harmoniously with other packages for custom applications. I strongly...

Exercises

Use the stock_analysis package to complete the following exercises. Unless otherwise noted, use data from 2019 through the end of 2020. In case there are any issues collecting the data with the StockReader class, backup CSV files are provided in the exercises/ directory:

Using the StockAnalyzer and StockVisualizer classes, calculate and plot three levels of support and resistance for Netflix's closing price.
With the StockVisualizer class, look at the effect of after-hours trading on the FAANG stocks:
a) As individual stocks
b) As a portfolio using the make_portfolio() function from the stock_analysis.utils module
Using the StockVisualizer.open_to_close() method, create a plot that fills the area between the FAANG stocks' opening price (as a portfolio) and its closing price each day in red if the price declined and in green if the price increased. As a bonus, do the same for a portfolio of bitcoin and the S&P 500.
Mutual funds and exchange...

Further reading

A guide to Python's function decorators: https://www.thecodeship.com/patterns/guide-to-python-function-decorators/
Alpha: https://www.investopedia.com/terms/a/alpha.asp
An Introduction to Classes and Inheritance (in Python): http://www.jesshamrick.com/2011/05/18/an-introduction-to-classes-and-inheritance-in-python/
Beta: https://www.investopedia.com/terms/b/beta.asp
Coefficient of Variation (CV): https://www.investopedia.com/terms/c/coefficientofvariation.asp
Classes (Python Documentation): https://docs.python.org/3/tutorial/classes.html
How After-Hours Trading Affects Stock Prices: https://www.investopedia.com/ask/answers/05/saleafterhours.asp
How to Create a Python Package: https://www.pythoncentral.io/how-to-create-a-python-package/
How to Create an ARIMA Model for Time Series Forecasting in Python: https://machinelearningmastery...


Author (1)

Stefanie Molin

Stefanie Molin is a data scientist and software engineer at Bloomberg LP in NYC, tackling tough problems in information security, particularly revolving around anomaly detection, building tools for gathering data, and knowledge sharing. She has extensive experience in data science, designing anomaly detection solutions, and utilizing machine learning in both R and Python in the AdTech and FinTech industries. She holds a B.S. in operations research from Columbia University's Fu Foundation School of Engineering and Applied Science, with minors in economics, and entrepreneurship and innovation. In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages spoken among both people and computers.