Learning pandas - Second Edition

Get to grips with pandas—a versatile and high-performance Python library for data manipulation, analysis, and discovery

Michael Heydt


Book Details

ISBN 13: 9781787123137
Paperback: 446 pages

Book Description

You will learn how to use pandas to perform data analysis in Python. You will start with an overview of data analysis and progress step by step through modeling data, accessing data from remote sources, performing numeric and statistical analysis, indexing, and aggregate analysis, and finish by visualizing statistical data and applying pandas to finance.

By the end of this book, you will know pandas well enough to apply it to data manipulation, analysis, and data science.

Table of Contents

Chapter 1: pandas and Data Analysis
Introducing pandas
Data manipulation, analysis, science, and pandas
The process of data analysis
Relating the book to the process
Concepts of data and analysis in our tour of pandas
Other Python libraries of value with pandas
Summary
Chapter 2: Up and Running with pandas
Installation of Anaconda
IPython and Jupyter Notebook
Introducing the pandas Series and DataFrame
Visualization
Summary
Chapter 3: Representing Univariate Data with the Series
Configuring pandas
Creating a Series
The .index and .values properties
The size and shape of a Series
Specifying an index at creation
Heads, tails, and takes
Retrieving values in a Series by label or position
Slicing a Series into subsets
Alignment via index labels
Performing Boolean selection
Re-indexing a Series
Modifying a Series in-place
Summary
Chapter 4: Representing Tabular and Multivariate Data with the DataFrame
Configuring pandas
Creating DataFrame objects
Accessing data within a DataFrame
Selecting rows using Boolean selection
Selecting across both rows and columns
Summary
Chapter 5: Manipulating DataFrame Structure
Configuring pandas
Renaming columns
Adding new columns with [] and .insert()
Adding columns through enlargement
Adding columns using concatenation
Reordering columns
Replacing the contents of a column
Deleting columns
Appending new rows
Concatenating rows
Adding and replacing rows via enlargement
Removing rows using .drop()
Removing rows using Boolean selection
Removing rows using a slice
Summary
Chapter 6: Indexing Data
Configuring pandas
The importance of indexes
The pandas index types
Working with Indexes
Hierarchical indexing
Summary
Chapter 7: Categorical Data
Configuring pandas
Creating Categoricals
Renaming categories
Appending new categories
Removing categories
Removing unused categories
Setting categories
Descriptive information of a Categorical
Munging school grades
Summary
Chapter 8: Numerical and Statistical Methods
Configuring pandas
Performing numerical methods on pandas objects
Performing statistical processes on pandas objects
Summary
Chapter 9: Accessing Data
Configuring pandas
Working with CSV and text/tabular format data
Reading and writing data in Excel format
Reading and writing JSON files
Reading HTML data from the web
Reading and writing HDF5 format files
Accessing CSV data on the web
Reading and writing from/to SQL databases
Reading data from remote data services
Summary
Chapter 10: Tidying Up Your Data
Configuring pandas
What is tidying your data?
How to work with missing data
Handling duplicate data
Transforming data
Summary
Chapter 11: Combining, Relating, and Reshaping Data
Configuring pandas
Concatenating data in multiple objects
Merging and joining data
Pivoting data to and from value and indexes
Stacking and unstacking
Performance benefits of stacked data
Summary
Chapter 12: Data Aggregation
Configuring pandas
The split, apply, and combine (SAC) pattern
Data for the examples
Splitting data
Applying aggregate functions, transforms, and filters
Transforming groups of data
Filtering groups from aggregation
Summary
Chapter 13: Time-Series Modelling
Setting up the IPython notebook
Representation of dates, time, and intervals
Introducing time-series data
Calculating new dates using offsets
Representing durations of time using Period
Handling holidays using calendars
Normalizing timestamps using time zones
Manipulating time-series data
Time-series moving-window operations
Summary
Chapter 14: Visualization
Configuring pandas
Plotting basics with pandas
Creating time-series charts
Common plots used in statistical analyses
Manually rendering multiple plots in a single chart
Summary
Chapter 15: Historical Stock Price Analysis
Setting up the IPython notebook
Obtaining and organizing stock data from Google
Plotting time-series prices
Plotting volume-series data
Calculating the simple daily percentage change in closing price
Calculating simple daily cumulative returns of a stock
Resampling data from daily to monthly returns
Analyzing distribution of returns
Performing a moving-average calculation
Comparison of average daily returns across stocks
Correlation of stocks based on the daily percentage change of the closing price
Calculating the volatility of stocks
Determining risk relative to expected returns
Summary

What You Will Learn

  • Understand how data analysts and scientists think about the processes of gathering and understanding data
  • Learn how pandas can be used to support the end-to-end process of data analysis
  • Use pandas Series and DataFrame objects to represent univariate and multivariate data
  • Slice and dice data with pandas, and combine, group, and aggregate data from multiple sources
  • Access data from external sources such as files, databases, and web services
  • Represent and manipulate time-series data and handle many of the intricacies involved with this type of data
  • Visualize statistical information
  • Use pandas to solve several common data representation and analysis problems within finance
