Mastering pandas

Master the features and capabilities of pandas, a data analysis toolkit for Python

Femi Anthony

Book Details

ISBN 13: 9781783981960
Paperback: 364 pages

Book Description

Python is a groundbreaking language, valued for its simplicity and succinctness: it lets the user achieve a great deal with a few lines of code, especially compared to other programming languages. The pandas library brings these features of Python into the data analysis realm by providing expressiveness, simplicity, and powerful capabilities for the task of data analysis. By mastering pandas, users will be able to perform complex data analysis in a short period of time and illustrate their findings using the rich visualization capabilities of related tools such as IPython and matplotlib.

This book is an in-depth guide to the use of pandas for data analysis, for both the seasoned data analysis practitioner and the novice user. It provides a basic introduction to the pandas framework and takes you through the installation of the library and the IPython interactive environment. Thereafter, you will learn basic as well as advanced features, such as MultiIndexing, modifying data structures, and sampling data, which provide powerful capabilities for data analysis.

Table of Contents

Chapter 1: Introduction to pandas and Data Analysis
Motivation for data analysis
How Python and pandas fit into the data analytics mix
What is pandas?
Benefits of using pandas
Summary
Chapter 2: Installation of pandas and the Supporting Software
Selecting a version of Python to use
Python installation
Installation of Python and pandas from a third-party vendor
Continuum Analytics Anaconda
Other numeric or analytics-focused Python distributions
Downloading and installing pandas
IPython installation
Summary
Chapter 3: The pandas Data Structures
NumPy ndarrays
Data structures in pandas
Summary
Chapter 4: Operations in pandas, Part I – Indexing and Selecting
Basic indexing
Label, integer, and mixed indexing
Boolean indexing
Summary
Chapter 5: Operations in pandas, Part II – Grouping, Merging, and Reshaping of Data
Grouping of data
Merging and joining
Pivots and reshaping data
Summary
Chapter 6: Missing Data, Time Series, and Plotting Using Matplotlib
Handling missing data
Handling time series
A summary of Time Series-related objects
Summary
Chapter 7: A Tour of Statistics – The Classical Approach
Descriptive statistics versus inferential statistics
Measures of central tendency and variability
Hypothesis testing – the null and alternative hypotheses
Summary
Chapter 8: A Brief Tour of Bayesian Statistics
Introduction to Bayesian statistics
Mathematical framework for Bayesian statistics
Probability distributions
Bayesian statistics versus Frequentist statistics
Conducting Bayesian statistical analysis
Monte Carlo estimation of the likelihood function and PyMC
References
Summary
Chapter 9: The pandas Library Architecture
Introduction to pandas' file hierarchy
Description of pandas' modules and files
Improving performance using Python extensions
Summary
Chapter 10: R and pandas Compared
R data types
Slicing and selection
Arithmetic operations on columns
Aggregation and GroupBy
Comparing matching operators in R and pandas
Logical subsetting
Split-apply-combine
Reshaping using melt
Factors/categorical data
Summary
Chapter 11: Brief Tour of Machine Learning
Role of pandas in machine learning
Installation of scikit-learn
Introduction to machine learning
Application of machine learning – Kaggle Titanic competition
Data analysis and preprocessing using pandas
A naïve approach to Titanic problem
The scikit-learn ML/classifier interface
Supervised learning algorithms
Unsupervised learning algorithms
Summary

What You Will Learn

  • Download, install, and set up Python, pandas, and related tools to perform data analysis for different operating environments
  • Practice using IPython as an interactive environment for doing data analysis using pandas
  • Master the core features of pandas used in data analysis
  • Get to grips with the more advanced features of pandas
  • Understand the basics of using matplotlib to plot data analysis results
  • Analyze real-world datasets using pandas
  • Acquire knowledge of using pandas for basic statistical analysis
