Python Data Science Cookbook

Over 60 practical recipes to help you explore Python and its robust data science capabilities

Gopi Subramanian



Book Details

ISBN 13: 9781784396404
Paperback: 438 pages

Book Description

Python is increasingly becoming the language of choice for data science. It is overtaking R in adoption, is widely known among developers, and has a strong set of libraries, such as NumPy, pandas, scikit-learn, Matplotlib, IPython, and SciPy, to support its use in this field. Data science is an emerging field that combines several disciplines, including statistics, machine learning, and computer science. It is a disruptive technology that is changing how businesses operate and reshaping the economics of verticals such as retail, manufacturing, online ventures, and hospitality.

This book walks you through the various steps, from the simplest to the most complex algorithms in the data science arsenal, to effectively mine data and derive intelligence from it. At every step, we provide simple and efficient Python recipes that not only show you how to implement these algorithms, but also explain the underlying concepts thoroughly.

The book begins by introducing you to Python for data science, followed by working with Python environments. You will then learn how to analyze your data with Python. The book then teaches the concepts of data mining, followed by extensive coverage of machine learning methods. It introduces a number of Python libraries that help you implement machine learning and data mining routines effectively. It also covers the principles of shrinkage, ensemble methods, random forest, rotation forest, and extremely randomized trees, which are essential knowledge for any data science professional.

Table of Contents

Chapter 1: Python for Data Science
  • Introduction
  • Using dictionary objects
  • Working with a dictionary of dictionaries
  • Working with tuples
  • Using sets
  • Writing a list
  • Creating a list from another list - list comprehension
  • Using iterators
  • Generating an iterator and a generator
  • Using iterables
  • Passing a function as a variable
  • Embedding functions in another function
  • Passing a function as a parameter
  • Returning a function
  • Altering the function behavior with decorators
  • Creating anonymous functions with lambda
  • Using the map function
  • Working with filters
  • Using zip and izip
  • Processing arrays from the tabular data
  • Preprocessing the columns
  • Sorting lists
  • Sorting with a key
  • Working with itertools

Chapter 2: Python Environments
  • Introduction
  • Using NumPy libraries
  • Plotting with matplotlib
  • Machine learning with scikit-learn

Chapter 3: Data Analysis – Explore and Wrangle
  • Introduction
  • Analyzing univariate data graphically
  • Grouping the data and using dot plots
  • Using scatter plots for multivariate data
  • Using heat maps
  • Performing summary statistics and plots
  • Using a box-and-whisker plot
  • Imputing the data
  • Performing random sampling
  • Scaling the data
  • Standardizing the data
  • Performing tokenization
  • Removing stop words
  • Stemming the words
  • Performing word lemmatization
  • Representing the text as a bag of words
  • Calculating term frequencies and inverse document frequencies

Chapter 4: Data Analysis – Deep Dive
  • Introduction
  • Extracting the principal components
  • Using Kernel PCA
  • Extracting features using singular value decomposition
  • Reducing the data dimension with random projection
  • Decomposing the feature matrices using non-negative matrix factorization

Chapter 5: Data Mining – Needle in a Haystack
  • Introduction
  • Working with distance measures
  • Learning and using kernel methods
  • Clustering data using the k-means method
  • Learning vector quantization
  • Finding outliers in univariate data
  • Discovering outliers using the local outlier factor method

Chapter 6: Machine Learning 1
  • Introduction
  • Preparing data for model building
  • Finding the nearest neighbors
  • Classifying documents using Naïve Bayes
  • Building decision trees to solve multiclass problems

Chapter 7: Machine Learning 2
  • Introduction
  • Predicting real-valued numbers using regression
  • Learning regression with L2 shrinkage – ridge
  • Learning regression with L1 shrinkage – LASSO
  • Using cross-validation iterators with L1 and L2 shrinkage

Chapter 8: Ensemble Methods
  • Introduction
  • Understanding Ensemble – Bagging Method
  • Understanding Ensemble – Boosting Method
  • Understanding Ensemble – Gradient Boosting

Chapter 9: Growing Trees
  • Introduction
  • Going from trees to Forest – Random Forest
  • Growing Extremely Randomized Trees
  • Growing Rotational Forest

Chapter 10: Large-Scale Machine Learning – Online Learning
  • Introduction
  • Using perceptron as an online learning algorithm
  • Using stochastic gradient descent for regression
  • Using stochastic gradient descent for classification

What You Will Learn

  • Explore the complete range of Data Science algorithms
  • Get to know the tricks used by industry engineers to create the most accurate data science models
  • Manage and use Python libraries such as NumPy, SciPy, scikit-learn, and matplotlib effectively
  • Create meaningful features to solve real-world problems
  • Take a look at Advanced Regression methods for model building and variable selection
  • Get a thorough understanding of the underlying concepts and implementation of Ensemble methods
  • Solve real-world problems using a variety of different datasets from numerical and text data modalities
  • Get accustomed to modern state-of-the-art algorithms such as Gradient Boosting, Random Forest, and Rotation Forest

