Python: Real World Machine Learning

Learn to solve challenging data science problems by building powerful machine learning models using Python

Prateek Joshi et al.

Book Details

ISBN 13: 9781787123212
Paperback: 941 pages

Book Description

Machine learning is spreading rapidly through the modern data-driven world. It is used extensively in fields such as search engines, robotics, and self-driving cars, and it is transforming the way we understand and interact with the world around us.

In the first module, Python Machine Learning Cookbook, you will learn to apply a wide variety of machine learning algorithms to real-world problems and to implement those algorithms in Python.

The second module, Advanced Machine Learning with Python, takes you on a guided tour of the most relevant and powerful machine learning techniques. You’ll acquire a broad set of skills in feature selection and feature engineering.

The third module in this learning path, Large Scale Machine Learning with Python, dives into scalable machine learning and the three forms of scalability. It covers the most effective machine learning techniques on the MapReduce framework in Hadoop and on Spark, using Python.

This Learning Path will teach you Python machine learning for the real world. The machine learning techniques covered in this Learning Path are at the forefront of commercial practice.

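This Learning Path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the three Packt products named above: Python Machine Learning Cookbook, Advanced Machine Learning with Python, and Large Scale Machine Learning with Python.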

Table of Contents

Chapter 1: The Realm of Supervised Learning
Introduction
Preprocessing data using different techniques
Label encoding
Building a linear regressor
Computing regression accuracy
Achieving model persistence
Building a ridge regressor
Building a polynomial regressor
Estimating housing prices
Computing the relative importance of features
Estimating bicycle demand distribution
Chapter 2: Constructing a Classifier
Introduction
Building a simple classifier
Building a logistic regression classifier
Building a Naive Bayes classifier
Splitting the dataset for training and testing
Evaluating the accuracy using cross-validation
Visualizing the confusion matrix
Extracting the performance report
Evaluating cars based on their characteristics
Extracting validation curves
Extracting learning curves
Estimating the income bracket
Chapter 3: Predictive Modeling
Introduction
Building a linear classifier using Support Vector Machines (SVMs)
Building a nonlinear classifier using SVMs
Tackling class imbalance
Extracting confidence measurements
Finding optimal hyperparameters
Building an event predictor
Estimating traffic
Chapter 4: Clustering with Unsupervised Learning
Introduction
Clustering data using the k-means algorithm
Compressing an image using vector quantization
Building a Mean Shift clustering model
Grouping data using agglomerative clustering
Evaluating the performance of clustering algorithms
Automatically estimating the number of clusters using the DBSCAN algorithm
Finding patterns in stock market data
Building a customer segmentation model
Chapter 5: Building Recommendation Engines
Introduction
Building function compositions for data processing
Building machine learning pipelines
Finding the nearest neighbors
Constructing a k-nearest neighbors classifier
Constructing a k-nearest neighbors regressor
Computing the Euclidean distance score
Computing the Pearson correlation score
Finding similar users in the dataset
Generating movie recommendations
Chapter 6: Analyzing Text Data
Introduction
Preprocessing data using tokenization
Stemming text data
Converting text to its base form using lemmatization
Dividing text using chunking
Building a bag-of-words model
Building a text classifier
Identifying the gender
Analyzing the sentiment of a sentence
Identifying patterns in text using topic modeling
Chapter 7: Speech Recognition
Introduction
Reading and plotting audio data
Transforming audio signals into the frequency domain
Generating audio signals with custom parameters
Synthesizing music
Extracting frequency domain features
Building Hidden Markov Models
Building a speech recognizer
Chapter 8: Dissecting Time Series and Sequential Data
Introduction
Transforming data into the time series format
Slicing time series data
Operating on time series data
Extracting statistics from time series data
Building Hidden Markov Models for sequential data
Building Conditional Random Fields for sequential text data
Analyzing stock market data using Hidden Markov Models
Chapter 9: Image Content Analysis
Introduction
Operating on images using OpenCV-Python
Detecting edges
Histogram equalization
Detecting corners
Detecting SIFT feature points
Building a Star feature detector
Creating features using visual codebook and vector quantization
Training an image classifier using Extremely Random Forests
Building an object recognizer
Chapter 10: Biometric Face Recognition
Introduction
Capturing and processing video from a webcam
Building a face detector using Haar cascades
Building eye and nose detectors
Performing Principal Components Analysis
Performing Kernel Principal Components Analysis
Performing blind source separation
Building a face recognizer using Local Binary Patterns Histogram
Chapter 11: Deep Neural Networks
Introduction
Building a perceptron
Building a single layer neural network
Building a deep neural network
Creating a vector quantizer
Building a recurrent neural network for sequential data analysis
Visualizing the characters in an optical character recognition database
Building an optical character recognizer using neural networks
Chapter 12: Visualizing Data
Introduction
Plotting 3D scatter plots
Plotting bubble plots
Animating bubble plots
Drawing pie charts
Plotting date-formatted time series data
Plotting histograms
Visualizing heat maps
Animating dynamic signals
Chapter 13: Unsupervised Machine Learning
Principal component analysis
Introducing k-means clustering
Self-organizing maps
Further reading
Summary
Chapter 14: Deep Belief Networks
Neural networks – a primer
Restricted Boltzmann Machine
Deep belief networks
Further reading
Summary
Chapter 15: Stacked Denoising Autoencoders
Autoencoders
Stacked Denoising Autoencoders
Further reading
Summary
Chapter 16: Convolutional Neural Networks
Introducing the CNN
Further reading
Summary
Chapter 17: Semi-Supervised Learning
Introduction
Understanding semi-supervised learning
Semi-supervised algorithms in action
Further reading
Summary
Chapter 18: Text Feature Engineering
Introduction
Text feature engineering
Further reading
Summary
Chapter 19: Feature Engineering Part II
Introduction
Creating a feature set
Feature engineering in practice
Further reading
Summary
Chapter 20: Ensemble Methods
Introducing ensembles
Using models in dynamic applications
Further reading
Summary
Chapter 21: Additional Python Machine Learning Tools
Alternative development tools
Further reading
Summary
Chapter 22: First Steps to Scalability
Explaining scalability in detail
Python for large scale machine learning
Python packages
Summary
Chapter 23: Scalable Learning in Scikit-learn
Out-of-core learning
Streaming data from sources
Stochastic learning
Feature management with data streams
Summary
Chapter 24: Fast SVM Implementations
Datasets to experiment with on your own
Support Vector Machines
Feature selection by regularization
Including non-linearity in SGD
Hyperparameter tuning
Summary
Chapter 25: Neural Networks and Deep Learning
The neural network architecture
Neural networks and regularization
Neural networks and hyperparameter optimization
Neural networks and decision boundaries
Deep learning at scale with H2O
Deep learning and unsupervised pretraining
Deep learning with theanets
Autoencoders and unsupervised learning
Summary
Chapter 26: Deep Learning with TensorFlow
TensorFlow installation
Machine learning on TensorFlow with SkFlow
Keras and TensorFlow installation
Convolutional Neural Networks in TensorFlow through Keras
CNNs with an incremental approach
GPU Computing
Summary
Chapter 27: Classification and Regression Trees at Scale
Bootstrap aggregation
Random forest and extremely randomized forest
Fast parameter optimization with randomized search
CART and boosting
XGBoost
Out-of-core CART with H2O
Summary
Chapter 28: Unsupervised Learning at Scale
Unsupervised methods
Feature decomposition – PCA
PCA with H2O
Clustering – K-means
K-means with H2O
LDA
Summary
Chapter 29: Distributed Environments – Hadoop and Spark
From a standalone machine to a bunch of nodes
Setting up the VM
The Hadoop ecosystem
Spark
Summary
Chapter 30: Practical Machine Learning with Spark
Setting up the VM for this chapter
Sharing variables across cluster nodes
Data preprocessing in Spark
Machine learning with Spark
Summary

What You Will Learn

  • Use predictive modeling and apply it to real-world problems
  • Understand how to perform market segmentation using unsupervised learning
  • Apply your new-found skills to solve real problems, through clearly explained code for every technique and test
  • Compete with top data scientists by gaining a practical and theoretical understanding of cutting-edge deep learning algorithms
  • Increase predictive accuracy with deep learning and scalable data-handling techniques
  • Work with modern state-of-the-art large-scale machine learning techniques
  • Learn to use Python code to implement a range of machine learning algorithms and techniques
