Python: Real-World Data Science

Unleash the power of Python and its robust data science capabilities

Dusty Phillips et al.

Book Details

ISBN 13: 9781786465160
Paperback: 1255 pages

Book Description

The Python: Real-World Data Science course will take you on a journey to become an efficient data science practitioner by thoroughly understanding the key concepts of Python. This learning path is divided into four modules, and each module is a mini course in its own right. As you complete each one, you’ll have gained key skills and be ready for the material in the next module.

The course begins by nailing down your Python fundamentals. Once you are familiar with core Python concepts, it’s time to dive into the field of data science. In the second module, you'll learn how to perform data analysis using Python in a practical, example-driven way. The third module will teach you how to design and develop data mining applications using a variety of datasets, progressing from basic classification and affinity analysis to more complex data types including text, images, and graphs. Machine learning and predictive analytics have become the most important approaches to uncovering data gold mines. In the final module, we'll discuss the necessary details of machine learning concepts, offering intuitive yet informative explanations of how machine learning algorithms work, how to use them, and, most importantly, how to avoid the common pitfalls.

Table of Contents

Chapter 1: Introduction and First Steps – Take a Deep Breath
A proper introduction
Enter the Python
About Python
What are the drawbacks?
Who is using Python today?
Setting up the environment
What you need for this course
How you can run a Python program
How is Python code organized
Python's execution model
Guidelines on how to write good code
The Python culture
A note on the IDEs
Chapter 2: Object-oriented Design
Introducing object-oriented
Objects and classes
Specifying attributes and behaviors
Hiding details and creating the public interface
Composition
Inheritance
Case study
Chapter 3: Objects in Python
Creating Python classes
Modules and packages
Organizing module contents
Who can access my data?
Third-party libraries
Case study
Chapter 4: When Objects Are Alike
Basic inheritance
Multiple inheritance
Polymorphism
Abstract base classes
Case study
Chapter 5: Expecting the Unexpected
Raising exceptions
Case study
Chapter 6: When to Use Object-oriented Programming
Treat objects as objects
Adding behavior to class data with properties
Manager objects
Case study
Chapter 7: Python Data Structures
Empty objects
Tuples and named tuples
Dictionaries
Lists
Sets
Extending built-ins
Queues
Case study
Chapter 8: Python Object-oriented Shortcuts
Python built-in functions
An alternative to method overloading
Functions are objects too
Case study
Chapter 9: Strings and Serialization
Strings
Regular expressions
Serializing objects
Case study
Chapter 10: The Iterator Pattern
Design patterns in brief
Iterators
Comprehensions
Generators
Coroutines
Case study
Chapter 11: Python Design Patterns I
The decorator pattern
The observer pattern
The strategy pattern
The state pattern
The singleton pattern
The template pattern
Chapter 12: Python Design Patterns II
The adapter pattern
The facade pattern
The flyweight pattern
The command pattern
The abstract factory pattern
The composite pattern
Chapter 13: Testing Object-oriented Programs
Why test?
Unit testing
Testing with py.test
Imitating expensive objects
How much testing is enough?
Case study
Chapter 14: Concurrency
Threads
Multiprocessing
Futures
AsyncIO
Case study
Chapter 15: Introducing Data Analysis and Libraries
Data analysis and processing
An overview of the libraries in data analysis
Python libraries in data analysis
Chapter 16: NumPy Arrays and Vectorized Computation
NumPy arrays
Array functions
Data processing using arrays
Linear algebra with NumPy
NumPy random numbers
Chapter 17: Data Analysis with pandas
An overview of the pandas package
The pandas data structure
The essential basic functionality
Indexing and selecting data
Computational tools
Working with missing data
Advanced uses of pandas for data analysis
Chapter 18: Data Visualization
The matplotlib API primer
Exploring plot types
Legends and annotations
Plotting functions with pandas
Additional Python data visualization tools
Chapter 19: Time Series
Time series primer
Working with date and time objects
Resampling time series
Downsampling time series data
Upsampling time series data
Timedeltas
Time series plotting
Chapter 20: Interacting with Databases
Interacting with data in text format
Interacting with data in binary format
Interacting with data in MongoDB
Interacting with data in Redis
Chapter 21: Data Analysis Application Examples
Data munging
Data aggregation
Grouping data
Chapter 22: Getting Started with Data Mining
Introducing data mining
A simple affinity analysis example
A simple classification example
What is classification?
Chapter 23: Classifying with scikit-learn Estimators
scikit-learn estimators
Preprocessing using pipelines
Pipelines
Chapter 24: Predicting Sports Winners with Decision Trees
Loading the dataset
Decision trees
Sports outcome prediction
Random forests
Chapter 25: Recommending Movies Using Affinity Analysis
Affinity analysis
The movie recommendation problem
The Apriori implementation
Extracting association rules
Chapter 26: Extracting Features with Transformers
Feature extraction
Feature selection
Feature creation
Creating your own transformer
Chapter 27: Social Media Insight Using Naive Bayes
Disambiguation
Text transformers
Naive Bayes
Application
Chapter 28: Discovering Accounts to Follow Using Graph Mining
Loading the dataset
Finding subgraphs
Chapter 29: Beating CAPTCHAs with Neural Networks
Artificial neural networks
Creating the dataset
Training and classifying
Improving accuracy using a dictionary
Chapter 30: Authorship Attribution
Attributing documents to authors
Function words
Support vector machines
Character n-grams
Using the Enron dataset
Chapter 31: Clustering News Articles
Obtaining news articles
Extracting text from arbitrary websites
Grouping news articles
Clustering ensembles
Online learning
Chapter 32: Classifying Objects in Images Using Deep Learning
Object classification
Application scenario and goals
Deep neural networks
GPU optimization
Setting up the environment
Application
Chapter 33: Working with Big Data
Big data
Application scenario and goals
MapReduce
Application
Chapter 34: Next Steps…
Chapter 1 – Getting Started with Data Mining
Chapter 2 – Classifying with scikit-learn Estimators
Chapter 3 – Predicting Sports Winners with Decision Trees
Chapter 4 – Recommending Movies Using Affinity Analysis
Chapter 5 – Extracting Features with Transformers
Chapter 6 – Social Media Insight Using Naive Bayes
Chapter 7 – Discovering Accounts to Follow Using Graph Mining
Chapter 8 – Beating CAPTCHAs with Neural Networks
Chapter 9 – Authorship Attribution
Chapter 10 – Clustering News Articles
Chapter 11 – Classifying Objects in Images Using Deep Learning
Chapter 12 – Working with Big Data
More resources
Chapter 35: Giving Computers the Ability to Learn from Data
How to transform data into knowledge
The three different types of machine learning
An introduction to the basic terminology and notations
A roadmap for building machine learning systems
Using Python for machine learning
Chapter 36: Training Machine Learning Algorithms for Classification
Artificial neurons – a brief glimpse into the early history of machine learning
Implementing a perceptron learning algorithm in Python
Adaptive linear neurons and the convergence of learning
Chapter 37: A Tour of Machine Learning Classifiers Using scikit-learn
Choosing a classification algorithm
First steps with scikit-learn
Modeling class probabilities via logistic regression
Maximum margin classification with support vector machines
Solving nonlinear problems using a kernel SVM
Decision tree learning
K-nearest neighbors – a lazy learning algorithm
Chapter 38: Building Good Training Sets – Data Preprocessing
Dealing with missing data
Handling categorical data
Partitioning a dataset in training and test sets
Bringing features onto the same scale
Selecting meaningful features
Assessing feature importance with random forests
Chapter 39: Compressing Data via Dimensionality Reduction
Unsupervised dimensionality reduction via principal component analysis
Supervised data compression via linear discriminant analysis
Using kernel principal component analysis for nonlinear mappings
Chapter 40: Learning Best Practices for Model Evaluation and Hyperparameter Tuning
Streamlining workflows with pipelines
Using k-fold cross-validation to assess model performance
Debugging algorithms with learning and validation curves
Fine-tuning machine learning models via grid search
Looking at different performance evaluation metrics
Chapter 41: Combining Different Models for Ensemble Learning
Learning with ensembles
Implementing a simple majority vote classifier
Evaluating and tuning the ensemble classifier
Bagging – building an ensemble of classifiers from bootstrap samples
Leveraging weak learners via adaptive boosting
Chapter 42: Predicting Continuous Target Variables with Regression Analysis
Introducing a simple linear regression model
Exploring the Housing Dataset
Implementing an ordinary least squares linear regression model
Fitting a robust regression model using RANSAC
Evaluating the performance of linear regression models
Using regularized methods for regression
Turning a linear regression model into a curve – polynomial regression

What You Will Learn

  • Install and set up Python
  • Implement objects in Python by creating classes and defining methods
  • Get acquainted with NumPy to use it with arrays and array-oriented computing in data analysis
  • Create effective visualizations for presenting your data using Matplotlib
  • Process and analyze data using the time series capabilities of pandas
  • Interact with different kinds of data stores, such as text files, binary formats, MongoDB, and Redis
  • Apply data mining concepts to real-world problems
  • Compute on big data, including real-time data from the Internet
  • Explore how to use different machine learning models to ask different questions of your data
