Mastering Python for Data Science

Explore the world of data science through Python and learn how to make sense of data

Samir Madhavan



Book Details

ISBN-13: 9781784390150
Paperback: 294 pages

Book Description

Data science is a relatively new knowledge domain that organizations use to make data-driven decisions. Data scientists have to wear various hats to work with data and derive value from it. The Python programming language, having conquered the scientific community over the last decade, is now an indispensable tool for the data science practitioner and a must-know for every aspiring data scientist. Python offers a fast, reliable, cross-platform, and mature environment for data analysis, machine learning, and algorithmic problem solving.

This comprehensive guide helps you move beyond the hype and transcend the theory by providing you with a hands-on, advanced study of data science.

Beginning with the essentials of Python in data science, you will learn to manage data and perform linear algebra in Python. You will move on to deriving inferences from your analysis with inferential statistics and mining data to reveal hidden patterns and trends. You will use the matplotlib library to create high-end visualizations in Python and uncover the fundamentals of machine learning. Next, you will apply linear regression and logistic regression to your applications, create recommendation engines with various collaborative filtering algorithms, and improve your predictions with ensemble methods.

Finally, you will perform k-means clustering, analyze unstructured data with different text mining techniques, and leverage the power of Python in big data analytics.

Table of Contents

Chapter 1: Getting Started with Raw Data
The world of arrays with NumPy
Empowering data analysis with pandas
Data cleansing
Data operations
Summary
Chapter 2: Inferential Statistics
Various forms of distribution
A z-score
A p-value
One-tailed and two-tailed tests
Type 1 and Type 2 errors
A confidence interval
Correlation
Z-test vs T-test
The F distribution
The chi-square distribution
The chi-square test of independence
ANOVA
Summary
Chapter 3: Finding a Needle in a Haystack
What is data mining?
Presenting an analysis
Studying the Titanic
Summary
Chapter 4: Making Sense of Data through Advanced Visualization
Controlling the line properties of a chart
Creating multiple plots
Playing with text
Styling your plots
Box plots
Heatmaps
Scatter plots with histograms
A scatter plot matrix
Area plots
Bubble charts
Hexagon bin plots
Trellis plots
A 3D plot of a surface
Summary
Chapter 5: Uncovering Machine Learning
Different types of machine learning
Decision trees
Linear regression
Logistic regression
The naive Bayes classifier
The k-means clustering
Hierarchical clustering
Summary
Chapter 6: Performing Predictions with a Linear Regression
Simple linear regression
Multiple regression
Training and testing a model
Summary
Chapter 7: Estimating the Likelihood of Events
Logistic regression
Summary
Chapter 8: Generating Recommendations with Collaborative Filtering
Recommendation data
User-based collaborative filtering
Item-based collaborative filtering
Summary
Chapter 9: Pushing Boundaries with Ensemble Models
The census income dataset
Decision trees
Random forests
Summary
Chapter 10: Applying Segmentation with k-means Clustering
The k-means algorithm and its working
The k-means clustering with countries
Clustering the countries
Summary
Chapter 11: Analyzing Unstructured Data with Text Mining
Preprocessing data
Creating a wordcloud
Word and sentence tokenization
Parts of speech tagging
Stemming and lemmatization
The Stanford Named Entity Recognizer
Performing sentiment analysis on world leaders using Twitter
Summary
Chapter 12: Leveraging Python in the World of Big Data
What is Hadoop?
Python MapReduce
File handling with Hadoopy
Pig
Python with Apache Spark
Summary

What You Will Learn

  • Manage data and perform linear algebra in Python
  • Derive inferences from the analysis by performing inferential statistics
  • Solve data science problems in Python
  • Create high-end visualizations using Python
  • Evaluate and apply the linear regression technique to estimate the relationships among variables
  • Build recommendation engines with the various collaborative filtering algorithms
  • Apply the ensemble methods to improve your predictions
  • Work with big data technologies to handle data at scale

