Learning Data Mining with Python

Harness the power of Python to analyze data and create insightful predictive models

Robert Layton

Book Details

ISBN 13: 9781784396053
Paperback: 344 pages

Book Description

The next step in the information age is to gain insights from the deluge of data coming our way. Data mining provides a way of finding this insight, and Python is one of the most popular languages for data mining, providing both power and flexibility in analysis.

This book teaches you to design and develop data mining applications using a variety of datasets, starting with basic classification and affinity analysis. Next, we move on to more complex data types including text, images, and graphs. In every chapter, we create models that solve real-world problems.

Python offers a rich and varied set of libraries for data mining. This book covers many of them, including the IPython Notebook, pandas, scikit-learn, and NLTK.

Each chapter of this book introduces you to new algorithms and techniques. By the end of the book, you will have gained substantial insight into using Python for data mining, along with a good understanding of the algorithms and their implementations.

Table of Contents

Chapter 1: Getting Started with Data Mining
  • Introducing data mining
  • Using Python and the IPython Notebook
  • A simple affinity analysis example
  • A simple classification example
  • What is classification?
  • Summary
Chapter 2: Classifying with scikit-learn Estimators
  • scikit-learn estimators
  • Preprocessing using pipelines
  • Pipelines
  • Summary
Chapter 3: Predicting Sports Winners with Decision Trees
  • Loading the dataset
  • Decision trees
  • Sports outcome prediction
  • Random forests
  • Summary
Chapter 4: Recommending Movies Using Affinity Analysis
  • Affinity analysis
  • The movie recommendation problem
  • The Apriori implementation
  • Extracting association rules
  • Summary
Chapter 5: Extracting Features with Transformers
  • Feature extraction
  • Feature selection
  • Feature creation
  • Creating your own transformer
  • Summary
Chapter 6: Social Media Insight Using Naive Bayes
  • Disambiguation
  • Text transformers
  • Naive Bayes
  • Application
  • Summary
Chapter 7: Discovering Accounts to Follow Using Graph Mining
  • Loading the dataset
  • Finding subgraphs
  • Summary
Chapter 8: Beating CAPTCHAs with Neural Networks
  • Artificial neural networks
  • Creating the dataset
  • Training and classifying
  • Improving accuracy using a dictionary
  • Summary
Chapter 9: Authorship Attribution
  • Attributing documents to authors
  • Function words
  • Support vector machines
  • Character n-grams
  • Using the Enron dataset
  • Summary
Chapter 10: Clustering News Articles
  • Obtaining news articles
  • Extracting text from arbitrary websites
  • Grouping news articles
  • Clustering ensembles
  • Online learning
  • Summary
Chapter 11: Classifying Objects in Images Using Deep Learning
  • Object classification
  • Application scenario and goals
  • Deep neural networks
  • GPU optimization
  • Setting up the environment
  • Application
  • Summary
Chapter 12: Working with Big Data
  • Big data
  • Application scenario and goals
  • MapReduce
  • Application
  • Summary

What You Will Learn

  • Apply data mining concepts to real-world problems
  • Predict the outcome of sports matches based on past results
  • Determine the author of a document based on their writing style
  • Use APIs to download datasets from social media and other online services
  • Find and extract good features from difficult datasets
  • Create models that solve real-world problems
  • Design and develop data mining applications using a variety of datasets
  • Set up reproducible experiments and generate robust results
  • Recommend movies, online celebrities, and news articles based on personal preferences
  • Compute on big data, including real-time data from the Internet
