You're reading from Learning Data Mining with Python - Second Edition

Product type Book

Published in Apr 2017

Publisher Packt

ISBN-13 9781787126787

Pages 358 pages

Edition 2nd Edition

A simple affinity analysis example

In this section, we jump into our first example. A common use case for data mining is to improve sales by asking a customer who is buying a product whether they would also like another, similar product. You can perform this analysis through affinity analysis, which is the study of when things exist together, that is, correlate with each other.

To repeat the now-infamous phrase taught in statistics classes, correlation is not causation. This phrase means that the results of affinity analysis cannot establish a cause. In our next example, we perform affinity analysis on product purchases. The results indicate that the products are purchased together, but not that buying one product causes the purchase of the other. The distinction is important, especially when deciding how to use the results to change a business process.

What is affinity analysis?

Affinity analysis is a type of data mining that measures the similarity between samples (objects). This could be the similarity between the following:

Users on a website, to provide varied services or targeted advertising
Items to sell to those users, to provide recommended movies or products
Human genes, to find people that share the same ancestors

We can measure affinity in several ways. For instance, we can record how frequently two products are purchased together. We can also record the accuracy of the statement "when a person buys object 1, they also buy object 2", that is, the fraction of purchases containing object 1 that also contain object 2. Other ways to measure affinity include computing the similarity between samples, which we will cover in later chapters.
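The first two measures can be illustrated with a minimal sketch. The transactions below are made-up data, and the helper names pair_counts and rule_accuracy are hypothetical, chosen only for this illustration; this is not necessarily the approach the book develops later.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: each set is the basket of one customer purchase.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "cheese"},
    {"bread", "apples"},
    {"milk", "cheese"},
    {"bread", "milk", "apples"},
]


def pair_counts(baskets):
    """Count how often each unordered pair of products shares a basket."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def rule_accuracy(baskets, premise, conclusion):
    """Fraction of baskets containing `premise` that also contain `conclusion`.

    Returns None when no basket contains `premise`, because the
    statement cannot be evaluated in that case.
    """
    with_premise = [b for b in baskets if premise in b]
    if not with_premise:
        return None
    return sum(conclusion in b for b in with_premise) / len(with_premise)


counts = pair_counts(transactions)
print(counts[("bread", "milk")])                     # 3
print(rule_accuracy(transactions, "bread", "milk"))  # 0.75
```

In this toy data, bread and milk appear together in three baskets, and three of the four baskets containing bread also contain milk, so the statement "when a person buys bread, they also buy milk" holds 75 percent of the time. Note that the frequency count is symmetric, while the accuracy of a statement depends on which product is the premise.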