You're reading from Artificial Intelligence with Python - Second Edition

Product type: Book

Published in: Jan 2020

Reading level: Beginner

Publisher: Packt

ISBN-13: 9781839219535

Edition: 2nd Edition

Languages: Python

Tools: TensorFlow

Concepts: Artificial Intelligence

Author: Prateek Joshi

Detecting Patterns with Unsupervised Learning

In this chapter, we are going to learn about unsupervised learning and how to use it in real-world situations. By the end of this chapter, you will have a better understanding of the following topics:

Unsupervised learning definition
Clustering data with the K-Means algorithm
Estimating the number of clusters with the Mean Shift algorithm
Estimating the quality of clustering with silhouette scores
Gaussian Mixture Models
Building a classifier based on Gaussian Mixture Models
Finding subgroups in stock markets using the Affinity Propagation model
Segmenting the market based on shopping patterns

What is unsupervised learning?

Unsupervised learning refers to the process of building machine learning models without using labeled training data. Unsupervised learning finds applications in diverse fields of study, including market segmentation, stock markets, natural language processing, and computer vision, to name a few.

In the previous chapters, we were dealing with data that had labels associated with it. When we have labeled training data, algorithms learn to classify data based on those labels. In the real world, labeled data might not always be available.

Sometimes, a large quantity of data exists without labels and needs to be categorized in some way. This is a natural use case for unsupervised learning. Unsupervised learning algorithms attempt to group the data in a given dataset into subgroups using some similarity metric.

When we have a dataset without any labels, we assume that the data is generated because of latent variables that govern the...

Clustering data with the K-Means algorithm

Clustering is one of the most popular unsupervised learning techniques. It is the process of organizing data into subgroups, or clusters, whose elements are similar to each other. To find these clusters, we use a similarity measure such as the Euclidean distance. The same measure can also be used to estimate how tight a cluster is.

The goal of the algorithm is to identify the intrinsic properties of data points that make them belong to the same subgroup. There is no universal similarity metric that works in all cases. For example, we might be interested in finding the representative data point for each subgroup, or we might be interested in finding the outliers in the data. Depending on the situation, different metrics might be more appropriate than others.

The K-Means algorithm is a well-known algorithm for clustering data...
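As a minimal sketch of the idea, assuming scikit-learn's `KMeans` class and a small synthetic dataset (the data, variable names, and parameter values below are illustrative and do not come from the book), the following groups points around three centroids and confirms that each point belongs to the centroid nearest to it in Euclidean distance:

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 2D data: three well-separated blobs (hypothetical, for illustration only)
rng = np.random.default_rng(0)
centers_true = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers_true])

# K-Means needs the number of clusters to be chosen up front
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Recompute the assignment by hand: each point goes to the centroid
# with the smallest Euclidean distance
dists = np.linalg.norm(X[:, None, :] - kmeans.cluster_centers_[None, :, :], axis=2)
nearest = dists.argmin(axis=1)

print("Centroids:\n", kmeans.cluster_centers_)
print("Cluster sizes:", np.bincount(labels))
```

Note that the number of clusters is an input here. On real data that number is usually not known in advance, so the value 3 is an assumption that only holds because the data was generated that way.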

What are Gaussian Mixture Models?

Before we discuss Gaussian Mixture Models (GMMs), let's first understand what a Mixture Model is. A Mixture Model is a type of probability density model where it is assumed that the data is governed by several component distributions. If these distributions are Gaussian, then the model becomes a Gaussian Mixture Model. These component distributions are combined in order to provide a multi-modal density function, which becomes a mixture model.
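To make the definition concrete, here is a minimal sketch, assuming scikit-learn's `GaussianMixture` class and synthetic one-dimensional data drawn from two Gaussian components (the data and all variable names are illustrative assumptions, not taken from the book). The fitted model describes the data as a weighted sum of Gaussian densities, and the weights sum to one:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical 1D data drawn from two Gaussian components
rng = np.random.default_rng(0)
X = np.concatenate([
    rng.normal(loc=-5.0, scale=1.0, size=300),
    rng.normal(loc=5.0, scale=1.0, size=300),
]).reshape(-1, 1)

# Fit a mixture of two Gaussians; n_components is an assumption we supply
gmm = GaussianMixture(n_components=2, random_state=0)
gmm.fit(X)

print("Mixing weights:", gmm.weights_)
print("Component means:", gmm.means_.ravel())

# Soft assignment: probability that each query point came from each component
queries = np.array([[-5.0], [0.0], [5.0]])
probs = gmm.predict_proba(queries)
print("Membership probabilities:\n", probs)
```

Unlike a hard clustering, the model returns a probability for each component, so a point lying between the two modes can be shared between them.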

Let's look at an example to understand how Mixture Models work. We want to model the shopping habits of all the people in South America. One way to do it would be to model the whole continent and fit everything into a single model, but people in different countries shop differently. We therefore need to understand how people in individual countries shop and how they behave.

To get a good representative model, we need to account for all the variations within the continent. In this case, we can use...

Finding subgroups in the stock market using the Affinity Propagation model

Affinity Propagation is a clustering algorithm that doesn't require the number of clusters to be specified beforehand. Because of its generic nature and simplicity of implementation, it has found applications in many fields. It finds representative data points, called exemplars, using a technique called message passing. It starts by specifying the measures of similarity that need to be considered. It simultaneously considers all training data points as potential exemplars. It then passes messages between the data points until it finds a set of exemplars.

The message passing happens in two alternating steps, called responsibility and availability. Responsibility refers to the message sent from members of the cluster to candidate exemplars, indicating how well suited the data point would be as a member of this exemplar's cluster. Availability refers to the message sent from candidate exemplars...
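A minimal sketch of this behavior, assuming scikit-learn's `AffinityPropagation` class with its default settings and a synthetic dataset (the data and variable names are illustrative and are not the stock market data used in the book), could look like this. No cluster count is passed in; the exemplars the model settles on determine it:

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

# Synthetic 2D data: three blobs (hypothetical, for illustration only)
rng = np.random.default_rng(0)
centers_true = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers_true])

# No number of clusters is given; exemplars emerge from message passing
model = AffinityPropagation(random_state=0)
labels = model.fit_predict(X)

# Indices of the data points chosen as exemplars, and their coordinates
exemplar_indices = model.cluster_centers_indices_
print("Number of exemplars found:", len(exemplar_indices))
print("Exemplar coordinates:\n", model.cluster_centers_)
```

How many exemplars emerge depends on the similarity values and on the `preference` parameter, so the result on other data may differ from the number of groups you expect.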

Segmenting the market based on shopping patterns

Let's see how to apply unsupervised learning techniques to segment the market based on customer shopping habits. You have been provided with a file named sales.csv. This file contains the sales details of a variety of tops from several retail clothing stores. The goal is to identify the patterns and segment the market based on the number of units sold in those stores.

Create a new Python file and import the following packages:

import csv

import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import MeanShift, estimate_bandwidth

Load the data from the input file. Since it's a CSV file, we can use the csv reader in Python to read the data from this file and convert it into a NumPy array:

# Load data from input file
input_file = 'sales.csv'
file_reader = csv.reader(open(input_file, 'r'), delimiter=',')
X = []
for count, row in enumerate...
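Independently of the listing above, the following is a minimal sketch of how the two imported scikit-learn helpers, `estimate_bandwidth` and `MeanShift`, are commonly combined. It runs on synthetic data rather than on sales.csv, and the `quantile` value, the data, and the variable names are assumptions chosen for illustration, not the book's settings:

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

# Synthetic 2D data standing in for per-store sales figures (hypothetical)
rng = np.random.default_rng(0)
centers_true = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers_true])

# Estimate the kernel bandwidth from the data; quantile=0.2 is an assumed value
bandwidth = estimate_bandwidth(X, quantile=0.2, n_samples=len(X))

# Mean Shift does not need the number of clusters as input
model = MeanShift(bandwidth=bandwidth, bin_seeding=True)
model.fit(X)

labels = model.labels_
cluster_centers = model.cluster_centers_
num_clusters = len(np.unique(labels))

print("Estimated bandwidth:", bandwidth)
print("Number of clusters found:", num_clusters)
print("Cluster centers:\n", cluster_centers)
```

The bandwidth controls how coarse the segmentation is: a larger value tends to merge nearby groups, while a smaller one tends to split them.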

Summary

In this chapter, we started by discussing unsupervised learning and its applications. We then learned about clustering and how to cluster data using the K-Means algorithm. We discussed how to estimate the number of clusters with the Mean Shift algorithm. We talked about silhouette scores and how to estimate the quality of clustering. We learned about Gaussian Mixture Models and how to build a classifier based on them. We also discussed the Affinity Propagation model and used it to find subgroups within the stock market. We then applied the Mean Shift algorithm to segment the market based on shopping patterns.

In the next chapter, we will learn how to build a recommendation engine.


Author (1)

Prateek Joshi

Prateek Joshi is the founder of Plutoshift and a published author of 9 books on Artificial Intelligence. He has been featured on Forbes 30 Under 30, NBC, Bloomberg, CNBC, TechCrunch, and The Business Journals. He has been an invited speaker at conferences such as TEDx, Global Big Data Conference, Machine Learning Developers Conference, and Silicon Valley Deep Learning. Apart from Artificial Intelligence, some of the topics that excite him are number theory, cryptography, and quantum computing. His greater goal is to make Artificial Intelligence accessible to everyone so that it can impact billions of people around the world.