You're reading from  Data Science for Marketing Analytics - Second Edition

Product type: Book
Published in: Sep 2021
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781800560475
Edition: 2nd Edition
Authors (3):
Mirza Rahim Baig

Mirza Rahim Baig is a Data Science and Artificial Intelligence leader with over 13 years of experience across e-commerce, healthcare, and marketing. He currently leads Product Analytics at Marketing Services for Zalando, Europe's largest online fashion platform. In addition, he serves as a Subject Matter Expert and faculty member for MS level programs at prominent Ed-Tech platforms and institutes in India. He is also the lead author of two books, 'Data Science for Marketing Analytics' and 'The Deep Learning Workshop,' both published by Packt. He is recognized as a thought leader in his field and frequently participates as a guest speaker at various forums.

Gururajan Govindan

Gururajan Govindan is a data scientist, intrapreneur, and trainer with more than seven years of experience working across domains such as finance and insurance. He is also an author of The Data Analysis Workshop, a book focusing on data analytics. He is well known for his expertise in data-driven decision-making and machine learning with Python.

Vishwesh Ravi Shrimali

Vishwesh Ravi Shrimali graduated from BITS Pilani, where he studied mechanical engineering, in 2018. He also completed his master's in Machine Learning and AI at LJMU in 2021. He has authored Machine learning for OpenCV (2nd edition), Computer Vision Workshop, and Data Science for Marketing Analytics (2nd edition), all published by Packt. When he is not writing blogs or working on projects, he likes to go on long walks or play his acoustic guitar.

3. Unsupervised Learning and Customer Segmentation

Overview

In this chapter, you will implement one of the most powerful techniques in marketing – customer segmentation. You will begin by understanding the need for customer segmentation, following which you will study and implement the machine learning approach to segmentation. You will use the k-means clustering algorithm to segment customers and then analyze the obtained segments to gain an understanding of the results so that businesses can act on them.

By the end of this chapter, you will be able to perform segmentation using relevant tools and techniques and analyze the results of the segmented data. You will also be comfortable with using the k-means algorithm, a machine learning approach to segmentation.

Introduction

Put yourself in the shoes of the marketing head of an e-commerce company with a base of 1 million transacting customers. You want to make the marketing campaigns more effective, reaching the right customer with the right messaging. You know that by understanding the customer and their needs better, marketing campaigns could provide a significant boost to the business. As you begin solving this problem, you think about the customer experience. An average customer receives several communications from your platform about the latest offers and programs. These are relayed via email, push notifications, and social media campaigns. This may not be a great experience for them, especially if these communications are generic, mass campaigns. If the company understood its customers' needs better and sent them relevant content, those customers would shop much more frequently.

Several examples like this show that a deep understanding of customers and their needs is beneficial not only...

Segmentation

Segmentation, simply put, means grouping similar entities together. The entities within each group are similar to each other; in other words, the groups are homogeneous, meaning their members have similar properties. Before going further, we need to understand two key aspects here – entities and properties.

What entities can be segmented? You can segment customers, products, offers, vehicles, fruits, animals, countries, or even stars. If you can express, through data, the properties of the entity, you can compare that entity with other entities and segment it. In this chapter, we will focus on customer segmentation – that is, grouping and segmenting present/potential customers, an exercise that has tremendous utility in marketing.

Coming to the second key aspect, what properties are we talking about? We are talking about properties relevant to the grouping exercise. Say you are trying to group customers based on their purchase frequency of a product...

Approaches to Segmentation

Every marketing group does, in effect, some amount of customer segmentation. However, the methods they use to do this might not always be clear. These may be based on intuitions and hunches about certain demographic groups, or they might be the output of some marketing software, where the methods used are obscure. Every method has advantages and disadvantages, and understanding them allows you to choose the right tool for the job. In the following sections, we will discuss some of the most commonly used approaches for customer segmentation, along with considerations when using such approaches.

Traditional Segmentation Methods

A preferred method for marketing analysts consists of coming up with rough groupings based on intuitions and arbitrary thresholds. For this, they leverage whatever data about customers they have at their disposal – typically demographic or behavioral. An example of this would be deciding to segment customers...
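As a rough illustration of threshold-based grouping, the sketch below assigns customers to segments using hand-picked cutoffs. The customer records, the field choices (age and annual spend), the band names, and the cutoff values are all hypothetical and chosen only for this example; they are not taken from the chapter's dataset.

```python
# Hypothetical customer records: (customer_id, age, annual_spend).
customers = [
    ("C001", 23, 150.0),
    ("C002", 41, 900.0),
    ("C003", 67, 420.0),
    ("C004", 35, 1300.0),
]


def age_band(age):
    """Map an age to a hand-picked band; the cutoffs are arbitrary."""
    if age < 30:
        return "under_30"
    if age < 60:
        return "30_to_59"
    return "60_plus"


def spend_band(annual_spend, threshold=500.0):
    """Split customers into two spend groups at an arbitrary threshold."""
    return "high_spend" if annual_spend >= threshold else "low_spend"


# Each customer's segment is the combination of the two bands.
segments = {
    customer_id: (age_band(age), spend_band(spend))
    for customer_id, age, spend in customers
}

for customer_id, segment in segments.items():
    print(customer_id, segment)
```

Note that moving either cutoff changes which segment a customer falls into, which is exactly the arbitrariness that this kind of approach carries.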

Choosing Relevant Attributes (Segmentation Criteria)

To use clustering for customer segmentation (to group customers with other customers who have similar traits), you first have to decide what similar means, or in other words, you need to be precise when defining what kinds of customers are similar. Choosing the properties that go into the segmentation process is an extremely important decision as it defines how the entities are represented and directs the nature of the groups formed.

Let's say we wish to segment customers solely by their purchase frequency and transaction value. In such a situation, attributes such as age, gender, or other demographic data would not be relevant. On the other hand, if the intent is to segment customers purely on a demographic basis, their purchase frequency and transaction value would not be relevant.

A good criterion for segmentation could be customer engagement, involving features such as time spent...

K-Means Clustering

K-means clustering is a very common unsupervised learning technique with a wide range of applications. It is powerful because it is conceptually relatively simple, scales to very large datasets, and tends to work well in practice. In this section, you will learn the conceptual foundations of k-means clustering, how to apply k-means clustering to data, and how to deal with high-dimensional data (that is, data with many different variables) in the context of clustering.

K-means clustering is an algorithm that tries to find the best way of grouping data points into k different groups, where k is a parameter given to the algorithm. For now, we will choose k arbitrarily. We will revisit how to choose k in practice in the next chapter. The algorithm then works iteratively to try to find the best grouping. There are two steps to this algorithm:

  1. The algorithm begins by randomly selecting k points in space to be the centroids of the clusters. Each data point is then...
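The iterative procedure can be sketched in plain Python as follows. This is a minimal sketch of the standard k-means formulation (assign each point to its nearest centroid, then move each centroid to the mean of its points, and repeat until the assignments stop changing), and it may differ in detail from the chapter's own description. Two assumptions are made for illustration: the initial centroids are drawn from the data points themselves, and the function name `kmeans`, its parameters, and the sample points are hypothetical.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster a list of equal-length numeric tuples into k groups."""
    rng = random.Random(seed)
    # Assumption: pick k distinct data points as the starting centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # No point changed cluster, so the grouping is stable.
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # Leave a centroid in place if it has no members.
                centroids[j] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return centroids, labels


# Hypothetical two-dimensional data with two visibly separate groups.
points = [
    (1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
    (8.0, 8.0), (9.0, 8.5), (8.5, 9.0),
]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

In practice, a library implementation such as scikit-learn's `KMeans` would normally be preferred over hand-written code; the sketch is only meant to make the two alternating steps concrete.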

Summary

In this chapter, we explored the idea of segmentation and its utility for business. We discussed the key considerations in segmentation, namely, criteria/features and the interpretation of the segments. We first discussed and implemented a traditional approach to customer segmentation. Noting its drawbacks, we then explored and performed unsupervised machine learning for customer segmentation. We established how to think about the similarity in the customer data feature space, and also learned the importance of standardizing data if it is on very different scales. Finally, we learned about k-means clustering – a commonly used, fast, and easily scalable clustering algorithm. We employed these concepts and techniques to help a mall understand its customers better using segmentation. We also helped a bank identify customer segments and how they have responded to previous marketing campaigns.
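As a reminder of why standardization matters, the sketch below rescales each feature to zero mean and unit standard deviation (a z-score). The feature choice (age and income), the sample values, and the function name `standardize` are hypothetical, and the code assumes every column has some variation so that its standard deviation is not zero.

```python
import math
import statistics


def standardize(rows):
    """Rescale each column to zero mean and unit (population) std dev."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stds))
        for row in rows
    ]


# Hypothetical customers described by (age in years, income in dollars).
rows = [(25, 30000.0), (60, 30500.0), (27, 60000.0)]
scaled = standardize(rows)

# Before scaling, the income column dominates the Euclidean distance
# because its values are thousands of times larger than the ages.
print(math.dist(rows[0], rows[1]), math.dist(rows[0], rows[2]))
# After scaling, both features contribute on a comparable scale.
print(math.dist(scaled[0], scaled[1]), math.dist(scaled[0], scaled[2]))
```

scikit-learn's `StandardScaler` applies the same kind of rescaling and is the more usual choice in a real workflow.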

In this chapter, we used predefined values for the number of groups we asked...

