You're reading from  Data Science for Marketing Analytics - Second Edition

Product type: Book
Published in: Sep 2021
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781800560475
Edition: 2nd Edition
Authors (3):
Mirza Rahim Baig

Mirza Rahim Baig is a Data Science and Artificial Intelligence leader with over 13 years of experience across e-commerce, healthcare, and marketing. He currently leads Product Analytics at Marketing Services for Zalando, Europe's largest online fashion platform. In addition, he serves as a Subject Matter Expert and faculty member for MS-level programs at prominent Ed-Tech platforms and institutes in India. He is also the lead author of two books, 'Data Science for Marketing Analytics' and 'The Deep Learning Workshop,' both published by Packt. He is recognized as a thought leader in his field and frequently participates as a guest speaker at various forums.

Gururajan Govindan

Gururajan Govindan is a data scientist, intrapreneur, and trainer with more than seven years of experience working across domains such as finance and insurance. He is also an author of The Data Analysis Workshop, a book focusing on data analytics. He is well known for his expertise in data-driven decision-making and machine learning with Python.

Vishwesh Ravi Shrimali

Vishwesh Ravi Shrimali graduated from BITS Pilani in 2018, where he studied mechanical engineering. He also completed his Masters in Machine Learning and AI from LJMU in 2021. He has authored Machine Learning for OpenCV (2nd edition), Computer Vision Workshop, and Data Science for Marketing Analytics (2nd edition), all published by Packt. When he is not writing blogs or working on projects, he likes to go on long walks or play his acoustic guitar.

5. Predicting Customer Revenue Using Linear Regression

Overview

In this chapter, you will learn how to solve business problems that require the prediction of quantities. You will learn about regression, a supervised learning approach to predict continuous outcomes. You will explore linear regression, a simple yet powerful technique that is the workhorse for predictive modeling in the industry. Then you will learn how to implement some key steps in the modeling process – feature engineering and data cleaning. Later, you will implement your linear regression models and finally interpret the results to derive business insights.

Introduction

Azra, a large, high-end, fast-fashion retailer that has operations all over the world, has approved its marketing budget for the latest campaign in a particular country. The marketing team is now looking to allocate the budget to each marketing channel, but they have many questions:

  • How much should they spend on email? Read rates are low, but the quality of conversions is high.
  • How about social media? It seems to be an effective channel in general.
  • Should they do any offline promotions? If so, to what extent?
  • How about paid search as a channel?

The company understands that each channel provides a different return on investment (ROI) – that is, some channels are more effective than others. Nonetheless, all channels should be considered. Naturally, different allocations of the budget across these channels would produce different results. Incredibly low spending on a channel with great ROI is missed potential and high spending on an extremely...

Regression Problems

The prediction of quantities is a recurring task in marketing. Predicting the units sold for a brand based on the spend on the visibility (impressions) allocated to the brand's products is an example. Another example could be predicting sales based on the advertising spend on television campaigns. Predicting the lifetime value of a customer (the total revenue a customer brings over a defined period) based on a customer's attributes is another common requirement. All these situations can be formulated as regression problems.

Note

A common misconception is that regression is a specific algorithm/technique. Regression is a much broader term that refers to a class of problems. Many equate regression to linear regression, which is only one of the many techniques that can be employed to solve a regression problem.

Regression refers to a class of problems where the value to predict is a quantity. There are various techniques available for regression...

Feature Engineering for Regression

Raw data is a term that refers to the data as you obtain it from the source, without any manipulation from your side. Rarely can a raw dataset be employed directly for a modeling activity. Often, you perform multiple manipulations on the data, and the act of doing so is termed feature engineering. In simple terms, feature engineering is the process of taking data and transforming it into features for use in predictions. There can be multiple motivations for feature engineering:

  • Creating features that capture aspects of what is important to the outcome of interest (for example, creating an average order value, which could be more useful for predicting revenue from a customer, instead of using the number of orders and total revenue)
  • Using your domain understanding (for example, flagging certain high-value indicators for predicting revenue from a customer)
  • Aggregating variables to the required level (for example, creating customer...
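
As a minimal sketch of the motivations listed above, the snippet below aggregates transaction-level rows to the customer level, derives an average order value, and flags high-value customers. The sample data, the function name build_customer_features, and the 150.0 threshold are illustrative assumptions and do not come from the chapter's dataset. It uses only the Python standard library to stay self-contained; in practice a DataFrame library such as pandas is the more common choice.

```python
from collections import defaultdict

# Hypothetical transaction-level raw data: (customer_id, order_id, order_revenue).
transactions = [
    ("C1", "O1", 120.0),
    ("C1", "O2", 80.0),
    ("C2", "O3", 40.0),
    ("C2", "O4", 60.0),
    ("C2", "O5", 50.0),
    ("C3", "O6", 300.0),
]


def build_customer_features(rows, high_value_threshold=150.0):
    """Aggregate order rows to one feature record per customer."""
    total_revenue = defaultdict(float)
    num_orders = defaultdict(int)

    # Aggregate to the required level: one row per customer.
    for customer_id, _order_id, revenue in rows:
        total_revenue[customer_id] += revenue
        num_orders[customer_id] += 1

    features = {}
    for customer_id in total_revenue:
        # Derived feature: average order value combines two raw aggregates.
        avg_order_value = total_revenue[customer_id] / num_orders[customer_id]
        features[customer_id] = {
            "num_orders": num_orders[customer_id],
            "total_revenue": total_revenue[customer_id],
            "avg_order_value": avg_order_value,
            # Domain-driven flag; the threshold is an assumed business rule.
            "is_high_value": avg_order_value >= high_value_threshold,
        }
    return features


customer_features = build_customer_features(transactions)
for customer_id, feats in sorted(customer_features.items()):
    print(customer_id, feats)
```

Each resulting record could then serve as one row of model input, with the derived columns sitting alongside or replacing the raw counts and totals.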

Performing and Interpreting Linear Regression

In Exercise 5.01, Predicting Sales from Advertising Spend Using Linear Regression, we implemented a linear regression model and saw its output without discussing the inner workings. Let us now understand the technique better. Linear regression is a type of regression model that predicts the outcome using linear relationships between the predictors and the outcome. A linear regression model can be thought of as a line running through the feature space, chosen to minimize the overall distance between the line and the data points (in the standard least-squares formulation, the sum of squared vertical distances).

The model that a linear regression learns is the equation of this line. It is an equation that expresses the dependent variable as a linear function of the independent variables. This is best visualized when there is a single predictor (see Figure 5.28). In such a case, you can draw a line that best fits the data on a scatter plot between the two variables.

Figure 5.28: A visualization of a linear regression...
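
To make the single-predictor case concrete, here is a small sketch that fits a line by ordinary least squares using the closed-form formulas for the slope and intercept. The ad_spend and sales values are made-up illustrative numbers, and the function names are hypothetical. The chapter itself builds its models with scikit-learn, so this is only a way to see what such a fit computes, not the book's implementation.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, with at least 2 points")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Spread of x around its mean, and co-movement of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared vertical distances between the points and the line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data: advertising spend (predictor) and sales (outcome).
ad_spend = [10.0, 20.0, 30.0, 40.0, 50.0]
sales = [27.0, 43.0, 66.0, 84.0, 105.0]

intercept, slope = fit_simple_linear_regression(ad_spend, sales)
best_error = sum_squared_residuals(ad_spend, sales, intercept, slope)

# Any other line, such as one with a slightly different slope, fits no better.
other_error = sum_squared_residuals(ad_spend, sales, intercept, slope + 0.1)

print(f"sales = {intercept:.2f} + {slope:.2f} * ad_spend")
print(f"squared error: fitted line {best_error:.2f}, perturbed line {other_error:.2f}")
```

The slope is read as the expected change in the outcome for a one-unit increase in the predictor, and the intercept as the predicted outcome when the predictor is zero.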

Summary

In this chapter, you explored a new approach to machine learning, that is, supervised machine learning, and saw how it can help a business make predictions. These predictions come from models that the algorithm learns. The models are essentially mathematical expressions of the relationship between the predictor variables and the target. You learned about linear regression – a simple, interpretable, and therefore powerful tool for businesses to predict quantities. You saw that feature engineering and data cleanup play an important role in the process of predictive modeling and then built and interpreted your linear regression models using scikit-learn. In this chapter, you also used some rudimentary approaches to evaluate the performance of the model. Linear regression is an extremely useful and interpretable technique, but it has its drawbacks.

In the next chapter, you will expand your repertoire to include more approaches to predicting quantities and will explore...
