Data Science for Marketing Analytics - Second Edition


Product type Book
Published in Sep 2021
Publisher Packt
ISBN-13 9781800560475
Pages 636 pages
Edition 2nd Edition
Authors (3): Mirza Rahim Baig, Gururajan Govindan, Vishwesh Ravi Shrimali

Table of Contents (11 chapters)

Preface
1. Data Preparation and Cleaning
2. Data Exploration and Visualization
3. Unsupervised Learning and Customer Segmentation
4. Evaluating and Choosing the Best Segmentation Approach
5. Predicting Customer Revenue Using Linear Regression
6. More Tools and Techniques for Evaluating Regression Models
7. Supervised Learning: Predicting Customer Churn
8. Fine-Tuning Classification Algorithms
9. Multiclass Classification Algorithms
Appendix

1. Data Preparation and Cleaning

Activity 1.01: Addressing Data Spilling

Solution:

  1. Import the pandas and copy libraries using the following commands:

    import pandas as pd
    import copy

  2. Create a new DataFrame, sales, and use the read_csv function to read the sales.csv file into it:

    sales = pd.read_csv("sales.csv")

    Note

    Make sure you change the path (emboldened) to the CSV file based on its location on your system. If you're running the Jupyter notebook from the same directory where the CSV file is stored, you can run the preceding code without any modification.

  3. Now, examine whether your data is properly loaded by checking the first five rows in the DataFrame. Do this using the head() command:

    sales.head()

    You should get the following output:

    Figure 1.60: First five rows of the DataFrame

  4. Look at the data types of sales using the following command:

    sales.dtypes

    You should get the following output:

    Figure 1.61: Looking at the data type of columns of sales.csv

    You can...
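
The load-and-inspect steps above can be tried without the real file. The following is a minimal sketch that reads a small in-memory CSV instead of sales.csv; the column names and values are made up for illustration and are not taken from the actual dataset.

```python
import io
import pandas as pd

# Hypothetical stand-in for sales.csv; the real file's columns are not shown here.
csv_text = """Year,Product,Revenue
2004,Camping Equipment,5819.7
2004,Golf Equipment,4567.1
2005,Camping Equipment,
"""

sales_demo = pd.read_csv(io.StringIO(csv_text))

# Same checks as in the steps above: first rows, then column data types.
print(sales_demo.head())
print(sales_demo.dtypes)

# Collect the columns pandas parsed as numeric. A column you expect to be
# numeric but that is missing from this list deserves a closer look.
numeric_cols = sales_demo.select_dtypes(include="number").columns.tolist()
print(numeric_cols)
```

Comparing the parsed data types against what you expect for each column is a quick way to spot values that ended up in the wrong place during loading.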

2. Data Exploration and Visualization

Activity 2.01: Analyzing Advertisements

Solution:

Perform the following steps to complete this activity:

  1. Import pandas and seaborn using the following code:

    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt
    sns.set()

  2. Load the Advertising.csv file into a DataFrame called ads, using the Date column as the index, and check that the data loaded properly by viewing the first few rows with the head() command:

    ads = pd.read_csv("Advertising.csv", index_col = 'Date')
    ads.head()

    The output should be as follows:

    Figure 2.65: First five rows of the DataFrame ads

  3. Look at the memory usage and other internal information about the DataFrame using the following command:

    ads.info()

    This gives the following output:

    Figure 2.66: The result of ads.info()

    From the preceding figure, you can see that you have five columns with 200 data points in each and no missing values.

  4. Use the describe() function to view basic statistical details...
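
As a rough illustration of steps 2 to 4, the sketch below loads a tiny in-memory CSV with a Date index and then calls info() and describe(). The column names other than Date, and all values, are assumptions made for this example; the real Advertising.csv may differ.

```python
import io
import pandas as pd

# Hypothetical three-row stand-in for Advertising.csv.
csv_text = """Date,TV,Radio,Newspaper,Sales
2020-01-01,230.1,37.8,69.2,22.1
2020-01-02,44.5,39.3,45.1,10.4
2020-01-03,17.2,45.9,69.3,9.3
"""

# index_col="Date" turns the Date column into the row index.
ads_demo = pd.read_csv(io.StringIO(csv_text), index_col="Date")

# info() is a method, so it needs parentheses to print its summary
# (row count, non-null counts, dtypes, memory usage).
ads_demo.info()

# describe() returns count, mean, std, min, quartiles, and max per numeric column.
summary = ads_demo.describe()
print(summary)
```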

3. Unsupervised Learning and Customer Segmentation

Activity 3.01: Bank Customer Segmentation for Loan Campaign

Solution:

  1. Import the necessary libraries for data processing, visualization, and clustering using the following code:

    import numpy as np, pandas as pd
    import matplotlib.pyplot as plt, seaborn as sns
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

  2. Load the data into a pandas DataFrame and display the top five rows:

    bank0 = pd.read_csv("Bank_Personal_Loan_Modelling-1.csv")
    bank0.head()

    Note

    Make sure you change the path (highlighted) to the CSV file based on its location on your system. If you're running the Jupyter notebook from the same directory where the CSV file is stored, you can run the preceding code without any modification.

    The first five rows get displayed as follows:

    Figure 3.31: First five rows of the dataset

    You can see that you have data about customer demographics such as Age, Experience, Family, and Education...
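
To show how the StandardScaler and KMeans imports from step 1 typically fit together, here is a minimal sketch on synthetic data. The two columns, their values, and the choice of two clusters are assumptions for illustration only; they are not the features or cluster count used in the activity.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic customers: two well-separated groups (made-up values).
rng = np.random.default_rng(42)
demo = pd.DataFrame({
    "Age": np.concatenate([rng.normal(30, 3, 50), rng.normal(60, 3, 50)]),
    "Income": np.concatenate([rng.normal(40, 5, 50), rng.normal(120, 10, 50)]),
})

# Standardize so that each feature has mean 0 and unit variance;
# otherwise the feature with the larger scale dominates the distances.
scaled = StandardScaler().fit_transform(demo)

# Fit k-means on the scaled features and attach the labels to the DataFrame.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
demo["Cluster"] = kmeans.fit_predict(scaled)

# Per-cluster averages in the original units help interpret the segments.
print(demo.groupby("Cluster").mean())
```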

4. Evaluating and Choosing the Best Segmentation Approach

Activity 4.01: Optimizing a Luxury Clothing Brand's Marketing Campaign Using Clustering

Solution:

  1. Import the libraries required for DataFrame handling and plotting (pandas, numpy, matplotlib, and seaborn). Read the data from the Clothing_Customers.csv file into a DataFrame and print the top five rows to understand it better:

    import numpy as np, pandas as pd
    import matplotlib.pyplot as plt, seaborn as sns
    data0 = pd.read_csv('Clothing_Customers.csv')
    data0.head()

    Note

    Make sure you place the CSV file in the same directory from where you are running the Jupyter Notebook. If not, make sure you change the path (emboldened) to match the one where you have stored the file.

    The result should be the table below:

    Figure 4.24: Top 5 records of the data

    The data contains the customers' income, age, days since their last purchase, and their annual spending. All these will be used to perform segmentation.

  2. Standardize...
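
The standardization step is cut off above, so the following is only a sketch of what standardizing such columns involves, written directly in pandas. The column names and values are hypothetical stand-ins for income, age, days since last purchase, and annual spending.

```python
import pandas as pd

# Made-up customers; column names are assumptions, not the file's real headers.
demo = pd.DataFrame({
    "income": [30000, 52000, 110000, 75000],
    "age": [22, 35, 58, 41],
    "days_since_purchase": [10, 200, 45, 90],
    "annual_spend": [500, 1200, 8000, 3000],
})

# z-score each column: subtract the column mean, divide by the column
# standard deviation. ddof=0 gives the population standard deviation,
# which is what scikit-learn's StandardScaler uses.
standardized = (demo - demo.mean()) / demo.std(ddof=0)

print(standardized.round(2))
```

After this transformation every column has mean 0 and standard deviation 1, so no single feature dominates a distance-based clustering algorithm because of its units.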

5. Predicting Customer Revenue Using Linear Regression

Activity 5.01: Examining the Relationship between Store Location and Revenue

Solution:

  1. Import pandas, pyplot from matplotlib, and seaborn. Read the data into a DataFrame called df and print the top five records using the following code:

    import pandas as pd
    import matplotlib.pyplot as plt, seaborn as sns
    df = pd.read_csv('location_rev.csv')
    df.head()

    Note

    Make sure you change the path (highlighted) to the CSV file based on its location on your system. If you're running the Jupyter notebook from the same directory where the CSV file is stored, you can run the preceding code without any modification.

    The data should appear as follows:

    Figure 5.35: The first five rows of the location revenue data

    You see that, as described earlier, you have the revenue of the store, its age, along with various fields about the location of the store. From the top five records, you get a sense of the order of the values...
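
One simple way to start examining how location fields relate to revenue is to compute their correlations with the revenue column. The sketch below assumes this approach and uses made-up column names and values; it is not necessarily the method the activity goes on to use.

```python
import pandas as pd

# Hypothetical stand-in for location_rev.csv.
demo = pd.DataFrame({
    "revenue": [30000, 45000, 52000, 61000, 80000],
    "num_competitors": [5, 4, 3, 3, 1],
    "median_income": [28000, 35000, 41000, 46000, 60000],
    "location_age": [2, 10, 5, 8, 12],
})

# Pearson correlation of every other column with revenue, sorted from
# most negative to most positive.
corr_with_revenue = demo.corr()["revenue"].drop("revenue").sort_values()

print(corr_with_revenue)
```

Correlation only captures linear association, so a scatter plot of each field against revenue is a useful companion check.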

6. More Tools and Techniques for Evaluating Regression Models

Activity 6.01: Finding Important Variables for Predicting Responses to a Marketing Offer

Solution:

Perform the following steps to achieve the aim of this activity:

  1. Import pandas, read in the data from offer_responses.csv, and use the head function to view the first five rows of the data:

    import pandas as pd
    df = pd.read_csv('offer_responses.csv')
    df.head()

    Note

    Make sure you change the path (emboldened) to the CSV file based on its location on your system. If you're running the Jupyter notebook from the same directory where the CSV file is stored, you can run the preceding code without any modifications.

    You should get the following output:

    Figure 6.22: The first five rows of the offer_responses data

  2. Extract the target variable (y) and the predictor variables (X) from the data:

    X = df[['offer_quality',\

            'offer_discount',\

      &...
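
Because the code above is cut off, here is a minimal self-contained sketch of separating predictors from a target. The offer_quality and offer_discount names come from the snippet above; the responses column name and all values are assumptions for illustration.

```python
import pandas as pd

# Made-up rows; "responses" is a hypothetical name for the target column.
demo = pd.DataFrame({
    "offer_quality": [0.2, 0.8, 0.5, 0.9],
    "offer_discount": [5, 20, 10, 25],
    "responses": [3, 40, 12, 55],
})

predictor_cols = ["offer_quality", "offer_discount"]

# Indexing with a list of names returns a 2-D DataFrame of predictors.
X = demo[predictor_cols]

# Indexing with a single name returns a 1-D Series for the target.
y = demo["responses"]

print(X.shape, y.shape)
```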

7. Supervised Learning: Predicting Customer Churn

Activity 7.01: Performing the OSE technique from OSEMN

Solution:

  1. Import the necessary libraries:

    # Suppress warnings
    import warnings
    warnings.filterwarnings('ignore')
    # Import the necessary packages
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns

  2. Download the dataset from https://packt.link/80blQ and save it as Telco_Churn_Data.csv. Make sure to run the notebook from the same folder as the dataset.

  3. Create a DataFrame called data and read the dataset into it using the pandas read_csv function. Look at the first few rows of the DataFrame:

    data = pd.read_csv(r'Telco_Churn_Data.csv')
    data.head(5)

    Note

    Make sure you change the path (emboldened in the preceding code snippet) to the CSV file based on its location on your system. If you're running the Jupyter notebook from the same directory where the CSV file is stored, you can run the preceding code without any modification.

    The...
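
In OSEMN, the O and S stand for Obtain and Scrub. As a small sketch of what a first scrub check might look like, the example below loads a tiny in-memory CSV and counts missing values per column. The column names and values are hypothetical and do not reflect the real Telco_Churn_Data.csv.

```python
import io
import pandas as pd

# Hypothetical three-row stand-in for the churn dataset.
csv_text = """Customer ID,Monthly Charges,Churn
1001,70.5,0
1002,,1
1003,99.9,0
"""

# Obtain: read the data into a DataFrame.
data_demo = pd.read_csv(io.StringIO(csv_text))

# Scrub (first check): count missing values in each column.
missing_per_column = data_demo.isnull().sum()

print(data_demo.shape)
print(missing_per_column)
```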

8. Fine-Tuning Classification Algorithms

Activity 8.01: Implementing Different Classification Algorithms

Solution:

  1. Import the LogisticRegression class from scikit-learn:

    from sklearn.linear_model import LogisticRegression

  2. Fit the model:

    clf_logistic = LogisticRegression(random_state=0,solver='lbfgs')\
                   .fit(X_train[top7_features], y_train)
    clf_logistic

    The preceding code will give the following output:

    LogisticRegression(random_state=0)

  3. Score the model:

    clf_logistic.score(X_test[top7_features], y_test)

    You will get the following output: 0.7454031117397454.

    This shows that the logistic regression model achieves an accuracy of about 74.5%. That is mediocre, but it serves as a useful baseline for the minimum accuracy you can expect.

  4. Import the svm module:

    from sklearn import svm

  5. Scale the training and testing data as follows:

    from sklearn.preprocessing import MinMaxScaler

    scaling = MinMaxScaler...
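
The steps above depend on X_train, y_train, and top7_features from earlier in the chapter. The following self-contained sketch reproduces the same pattern (baseline logistic regression, then min-max scaling) on synthetic data. The dataset, the split, and the choice to fit the scaler on the training set only are assumptions for this example; the truncated code above does not show how the book fits its scaler.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic binary classification data with seven features (illustrative only).
X, y = make_classification(n_samples=500, n_features=7, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Baseline: logistic regression scored by accuracy on the held-out set.
baseline = LogisticRegression(random_state=0, solver="lbfgs")
baseline.fit(X_train, y_train)
baseline_accuracy = baseline.score(X_test, y_test)
print(round(baseline_accuracy, 3))

# Min-max scaling: learn each feature's min and max from the training data,
# then apply the same transformation to both sets.
scaler = MinMaxScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler on the training data alone keeps information from the test set out of the preprocessing step; as a consequence, scaled test values can fall slightly outside the 0 to 1 range.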

9. Multiclass Classification Algorithms

Activity 9.01: Performing Multiclass Classification and Evaluating Performance

Solution:

  1. Import the required libraries:

    import pandas as pd
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import (classification_report,
                                 confusion_matrix,
                                 accuracy_score)
    from sklearn import metrics
    from sklearn.metrics import precision_recall_fscore_support
    import matplotlib.pyplot as plt
    import seaborn as sns

  2. Load the marketing data into a DataFrame named data and look at the first five rows of the DataFrame using the following code:

    data...
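
Since the remaining steps are not shown, here is a rough sketch of how the imports from step 1 are commonly combined for a multiclass problem. It uses synthetic three-class data rather than the marketing dataset, and the model settings are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix)
from sklearn.model_selection import train_test_split

# Synthetic data with three classes (illustrative only).
X, y = make_classification(n_samples=600, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)

# Stratify so that each class keeps its proportion in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows of the confusion matrix are true classes, columns are predictions.
cm = confusion_matrix(y_test, y_pred)
acc = accuracy_score(y_test, y_pred)

print(cm)
print(classification_report(y_test, y_pred))
```

The diagonal of the confusion matrix counts correct predictions per class, so overall accuracy equals the diagonal sum divided by the total number of test samples.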
