
You're reading from The Machine Learning Workshop - Second Edition

Product type: Book
Published in: Jul 2020
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781839219061
Edition: 2nd Edition
Author (1)
Hyatt Saleh

Hyatt Saleh discovered the importance of data analysis for understanding and solving real-life problems after graduating from college as a business administrator. Since then, as a self-taught person, she not only works as a machine learning freelancer for many companies globally, but has also founded an artificial intelligence company that aims to optimize everyday processes. She has also authored Machine Learning Fundamentals, by Packt Publishing.

2. Unsupervised Learning – Real-Life Applications

Activity 2.01: Using Data Visualization to Aid the Pre-processing Process

Solution:

  1. Import all the required elements to load the dataset and pre-process it:
    import pandas as pd
    import matplotlib.pyplot as plt
    import numpy as np
  2. Load the previously downloaded dataset by using pandas' read_csv() function. Store the dataset in a pandas DataFrame named data:
    data = pd.read_csv("wholesale_customers_data.csv")
  3. Check for missing values in your DataFrame. Chain the isnull() and sum() methods to count the missing values in every column of the dataset at once:
    data.isnull().sum()

    The output is as follows:

    Channel             0
    Region              0
    Fresh               0
    Milk                0
    Grocery             0
    Frozen              0
    Detergents_Paper    0
    Delicassen          0
    dtype: int64

    As the preceding output shows, there are no missing values in the dataset.

  4. Check for outliers...
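
The missing-value check from step 3 can also be tried without the CSV file. The following is a minimal sketch on a small, hypothetical DataFrame: the column names are borrowed from the output above, but the values (including the deliberately missing ones) are invented for illustration and do not come from the wholesale customers dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: column names match the dataset's output above,
# but the values are made up, with a few cells left missing on purpose.
toy = pd.DataFrame({
    "Fresh": [12669.0, 7057.0, np.nan, 13265.0],
    "Milk": [9656.0, np.nan, 8808.0, np.nan],
    "Grocery": [7561.0, 9568.0, 7684.0, 4221.0],
})

# isnull() marks each missing cell as True;
# sum() then counts the True values column by column.
missing_per_column = toy.isnull().sum()
print(missing_per_column)

# Summing the per-column counts gives one total for the whole DataFrame.
total_missing = int(missing_per_column.sum())
print(total_missing)
```

A column with a count of zero, as every column in the activity's output has, needs no imputation or row removal before moving on.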