Chapter 1: Introduction to Data Science and Data Preprocessing
Activity 1: Pre-Processing Using the Bank Marketing Subscription Dataset
Solution
Let's perform various pre-processing tasks on the Bank Marketing Subscription dataset. We'll also split the dataset into training and testing sets. Follow these steps to complete this activity:
- Open a Jupyter notebook and add a new cell. Import the pandas library, and then use the pd.read_csv() function to load the dataset into a pandas DataFrame, as shown here:
import pandas as pd
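# Use the raw-file URL: the github.com/.../blob/... form returns an HTML page, not the CSV
Link = 'https://raw.githubusercontent.com/TrainingByPackt/Data-Science-with-Python/master/Chapter01/Data/Banking_Marketing.csv'
# Read the data into the DataFrame df; header=0 takes the first row as the column names
df = pd.read_csv(Link, header=0)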
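When loading a CSV file hosted on GitHub, the form of the URL matters: a github.com address containing /blob/ serves an HTML page, whereas pd.read_csv() needs the raw file. The following is a minimal sketch, not part of the original solution, of converting one form to the other and of what header=0 does. The helper name to_raw_url and the two-row in-memory sample are assumptions made purely for illustration; the sample values are made up and are not taken from the dataset.

```python
import io

import pandas as pd


def to_raw_url(blob_url: str) -> str:
    """Hypothetical helper: convert a github.com 'blob' URL to its raw-file form."""
    return (
        blob_url
        .replace("https://github.com/", "https://raw.githubusercontent.com/", 1)
        .replace("/blob/", "/", 1)
    )


blob = ("https://github.com/TrainingByPackt/Data-Science-with-Python/"
        "blob/master/Chapter01/Data/Banking_Marketing.csv")
raw = to_raw_url(blob)
print(raw)

# Made-up sample (not the real dataset) so the sketch runs without network access.
sample_csv = io.StringIO("name,score\nalice,1\nbob,2\n")

# header=0 tells pandas that the first row holds the column names.
sample_df = pd.read_csv(sample_csv, header=0)
print(list(sample_df.columns))  # ['name', 'score']
print(sample_df.shape)          # (2, 2)
```

Passing raw to pd.read_csv() in place of the in-memory sample would fetch the file over the network, so that call is left out of the sketch.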
- To find the number of rows and columns in the dataset, add the following code:
# Find the number of rows and columns
print("Number of rows and columns : ", df.shape)
The preceding...