2. Building Blocks of Neural Networks
Activity 2.01: Performing Data Preparation
Solution
- Import the required libraries:
import pandas as pd
- Using pandas, load the .csv file:
data = pd.read_csv("YearPredictionMSD.csv", nrows=50000)
data.head()
Note
To avoid memory limitations, use the nrows argument when reading the text file in order to read a smaller section of the entire dataset. In the preceding example, we are reading the first 50,000 rows.
The output is as follows:
Figure 2.33: YearPredictionMSD.csv
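As a minimal sketch of how nrows behaves, the following reads from a small in-memory CSV rather than the real dataset. The csv_text contents and the column names year, f1, and f2 are hypothetical stand-ins and are not taken from YearPredictionMSD.csv.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for a large file on disk.
csv_text = (
    "year,f1,f2\n"
    "2001,0.1,1.5\n"
    "1999,0.3,2.5\n"
    "2010,0.2,3.5\n"
    "1987,0.9,4.5\n"
)

# nrows limits how many data rows are parsed; the header row is not counted.
sample = pd.read_csv(io.StringIO(csv_text), nrows=2)

print(sample.shape)
```

Only the first two data rows are loaded, which is the same mechanism used above to keep the first 50,000 rows of the full file.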
- Verify whether any qualitative data is present in the dataset:
cols = data.columns
num_cols = data._get_numeric_data().columns
list(set(cols) - set(num_cols))
The output should be an empty list, meaning there are no qualitative features.
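The check above relies on _get_numeric_data(), which is a private pandas method. A possible alternative, shown here as a sketch on a hypothetical toy DataFrame (the column names year, timbre, and genre are assumptions made for illustration), uses the public select_dtypes() method to list non-numeric columns directly.

```python
import pandas as pd

# Hypothetical toy frame: two numeric columns and one text column.
toy = pd.DataFrame(
    {
        "year": [2001, 1999],
        "timbre": [0.5, 1.2],
        "genre": ["rock", "jazz"],
    }
)

# select_dtypes is part of the public pandas API; excluding "number"
# leaves only the non-numeric (qualitative) columns.
qualitative = toy.select_dtypes(exclude="number").columns.tolist()

print(qualitative)
```

On a dataset with only numeric features, the same call would be expected to return an empty list.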
- Check for missing values.
If you add an additional sum() function to the line of code that was previously used for this purpose, you will get the sum of missing values in the entire dataset, without discriminating by column...
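The difference between the per-column count and the dataset-wide total can be sketched as follows. This assumes the check is done with isnull().sum(), which is a common pandas idiom but is not shown in this excerpt; the toy DataFrame and its columns a and b are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with three missing values in total.
toy = pd.DataFrame(
    {
        "a": [1.0, np.nan, 3.0],
        "b": [np.nan, np.nan, 6.0],
    }
)

# One sum() counts the missing values in each column.
per_column = toy.isnull().sum()

# A second sum() collapses the per-column counts into a single total.
total = toy.isnull().sum().sum()

print(per_column)
print(total)
```

A total of zero would indicate that no column contains missing values, so no further handling is needed.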