Chapter 1: Python Machine Learning Toolkit
Activity 1: pandas Functions
Solution
Open a new Jupyter notebook.
Use pandas to load the Titanic dataset:
import pandas as pd
df = pd.read_csv('titanic.csv')
Use the head() function on the dataset as follows:
# Have a look at the first five samples of the data
df.head()
The output will be as follows:

Figure 1.65: First five rows
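As a minimal sketch of what head() does, the snippet below builds a small stand-in DataFrame instead of reading titanic.csv. The frame name toy and all of its values are made up for illustration and are not taken from the dataset. It shows that head() returns the first five rows by default and accepts a row count as an argument.

```python
import pandas as pd

# Hypothetical stand-in data; these values are NOT from titanic.csv.
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E", "F", "G"],
    "Age": [22.0, 38.0, 26.0, 35.0, 54.0, 2.0, 27.0],
})

default_head = toy.head()   # first five rows by default
short_head = toy.head(3)    # first three rows only

print(toy.shape)            # (rows, columns) of the full frame
print(short_head)
```

Checking shape alongside head() is a quick way to confirm that the file loaded with the expected number of rows and columns.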
Use the describe() function, passing include='all' so that non-numeric columns are summarized as well:
df.describe(include='all')
The output will be as follows:

Figure 1.66: Output of describe()
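To illustrate the effect of include='all', here is a small sketch on a made-up frame (the name toy and its values are assumptions, not Titanic data). By default, describe() summarizes only numeric columns. With include='all', text columns gain rows such as unique, top, and freq, and statistics that do not apply to a column are reported as NaN.

```python
import pandas as pd

# Hypothetical stand-in data; these values are NOT from titanic.csv.
toy = pd.DataFrame({
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, 38.0, 26.0, 35.0],
})

numeric_summary = toy.describe()            # numeric columns only
full_summary = toy.describe(include="all")  # every column

print(numeric_summary.columns.tolist())
print(full_summary)
```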
We don't need the Unnamed: 0 column. We can remove it without using the del command, as follows:
df = df[df.columns[1:]]  # Keep every column except the first
df.head()
The output will be as follows:

Figure 1.67: First five rows after deleting the Unnamed: 0 column
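The positional slice used above is one option. The sketch below compares it with two alternatives, drop(columns=...) and the index_col argument of read_csv. This is an illustration under stated assumptions rather than the book's method: the CSV text is a made-up stand-in for titanic.csv, read from memory with io.StringIO. An Unnamed: 0 column typically appears when a CSV was saved with its row index and the header for that column is empty.

```python
import io
import pandas as pd

# Hypothetical CSV text with an empty first header, as produced when a
# DataFrame is saved together with its index. Values are made up.
csv_text = ",Name,Age\n0,Passenger A,22.0\n1,Passenger B,38.0\n"

raw = pd.read_csv(io.StringIO(csv_text))

by_position = raw[raw.columns[1:]]           # slice off the first column
by_name = raw.drop(columns=["Unnamed: 0"])   # drop the column by its label

# Alternatively, treat the first column as the index at load time.
at_load = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(raw.columns.tolist())
print(by_name.columns.tolist())
```

Dropping by label fails loudly if the column is absent, whereas the positional slice silently removes whatever column happens to come first.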
Compute the mean, standard deviation, minimum, and maximum values for the columns of the DataFrame without using describe:
df.mean()
Fare        33.295479
Pclass       2.294882
Age         29.881138
Parch        0.385027
SibSp        0.498854
Survived     0.383838
...
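The four statistics can also be collected in one call. The following is a minimal sketch on a made-up frame (toy and its values are assumptions, not Titanic data) that first selects the numeric columns and then applies agg(). Selecting numeric columns explicitly is a precaution: in recent pandas releases, calling mean() on a frame that still contains text columns may raise an error unless numeric_only=True is passed.

```python
import pandas as pd

# Hypothetical stand-in data; these values are NOT from titanic.csv.
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Age": [20.0, 30.0, 40.0, 50.0],
    "Fare": [10.0, 10.0, 20.0, 40.0],
})

# Keep only numeric columns so every statistic is well defined.
numeric = toy.select_dtypes(include="number")

# One row per statistic, one column per numeric feature.
summary = numeric.agg(["mean", "std", "min", "max"])

print(summary)
```

Note that std() in pandas computes the sample standard deviation (ddof=1) by default.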