PANDAS, CSV FILES, AND MISSING DATA
This section contains several subsections with Python-based code samples that create Pandas data frames and then replace missing values in the data frames. First we’ll look at small CSV files with one column and then we’ll look at small CSV files with two columns. Later we’ll look at skewed CSV files as well as multi-row CSV files.
Single Column CSV Files
Listing 3.1 displays the contents of the CSV file one_char_column1.csv and Listing 3.2 displays the contents of one_char_column1.py, which fills in the missing values in the data frame that is read from that CSV file.
Listing 3.1: one_char_column1.csv
gender
Male
Male
NaN
Female
Male
Listing 3.2: one_char_column1.py
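import pandas as pd

# Load the CSV file; read_csv() parses the string "NaN" as a missing value
df1 = pd.read_csv('one_char_column1.csv')
print("=> initial dataframe contents:")
print(df1)
print()

# fillna() returns a new data frame; df1 itself is left unchanged
df = df1.fillna("FEMALE")
print("dataframe after fillna():")
print(df)
print()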
Listing 3.2 starts with an import statement and then initializes...
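Listing 3.2 fills the missing entry with the hard-coded string "FEMALE". As a variation, the following minimal sketch counts the missing values and fills them with the most frequent value in the column instead. It is only one possible approach, not the method used in Listing 3.2. The sketch reads the CSV data from an in-memory string rather than from one_char_column1.csv so that it is self-contained, and the variable names csv_text, missing_count, and most_common are assumptions introduced for illustration.

```python
import io

import pandas as pd

# Same contents as Listing 3.1, held in a string so the sketch is self-contained
csv_text = "gender\nMale\nMale\nNaN\nFemale\nMale\n"

# read_csv() treats the string "NaN" as a missing value by default
df1 = pd.read_csv(io.StringIO(csv_text))

# Count the missing entries in the gender column
missing_count = int(df1["gender"].isna().sum())
print("missing values:", missing_count)

# mode() ignores missing values by default; take the first (most frequent) value
most_common = df1["gender"].mode()[0]
print("most frequent value:", most_common)

# Fill only the gender column; fillna() returns a new data frame
df = df1.fillna({"gender": most_common})
print(df)
```

Filling with the most frequent value keeps the replacement consistent with the values already in the column, whereas "FEMALE" in Listing 3.2 introduces a new spelling that differs from the existing "Female" entry.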