CONVERTING CATEGORICAL DATA TO NUMERIC DATA
One common task (especially in machine learning) involves converting a feature containing character data into a feature that contains numeric data. Listing B.8 shows the contents of cat2numeric.py, which illustrates how to replace a text field with a corresponding numeric field.
Listing B.8: cat2numeric.py
import pandas as pd
import numpy as np
df = pd.read_csv('sometext.csv', delimiter='\t')
print("=> First five rows (before):")
print(df.head(5))
print("-------------------------")
print()
# map ham/spam to 0/1 values:
df['type'] = df['type'].map({'ham': 0, 'spam': 1})
print("=> First five rows (after):")
print(df.head(5))
print("-------------------------")
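The mapping step in Listing B.8 can also be tried without the file sometext.csv. The following is a minimal sketch, not part of the original listing: the sample rows, the column name type_num, and the label "other" are hypothetical and exist only for illustration. It also shows one behavior of map() worth keeping in mind: any value that does not appear in the dictionary becomes NaN.

```python
import pandas as pd

# Hypothetical in-memory sample standing in for sometext.csv
df = pd.DataFrame({
    "type": ["ham", "spam", "ham", "other"],
    "text": ["see you at noon", "win a free prize",
             "call me later", "unlabeled row"],
})

# Same ham/spam mapping as in Listing B.8
mapping = {"ham": 0, "spam": 1}

# Write to a new (hypothetical) column so the original labels stay visible
df["type_num"] = df["type"].map(mapping)

# Labels missing from the mapping (here "other") become NaN
unmapped = df[df["type_num"].isna()]

print(df)
print("Unmapped rows:", len(unmapped))
```

Checking for NaN values after the mapping is a simple way to detect labels that the dictionary does not cover, before the column is passed to a model.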
Listing B.8 initializes the data frame df with the contents of the tab-delimited CSV file sometext.csv, and then displays the first five rows by invoking df.head(5), which is also...