You're reading from Python Machine Learning

Product type: Book
Published in: Sep 2015
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781783555130
Edition: 1st Edition
Author (1)
Sebastian Raschka

Sebastian Raschka is an Assistant Professor of Statistics at the University of Wisconsin-Madison focusing on machine learning and deep learning research. As Lead AI Educator at Grid AI, Sebastian plans to continue following his passion for helping people get into machine learning and artificial intelligence.

Handling categorical data


So far, we have only been working with numerical values. However, real-world datasets often contain one or more categorical feature columns. When we talk about categorical data, we have to further distinguish between nominal and ordinal features. Ordinal features can be understood as categorical values that can be sorted or ordered. For example, T-shirt size would be an ordinal feature, because we can define an order XL > L > M. In contrast, nominal features don't imply any order. To continue with the previous example, we could think of T-shirt color as a nominal feature, since it typically doesn't make sense to say that, for example, red is larger than blue.
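To make the distinction concrete, here is a minimal sketch using pandas' `Categorical` type. It is only an illustration of ordered versus unordered categories, not a technique taken from the chapter; the variable names and the three sample values are assumptions chosen to match the T-shirt example.

```python
import pandas as pd

# Ordinal feature: the categories carry an explicit order (M < L < XL).
sizes = pd.Categorical(
    ["M", "XL", "L"],
    categories=["M", "L", "XL"],
    ordered=True,
)

# Nominal feature: the categories are plain labels with no order.
colors = pd.Categorical(["green", "red", "blue"])

# Order-based operations are defined for the ordinal feature...
largest_size = sizes.max()
sorted_sizes = list(sizes.sort_values())

# ...but asking for an order among nominal values raises a TypeError.
try:
    colors.max()
    nominal_has_order = True
except TypeError:
    nominal_has_order = False

print(largest_size, sorted_sizes, nominal_has_order)
```

The ordered categorical can be sorted and compared, while pandas refuses to rank the unordered one, which mirrors the point that red is not "larger" than blue.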

Before we explore different techniques to handle such categorical data, let's create a new data frame to illustrate the problem:

>>> import pandas as pd
>>> df = pd.DataFrame([
...            ['green', 'M', 10.1, 'class1'], 
...            ['red', 'L', 13.5...
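The listing above is cut off in this excerpt, so the following is a separate, self-contained sketch rather than a continuation of it. It assumes a small hypothetical data frame (the column names `color`, `size`, and `price` and all values are made up for illustration) and shows two common ways such columns might be handled: an order-preserving integer mapping for the ordinal feature and one-hot encoding for the nominal one.

```python
import pandas as pd

# Hypothetical data, not the book's listing (which is truncated in this excerpt).
df = pd.DataFrame(
    [["green", "L", 9.9], ["red", "M", 12.0], ["blue", "XL", 14.5]],
    columns=["color", "size", "price"],
)

# Ordinal feature: map each size to an integer that preserves XL > L > M.
# The specific integers are an assumption; only their ordering matters here.
size_mapping = {"M": 1, "L": 2, "XL": 3}
df["size"] = df["size"].map(size_mapping)

# Nominal feature: one-hot encode so that no artificial order is implied.
encoded = pd.get_dummies(df, columns=["color"])

print(encoded)
```

Mapping colors to integers in the same way would suggest an order that does not exist, which is why the sketch gives each color its own indicator column instead.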