You're reading from  Python for Finance Cookbook

Product type: Book
Published in: Jan 2020
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789618518
Edition: 1st Edition
Author (1)
Eryk Lewinson

Eryk Lewinson received his master's degree in Quantitative Finance from Erasmus University Rotterdam. In his professional career, he has gained experience in the practical application of data science methods while working in risk management and data science departments of two "big 4" companies, a Dutch neo-broker and most recently the Netherlands' largest online retailer. Outside of work, he has written over a hundred articles about topics related to data science, which have been viewed more than 3 million times. In his free time, he enjoys playing video games, reading books, and traveling with his girlfriend.

Encoding categorical variables

In the previous recipes, we have seen that some features are categorical variables (originally represented as either object or category data types). However, most machine learning algorithms work exclusively with numeric data. That is why we need to encode categorical features into a representation compatible with the models.

In this recipe, we cover some popular encoding approaches:

  • Label encoding
  • One-hot encoding

In label encoding, we replace each categorical value with an integer between 0 and the number of classes minus 1. For example, with three distinct classes, we use {0, 1, 2}.
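As a rough illustration of the two approaches listed above, the following plain-Python sketch label-encodes a small list and then, for contrast, one-hot encodes the same values. The education values are made up for the example, and assigning codes in sorted order is an assumption made here for reproducibility, not something the recipe prescribes.

```python
# Toy data; the values are hypothetical and only serve the illustration.
education = ["High school", "University", "Graduate school", "University"]

# Label encoding: map each distinct class to an integer in 0..n_classes-1.
# Sorting the classes first is an arbitrary but reproducible choice.
classes = sorted(set(education))
label_map = {label: code for code, label in enumerate(classes)}
label_encoded = [label_map[value] for value in education]

# One-hot encoding: one 0/1 indicator per class, exactly one of them set.
one_hot_encoded = [
    [int(code == label_map[value]) for code in range(len(classes))]
    for value in education
]

print(label_map)
print(label_encoded)
print(one_hot_encoded)
```

Note that label encoding implies an ordering among the classes (0 < 1 < 2), while the one-hot representation does not, at the cost of one column per class.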

This is already very similar to the outcome of converting a column to the category data type in pandas. We can access the codes of the categories by running df_cat.education.cat.codes. Additionally, we can recover the mapping by running dict(zip(df_cat.education.cat.codes, df_cat...
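A minimal sketch of the pandas behavior described above might look as follows. The DataFrame here is a hypothetical stand-in for the recipe's df_cat with made-up values, and building the mapping with dict(enumerate(...)) over cat.categories is one possible way to recover it, not necessarily the expression the book uses.

```python
import pandas as pd

# Hypothetical stand-in for the recipe's df_cat; the values are made up.
df_cat = pd.DataFrame(
    {"education": ["University", "High school", "Graduate school", "University"]}
)

# Converting to the category dtype assigns an integer code to each class.
df_cat["education"] = df_cat["education"].astype("category")

# Integer codes, one per row.
codes = df_cat["education"].cat.codes

# Code -> label mapping: position i in cat.categories corresponds to code i.
code_to_label = dict(enumerate(df_cat["education"].cat.categories))

print(codes.tolist())
print(code_to_label)
```

Because pandas orders string categories lexicographically by default, the codes follow alphabetical order of the labels unless the categories are specified explicitly.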