You're reading from Python for Finance Cookbook - Second Edition

Product type: Book
Published in: Dec 2022
Publisher: Packt
ISBN-13: 9781803243191
Edition: 2nd Edition
Author (1)
Eryk Lewinson

Eryk Lewinson received his master's degree in Quantitative Finance from Erasmus University Rotterdam. In his professional career, he has gained experience in the practical application of data science methods while working in risk management and data science departments of two "big 4" companies, a Dutch neo-broker and most recently the Netherlands' largest online retailer. Outside of work, he has written over a hundred articles about topics related to data science, which have been viewed more than 3 million times. In his free time, he enjoys playing video games, reading books, and traveling with his girlfriend.
Investigating different approaches to handling imbalanced data

A very common issue in classification tasks is class imbalance: one class is heavily outnumbered by the other (the idea extends to multi-class problems as well). Strictly speaking, we are dealing with imbalance whenever the ratio of the two classes is not 1:1. A slight imbalance is usually not much of a problem, but in some industries and problems we can encounter ratios of 100:1, 1000:1, or even more extreme.
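As a quick illustration of the ratios mentioned above, the following minimal sketch computes the majority-to-minority ratio for a list of labels. The helper name `imbalance_ratio` and the synthetic labels are assumptions made for this example; they do not come from the book.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return the majority-to-minority class ratio (hypothetical helper)."""
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes to compute a ratio")
    # Largest class count divided by smallest class count.
    return max(counts.values()) / min(counts.values())


# Synthetic labels (assumed for illustration): 990 negatives, 10 positives.
labels = [0] * 990 + [1] * 10
ratio = imbalance_ratio(labels)
print(f"Imbalance ratio: {ratio:.0f}:1")  # Imbalance ratio: 99:1
```

For multi-class data, the same helper compares the most and least frequent classes, which is only one of several reasonable ways to summarize imbalance.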

Highly imbalanced classes can lead to poor performance of ML models. That is because most algorithms implicitly assume a balanced distribution of classes: they aim to minimize the overall prediction error, to which the minority class, by definition, contributes very little. As a result, classifiers trained on imbalanced data are biased toward the majority class.
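To see why minimizing the overall error favors the majority class, consider a hedged sketch of a trivial "classifier" that always predicts the majority label. The data are synthetic and the helper functions `accuracy` and `recall` are written here for illustration only; they are not the book's code.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Share of actual positives that were predicted as positive."""
    true_positives = sum(
        t == positive and p == positive for t, p in zip(y_true, y_pred)
    )
    actual_positives = sum(t == positive for t in y_true)
    return true_positives / actual_positives if actual_positives else 0.0


# Synthetic data (assumed): 990 majority-class and 10 minority-class cases.
y_true = [0] * 990 + [1] * 10
# A degenerate model that ignores its input and always predicts class 0.
y_pred = [0] * len(y_true)

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy={acc:.2f}, minority recall={rec:.2f}")
# accuracy=0.99, minority recall=0.00
```

The overall error is only 1%, yet not a single minority-class observation is detected, which is the bias toward the majority class described above.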

One of the potential solutions to dealing with class...
