
You're reading from  Python for Finance Cookbook

Product type: Book
Published in: Jan 2020
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789618518
Edition: 1st Edition
Author: Eryk Lewinson

Eryk Lewinson received his master's degree in Quantitative Finance from Erasmus University Rotterdam. In his professional career, he has gained experience in the practical application of data science methods while working in the risk management and data science departments of two "Big 4" companies, a Dutch neo-broker, and, most recently, the Netherlands' largest online retailer. Outside of work, he has written over a hundred articles on topics related to data science, which have been viewed more than 3 million times. In his free time, he enjoys playing video games, reading books, and traveling with his girlfriend.

Exploratory data analysis

The second step, after loading the data, is to carry out Exploratory Data Analysis (EDA). By doing this, we get to know the data we are supposed to work with. Some insights we try to gather are:

  • What kind of data do we actually have, and how should we treat different types?
  • What is the distribution of the variables?
    • Are there outliers in the data, and how can we treat them?
    • Are any transformations required? For example, some models work better with (or require) normally distributed variables, so we might want to use techniques such as log transformation.
    • Does the distribution vary per group (for example, gender or education level)?
  • Do we have cases of missing data? How frequent are these, and in which variables?
  • Is there a linear relationship between some variables (correlation)?
  • Can we create new features using the existing set of variables? An example...
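
Several of the questions above can be explored with a handful of pandas calls. The following is a minimal sketch, not the book's own recipe: the DataFrame `df`, its columns (`age`, `income`, `education`), and all values are hypothetical and invented for illustration, and the 1.5 × IQR rule is just one common convention for flagging outliers.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, invented purely for illustration
df = pd.DataFrame({
    "age": [25, 32, 47, 51, 38, 29],
    "income": [30_000, 42_000, np.nan, 58_000, 250_000, 39_000],
    "education": ["school", "university", "university",
                  "school", "university", "school"],
})

# What kind of data do we have? Inspect the type of each column.
dtypes = df.dtypes

# What is the distribution of the numeric variables?
summary = df.describe()

# Do we have missing data, and in which variables?
missing_counts = df.isna().sum()

# Are there outliers? Flag values outside 1.5 * IQR (one common rule of thumb).
q1, q3 = df["income"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["income"] < q1 - 1.5 * iqr) | (df["income"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

# Are transformations required? A log transformation compresses a long right tail.
df["log_income"] = np.log(df["income"])

# Does the distribution vary per group?
income_by_education = df.groupby("education")["income"].median()

# Is there a linear relationship between variables? Pearson correlation by default.
corr = df[["age", "income"]].corr()

print(missing_counts)
print(outliers)
print(income_by_education)
```

On real data, each of these outputs is a starting point for a decision (for example, whether to impute or drop missing values) rather than an answer in itself.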