Reader small image

You're reading from  Python Data Science Essentials. - Third Edition

Product typeBook
Published inSep 2018
Reading LevelIntermediate
PublisherPackt
ISBN-139781789537864
Edition3rd Edition
Languages
Concepts
Right arrow
Author (1)
Alberto Boschetti
Alberto Boschetti
author image
Alberto Boschetti

Alberto Boschetti is a data scientist with expertise in signal processing and statistics. He holds a Ph.D. in telecommunication engineering and currently lives and works in London. In his work projects, he faces challenges ranging from natural language processing (NLP) and behavioral analysis to machine learning and distributed processing. He is very passionate about his job and always tries to stay updated about the latest developments in data science technologies, attending meet-ups, conferences, and other events.
Read more about Alberto Boschetti

Right arrow

Introducing EDA

Exploratory data analysis (EDA), or data exploration, is the first step in the data science process. John Tukey coined this term in 1977 when he first wrote his, book Exploratory Data Analysis, emphasizing the importance of EDA. EDA is required to understand the dataset better, check its features and its shape, validate some first hypothesis that you have in mind, and get a preliminary idea about the next step that you want to pursue in subsequent subsequent data science tasks.

In this section, you will work on the Iris dataset, which was already used in the previous chapter. First, let's load the dataset:

In: import pandas as pd
iris_filename = 'datasets-uci-iris.csv'
iris = pd.read_csv(iris_filename, header=None,
names= ['sepal_length', 'sepal_width',
'petal_length', 'petal_width...
lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
Python Data Science Essentials. - Third Edition
Published in: Sep 2018Publisher: PacktISBN-13: 9781789537864

Author (1)

author image
Alberto Boschetti

Alberto Boschetti is a data scientist with expertise in signal processing and statistics. He holds a Ph.D. in telecommunication engineering and currently lives and works in London. In his work projects, he faces challenges ranging from natural language processing (NLP) and behavioral analysis to machine learning and distributed processing. He is very passionate about his job and always tries to stay updated about the latest developments in data science technologies, attending meet-ups, conferences, and other events.
Read more about Alberto Boschetti