You're reading from Building Data Science Solutions with Anaconda

Product type: Book
Published in: May 2022
Publisher: Packt
ISBN-13: 9781800568785
Edition: 1st Edition
Author (1)
Dan Meador

Dan Meador is an Engineering Manager at Anaconda and is the creator of Conda as well as a champion of open source at Anaconda. With a history of engineering and client-facing roles, he has the ability to jump into any position. He has a track record of delivering as a leader and a follower in companies from the Fortune 10 to startups.

Cleaning data with pandas

One of the most important parts of working with data is making sure that it is in the format you need. Along with having enough data, this may be the most vital factor in training an accurate model. In this section, we walk through importing a CSV file and then analyzing and cleaning it so that it is ready for use.
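As a rough illustration of the import, inspect, and clean steps, here is a minimal sketch. It is not the chapter's own code: the column names (`major`, `category`, `median_pay`), the `load_and_clean` helper, and the rows themselves are made-up placeholders, and the cleaning choices (dropping rows with missing pay, dropping exact duplicates) are assumptions that may not suit every dataset.

```python
import io

import pandas as pd

# Made-up placeholder rows, not real figures. Column names are hypothetical.
RAW_CSV = """major,category,median_pay
Example Major A,Engineering,"60,000"
Example Major B,Arts,
example major c ,Arts,"35,000"
Example Major A,Engineering,"60,000"
"""


def load_and_clean(source):
    """Read a CSV and apply a few common cleaning steps (hypothetical helper)."""
    df = pd.read_csv(source)

    # Normalize text: trim stray whitespace and use consistent capitalization.
    df["major"] = df["major"].str.strip().str.title()

    # Pay arrives as text such as "60,000"; remove the separator and convert.
    # Values that cannot be parsed become NaN instead of raising an error.
    pay_text = df["median_pay"].str.replace(",", "", regex=False)
    df["median_pay"] = pd.to_numeric(pay_text, errors="coerce")

    # Assumption: rows without pay are not useful here, so we drop them.
    df = df.dropna(subset=["median_pay"])

    # Remove rows that are exact duplicates after normalization.
    df = df.drop_duplicates()

    return df.reset_index(drop=True)


cleaned = load_and_clean(io.StringIO(RAW_CSV))
print(cleaned.dtypes)  # check that each column has the type we expect
print(cleaned)
```

Because `pd.read_csv` accepts a file path as well as a file-like object, the same function should work on a CSV saved to disk.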

The example we will look at is a dataset on various US university majors and how they relate to pay. Having a general sense of the domain you are looking into is critical, and this is an area you might already have a grasp of. The dataset is provided by the excellent FiveThirtyEight site, and more information can be found here: https://github.com/fivethirtyeight/data/tree/master/college-majors.
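To make the idea of relating majors to pay concrete, the following sketch summarizes pay by category. Several things here are assumptions rather than details from the text: the raw file URL, the column names `Major`, `Major_category`, and `Median`, and the `median_pay_by_category` helper. The sample rows are placeholders, not values from the real dataset, so the printed numbers mean nothing on their own.

```python
import io

import pandas as pd

# Assumed location of one CSV file in the repository linked above.
# The text does not say which file is used, so treat this as a guess.
ASSUMED_URL = (
    "https://raw.githubusercontent.com/fivethirtyeight/data/"
    "master/college-majors/recent-grads.csv"
)

# Placeholder rows with assumed column names, used so the sketch runs offline.
SAMPLE = """Major,Major_category,Median
Placeholder Major 1,Category X,50000
Placeholder Major 2,Category X,70000
Placeholder Major 3,Category Y,40000
"""


def median_pay_by_category(source):
    """Return the median of the pay column per category, highest first."""
    df = pd.read_csv(source, usecols=["Major", "Major_category", "Median"])
    by_category = df.groupby("Major_category")["Median"].median()
    return by_category.sort_values(ascending=False)


summary = median_pay_by_category(io.StringIO(SAMPLE))
print(summary)
```

If the assumed URL and column names turn out to match the real file, calling `median_pay_by_category(ASSUMED_URL)` would run the same summary on the downloaded data.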

Our goal is to use this data to figure out whether we should have chosen another major. We might even find out that...
