You're reading from Julia for Data Science

Product type: Book
Published in: Sep 2016
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781785289699
Edition: 1st Edition
Author (1)
Anshul Joshi

Anshul Joshi is a data scientist with experience in recommendation systems, predictive modeling, neural networks, and high performance computing. His research interests encompass deep learning, artificial intelligence, and computational physics. Most of the time, he can be caught exploring GitHub or trying anything new he can get his hands on. You can also follow his personal blog.

Sampling


In the previous example, we discussed calculating the mean height of 1,000 people out of the 10 million people living in New Delhi. Suppose the data on these 10 million people was gathered in some sequential order, for example grouped by age or by community. If we then take 1,000 people who are consecutive in the dataset, there is a high probability that they share similarities. Such a homogeneous group would not give a representative picture of the whole dataset, so a small chunk of consecutive data points would not give us the insight we want. To overcome this, we use sampling.

Sampling is a technique for randomly selecting data from a given dataset so that the selected points are not related to each other, which lets us generalize the results computed on the selection to the complete dataset. Sampling is done over a population.
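
The contrast between a consecutive chunk and a random sample can be sketched in a few lines. The example below is a minimal illustration in Python, not code from the book. The population is synthetic and scaled down to 100,000 people instead of 10 million. The group structure (ten groups stored one after another, with mean heights rising from 150 cm to 177 cm), the 6 cm spread, the seed, and the helper name `build_population` are all assumptions made for the demonstration.

```python
import random
import statistics


def build_population(rng, groups=10, group_size=10_000):
    """Return synthetic heights (cm) stored group by group.

    Hypothetical setup: each group has its own mean height, and the
    groups are appended in order, mimicking data collected sequentially
    by age or community.
    """
    population = []
    for g in range(groups):
        group_mean = 150.0 + 3.0 * g  # assumed mean height of group g
        population.extend(rng.gauss(group_mean, 6.0) for _ in range(group_size))
    return population


rng = random.Random(42)  # fixed seed so the run is repeatable
population = build_population(rng)

# Mean over the whole population (what we are trying to estimate).
true_mean = statistics.fmean(population)

# Mean of the first 1,000 consecutive records: all from the first group.
consecutive_mean = statistics.fmean(population[:1000])

# Mean of 1,000 records drawn at random without replacement.
random_mean = statistics.fmean(rng.sample(population, 1000))

print(f"population mean:        {true_mean:.2f}")
print(f"consecutive chunk mean: {consecutive_mean:.2f}")
print(f"random sample mean:     {random_mean:.2f}")
```

With data ordered this way, the consecutive chunk reflects only the first group, so its mean should land well below the population mean, while the mean of the random sample should land close to it.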

Population

A population in statistics refers to the set of all the...
