You're reading from  Practical Big Data Analytics

Product type: Book
Published in: Jan 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781783554393
Edition: 1st Edition
Author (1)
Nataraj Dasgupta

Nataraj Dasgupta is the vice president of advanced analytics at RxDataScience Inc. Nataraj has been in the IT industry for more than 19 years, and has worked in the technical and analytics divisions of Philip Morris, IBM, UBS Investment Bank, and Purdue Pharma. At Purdue Pharma, Nataraj led the data science division, where he developed the company's award-winning big data and machine learning platform. Prior to Purdue, at UBS, he held the role of Associate Director, working with high-frequency and algorithmic trading technologies in the foreign exchange trading division of the bank.

Common terminologies in machine learning


In machine learning, you'll often hear the terms features, predictors, and independent variables. They are used interchangeably: all three refer to the variables that are used to predict an outcome. The outcome itself is also called the dependent variable, because its value depends on the predictors. In our previous example of cars, the variables cyl (Cylinder), hp (Horsepower), wt (Weight), and gear (Gear) are the predictors and mpg (Miles Per Gallon) is the outcome.
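The split between predictors and outcome can be sketched in a few lines of Python. This is a minimal illustration, not the book's own code: it assumes the cars example resembles R's built-in mtcars dataset, and the names `cars`, `PREDICTORS`, `OUTCOME`, `X`, and `y` are chosen here for illustration.

```python
# Three illustrative rows; the values follow R's built-in mtcars dataset,
# which is assumed (not confirmed) to be the "cars" example in the text.
cars = [
    {"model": "Mazda RX4", "mpg": 21.0, "cyl": 6, "hp": 110, "wt": 2.620, "gear": 4},
    {"model": "Datsun 710", "mpg": 22.8, "cyl": 4, "hp": 93, "wt": 2.320, "gear": 4},
    {"model": "Hornet 4 Drive", "mpg": 21.4, "cyl": 6, "hp": 110, "wt": 3.215, "gear": 3},
]

PREDICTORS = ["cyl", "hp", "wt", "gear"]  # features / independent variables
OUTCOME = "mpg"                           # dependent variable

# X holds one list of predictor values per car; y holds the matching outcomes.
X = [[row[name] for name in PREDICTORS] for row in cars]
y = [row[OUTCOME] for row in cars]

for features, target in zip(X, y):
    print(dict(zip(PREDICTORS, features)), "->", target)
```

The `X` and `y` naming is a common convention in machine learning code: `X` is what a model is given, and `y` is what it is asked to predict.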

In simpler terms, taking the example of a spreadsheet, the columns that are used as inputs are the features, predictors, or independent variables. As an example, suppose we were given a dataset of toll booth charges and were tasked with predicting the amount charged based on the time of day and other factors. Consider a hypothetical spreadsheet of such toll records.

In this spreadsheet, the columns date, time, agency, type, prepaid, and rate are the features or predictors, whereas the column amount is our outcome or dependent variable (what we are predicting).
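To make the column roles concrete, here is a small sketch that reads toll records and separates the feature columns from the outcome column. The column names come from the text; every row value, the constant `TOLL_CSV`, and the helper `split_features_outcome` are invented for illustration and do not reproduce the book's spreadsheet.

```python
import csv
import io

# Hypothetical toll records: all values below are made up for illustration.
TOLL_CSV = """date,time,agency,type,prepaid,rate,amount
2018-01-02,07:45,AgencyA,car,yes,peak,4.50
2018-01-02,13:10,AgencyA,truck,no,off-peak,9.00
2018-01-03,18:20,AgencyB,car,no,peak,6.25
"""

OUTCOME = "amount"  # the dependent variable we want to predict


def split_features_outcome(rows, outcome):
    """Return (features, outcomes): every column except `outcome`, and `outcome` as floats."""
    features, outcomes = [], []
    for row in rows:
        outcomes.append(float(row[outcome]))
        features.append({k: v for k, v in row.items() if k != outcome})
    return features, outcomes


rows = list(csv.DictReader(io.StringIO(TOLL_CSV)))
features, amounts = split_features_outcome(rows, OUTCOME)
feature_names = list(features[0].keys())

print("features:", feature_names)
print("outcome values:", amounts)
```

Keeping the outcome column name in a single constant means the same helper works unchanged if a different column is chosen as the prediction target.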

The value of amount depends on the value of...
