You're reading from  Learning Bayesian Models with R

Product type: Book
Published in: Oct 2015
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781783987603
Edition: 1st Edition
Author (1)
Hari Manassery Koduvely

Dr. Hari M. Koduvely is an experienced data scientist working at the Samsung R&D Institute in Bangalore, India. He has a PhD in statistical physics from the Tata Institute of Fundamental Research, Mumbai, India, and post-doctoral experience from the Weizmann Institute, Israel, and Georgia Tech, USA. Prior to joining Samsung, he worked for Amazon and Infosys Technologies, developing machine learning-based applications for their products and platforms. He also has several publications on Bayesian inference and its applications in areas such as recommendation systems and predictive health monitoring. His current interest is in developing large-scale machine learning methods, particularly for natural language understanding.

Model overfitting and bias-variance tradeoff


The expected loss mentioned in the previous section can be written as a sum of three terms in the case of linear regression using a squared loss function, as follows:

$$E[L] = \text{Bias}^2 + \text{Variance} + \text{Noise}$$

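Here, Bias is the difference between the true model $F(X)$ and the average value of the learned function, written here as $\hat{F}(X; D)$, taken over an ensemble of datasets $D$; that is, $E_D[\hat{F}(X; D)] - F(X)$. It measures how much the average prediction over all datasets in the ensemble differs from the true regression function $F(X)$. Variance is given by $E_D[(\hat{F}(X; D) - E_D[\hat{F}(X; D)])^2]$. It measures the extent to which the solution for a given dataset varies around the mean over all datasets, that is, how sensitive the learned function is to the particular choice of dataset $D$. The third term, Noise, as mentioned earlier, is the expectation of the squared difference between the observation $Y$ and the true regression function $F(X)$, taken over all values of $X$ and $Y$. Putting all these together, we can write the expected loss $E[L]$ as follows:

$$E[L] = \int \left( E_D[\hat{F}(X; D)] - F(X) \right)^2 P(X)\, dX + \int E_D\!\left[ \left( \hat{F}(X; D) - E_D[\hat{F}(X; D)] \right)^2 \right] P(X)\, dX + \iint \left( F(X) - Y \right)^2 P(X, Y)\, dX\, dY$$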
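The three terms described above can be estimated numerically by simulating an ensemble of datasets. The following is a minimal sketch in Python (standard library only), not code from the book. It assumes a true function F(X) = sin(2πX), Gaussian observation noise with standard deviation 0.3, a fixed grid of 25 input values shared by all datasets, and polynomial least-squares fits as the models. All names (`true_function`, `fit_polynomial`, `decompose`) and parameter values are illustrative choices.

```python
import math
import random


def true_function(x):
    """Assumed true regression function F(X); chosen only for illustration."""
    return math.sin(2.0 * math.pi * x)


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit by solving the normal equations."""
    n = degree + 1
    # Augmented matrix [A^T A | A^T y] for the design matrix A[i][j] = x_i ** j.
    aug = [[sum(x ** (i + j) for x in xs) for j in range(n)]
           + [sum(y * x ** i for x, y in zip(xs, ys))]
           for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (aug[i][n] - s) / aug[i][i]
    return w


def predict(w, x):
    """Evaluate the fitted polynomial with coefficients w at x."""
    return sum(c * x ** i for i, c in enumerate(w))


def decompose(degree, n_datasets=200, n_points=25, sigma=0.3, seed=0):
    """Monte Carlo estimates of (Bias^2, Variance, Noise) for one model class."""
    rng = random.Random(seed)
    xs = [i / (n_points - 1) for i in range(n_points)]  # fixed inputs (assumption)
    preds = []
    for _ in range(n_datasets):
        # Each dataset D differs only in its noisy observations Y.
        ys = [true_function(x) + rng.gauss(0.0, sigma) for x in xs]
        w = fit_polynomial(xs, ys, degree)
        preds.append([predict(w, x) for x in xs])
    bias_sq = 0.0
    variance = 0.0
    for j, x in enumerate(xs):
        column = [p[j] for p in preds]
        mean_pred = sum(column) / n_datasets  # average prediction over datasets
        bias_sq += (mean_pred - true_function(x)) ** 2
        variance += sum((v - mean_pred) ** 2 for v in column) / n_datasets
    # Noise is known exactly here because the data are simulated.
    return bias_sq / n_points, variance / n_points, sigma ** 2


for degree in (1, 5):
    b, v, n = decompose(degree)
    print(f"degree={degree}: bias^2={b:.4f} variance={v:.4f} noise={n:.4f}")
```

Under these assumptions, the straight-line model should show a much larger squared bias than the degree-5 polynomial, while the more flexible polynomial should show the larger variance; the noise term is the same for both because it does not depend on the model.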

The objective of machine learning is to learn the function from data that minimizes...
