You're reading from scikit-learn Cookbook - Second Edition

Product type: Book
Published in: Nov 2017
Reading Level: Intermediate
Publisher: Packt
ISBN-13: 9781787286382
Edition: 2nd Edition
Author (1)
Trent Hauck

Trent Hauck is a data scientist living and working in the Seattle area. He grew up in Wichita, Kansas and received his undergraduate and graduate degrees from the University of Kansas. He is the author of the book Instant Data Intensive Apps with pandas How-to, Packt Publishing—a book that can get you up to speed quickly with pandas and other associated technologies.

 Bagging regression with nearest neighbors

Bagging is another ensemble method that, interestingly, does not necessarily involve trees. It builds several instances of a base estimator, each fit on a random subset of the original training set, and aggregates their predictions. In this section, we try k-nearest neighbors (KNN) as the base estimator.
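As a minimal sketch of the idea, the snippet below wraps a KNN regressor in scikit-learn's BaggingRegressor. The synthetic dataset from make_regression and the specific values (20 estimators, half of the training rows per estimator, 5 neighbors) are assumptions chosen for illustration, not settings taken from the book.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic data stands in for a real dataset (an assumption for illustration).
X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The base estimator is passed positionally so the call works whether the
# keyword is named `estimator` (newer releases) or `base_estimator` (older ones).
bag = BaggingRegressor(
    KNeighborsRegressor(n_neighbors=5),
    n_estimators=20,   # number of KNN models in the ensemble
    max_samples=0.5,   # each model sees a random half of the training rows
    random_state=0,
)
bag.fit(X_train, y_train)

n_fitted = len(bag.estimators_)       # one fitted KNN model per ensemble member
r2_test = bag.score(X_test, y_test)   # R^2 of the averaged predictions
print(n_fitted, round(r2_test, 3))
```

After fitting, bag.estimators_ holds the individual KNN models, and bag.predict averages their outputs.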

In practice, bagging works well for reducing the variance of a complex base estimator, such as a decision tree with many levels. Boosting, by contrast, reduces the bias of weak models, such as decision trees with very few levels or linear models.
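One rough way to see the variance claim is to refit a model on resampled training sets and measure how much its predictions move at fixed query points. The sketch below does this for a single fully grown decision tree and for a bagged ensemble of such trees. The helper prediction_variance, the synthetic data, and all settings are hypothetical choices for illustration; one would typically expect the bagged figure to be the smaller of the two.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic noisy data (an assumption for illustration).
X, y = make_regression(n_samples=300, n_features=5, noise=20.0, random_state=0)
X_train, y_train = X[:200], y[:200]
X_query = X[200:]  # held-out points where predictions are compared

rng = np.random.RandomState(0)

def prediction_variance(make_model, n_repeats=15):
    """Refit on resampled training sets; return the mean variance of predictions."""
    preds = []
    for _ in range(n_repeats):
        # Draw a bootstrap resample of the training rows.
        idx = rng.randint(0, len(X_train), size=len(X_train))
        model = make_model().fit(X_train[idx], y_train[idx])
        preds.append(model.predict(X_query))
    # Variance across refits at each query point, averaged over the points.
    return float(np.var(np.array(preds), axis=0).mean())

var_tree = prediction_variance(lambda: DecisionTreeRegressor(random_state=0))
var_bag = prediction_variance(
    lambda: BaggingRegressor(DecisionTreeRegressor(), n_estimators=25, random_state=0)
)
print(var_tree, var_bag)
```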

To try out bagging, we will run a hyperparameter search for the best parameters using scikit-learn's randomized parameter search. As we have done previously, we will go through the following process:

  1. Figure out which parameters to optimize in the algorithm (these are the parameters researchers view as the best...
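
A randomized hyperparameter search over a bagged KNN regressor might be set up along the following lines. This is a sketch under stated assumptions: the synthetic data, the candidate values in param_dist, and the choice of five sampled configurations with three-fold cross-validation are illustrative and are not taken from the book.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsRegressor

# Synthetic data stands in for a real dataset (an assumption for illustration).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)

bag = BaggingRegressor(KNeighborsRegressor(), random_state=0)

# The nested-parameter prefix depends on the installed scikit-learn version:
# `estimator` in newer releases, `base_estimator` in older ones.
prefix = "estimator" if "estimator" in bag.get_params() else "base_estimator"

# Illustrative candidate values, not recommendations.
param_dist = {
    "n_estimators": [10, 20],
    "max_samples": [0.5, 1.0],
    "max_features": [0.5, 1.0],
    prefix + "__n_neighbors": [3, 5, 7],
}

# Sample 5 of the 24 possible combinations and score each with 3-fold CV.
search = RandomizedSearchCV(bag, param_dist, n_iter=5, cv=3, random_state=0)
search.fit(X, y)

best_params = search.best_params_
print(best_params)
```

The double-underscore key reaches through the ensemble to set n_neighbors on the wrapped KNN estimator, so the search tunes the ensemble and its base model together.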