Mastering Predictive Analytics with R

Product type: Book
Published: Jun 2015
ISBN-13: 9781783982806
Pages: 414
Edition: 1st Edition
Author: Rui Miguel Forte

Bagging


This chapter focuses on combining the results of different models to produce a single model that outperforms each individual model on its own. Bagging (short for bootstrap aggregation) is an intuitive procedure for combining multiple models trained on the same data set: it uses majority voting for classification models and the average prediction for regression models. We'll present the procedure for the classification case and later show how it extends easily to regression models.
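
The two combination rules can be illustrated with a minimal R sketch. The function names `majority_vote` and `average_prediction` and the prediction matrices are hypothetical, made up for illustration only; they are not taken from the book. Each matrix holds one row per observation and one column per model, and an odd number of models is assumed so that votes cannot tie.

```r
# Hypothetical 0/1 predictions from three classifiers for five observations
# (rows = observations, columns = models). Values are invented for illustration.
class_preds <- matrix(
  c(1, 0, 1,
    0, 0, 1,
    1, 1, 1,
    0, 0, 0,
    1, 0, 0),
  ncol = 3, byrow = TRUE
)

# Majority vote: predict class 1 when more than half of the models predict 1.
majority_vote <- function(preds) {
  as.integer(rowMeans(preds) > 0.5)
}

# Hypothetical numeric predictions from three regression models
# for two observations.
reg_preds <- matrix(
  c( 2.0,  2.5,  3.0,
    10.0, 12.0, 14.0),
  ncol = 3, byrow = TRUE
)

# Averaging: the combined prediction is the mean across models.
average_prediction <- function(preds) {
  rowMeans(preds)
}

majority_vote(class_preds)     # 1 0 1 0 0
average_prediction(reg_preds)  # 2.5 12.0
```

Taking the row mean of 0/1 votes and thresholding at 0.5 is one simple way to express a majority vote for binary labels; other encodings of the class label would need a different rule.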

Bagging procedure for binary classification

Inputs:

  • data: The input data frame containing the input features and a column with the binary output label

  • M: An integer, representing the number of models that we want to train

Output:

  • models: A set of M trained binary classifier models

Method:

1. Draw a random sample of size n with replacement, where n is the number of observations in the original data set. This means that some of the observations from the original training set will be repeated and some...
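
Step 1 can be sketched in R with the base function `sample()`. This is only an illustration of sampling with replacement, not the book's own code: the helper name `bootstrap_sample` and the toy data frame `toy` are hypothetical, and the remaining steps of the procedure are not shown here.

```r
# Draw one bootstrap sample: n row indices sampled with replacement,
# where n is the number of rows in `data`.
# `bootstrap_sample` is a hypothetical helper name used for illustration.
bootstrap_sample <- function(data) {
  n <- nrow(data)
  idx <- sample(n, size = n, replace = TRUE)
  data[idx, , drop = FALSE]
}

# Toy data frame (invented): one feature and a binary output label.
toy <- data.frame(x = 1:10, y = rep(c(0, 1), 5))

set.seed(1)  # fixed seed so the sample is reproducible
boot <- bootstrap_sample(toy)

nrow(boot)                 # same number of rows as the original data
table(boot$x)              # how many times each original row was drawn
setdiff(toy$x, boot$x)     # original rows that were not drawn at all
```

Calling `bootstrap_sample()` once per model would give each of the M models its own resampled training set.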