Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds
Arrow up icon
GO TO TOP
Machine Learning Quick Reference

You're reading from   Machine Learning Quick Reference Quick and essential machine learning hacks for training smart data models

Arrow left icon
Product type Paperback
Published in Jan 2019
Publisher Packt
ISBN-13 9781788830577
Length 294 pages
Edition 1st Edition
Languages
Arrow right icon
Author (1):
Arrow left icon
 Kumar Kumar
Author Profile Icon Kumar
Kumar
Arrow right icon
View More author details
Toc

Table of Contents (13) Chapters Close

Preface 1. Quantifying Learning Algorithms FREE CHAPTER 2. Evaluating Kernel Learning 3. Performance in Ensemble Learning 4. Training Neural Networks 5. Time Series Analysis 6. Natural Language Processing 7. Temporal and Sequential Pattern Discovery 8. Probabilistic Graphical Models 9. Selected Topics in Deep Learning 10. Causal Inference 11. Advanced Methods 12. Other Books You May Enjoy

Regularization

We have now got a fair understanding of what overfitting means when it comes to machine learning modeling. Just to reiterate, when the model learns the noise that has crept into the data, it is trying to learn the patterns that take place due to random chance, and so overfitting occurs. Due to this phenomenon, the model's generalization runs into jeopardy and it performs poorly on unseen data. As a result of that, the accuracy of the model takes a nosedive.

Can we combat this kind of phenomenon? The answer is yes. Regularization comes to the rescue. Let's figure out what it can offer and how it works.

Regularization is a technique that enables the model to not become complex to avoid overfitting.

Let's take a look at the following regression equation:

The loss function for this is as follows:

 

The loss function would help in getting the coefficients adjusted and retrieving the optimal one. In the case of noise in the training data, the coefficients wouldn't generalize well and would run into overfitting. Regularization helps get rid of this by making these estimates or coefficients drop toward 0.

Now, we will cover two types of regularization. In later chapters, the other types will be covered.

Ridge regression (L2)

Due to ridge regression, we need to make some changes to the loss function. The original loss function gets added by a shrinkage component:

Now, this modified loss function needs to be minimized to adjust the estimates or coefficients. Here, the lambda is tuning the parameter that regularizes the loss function. That is, it decides how much it should penalize the flexibility of the model. The flexibility of the model is dependent on the coefficients. If the coefficients of the model go up, the flexibility also goes up, which isn't a good sign for our model. Likewise, as the coefficients go down, the flexibility is restricted and the model starts to perform better. The shrinkage of each estimated parameter makes the model better here, and this is what ridge regression does. When lambda keeps going higher and higher, that is, λ → ∞, the penalty component rises, and the estimates start shrinking. However, when λ → 0, the penalty component decreases and starts to become an ordinary least square (OLS) for estimating unknown parameters in a linear regression.

Least absolute shrinkage and selection operator 

The least absolute shrinkage and selection operator (LASSO) is also called L1. In this case, the preceding penalty parameter is replaced by |βj|:

By minimizing the preceding function, the coefficients are found and adjusted. In this scenario, as lambda becomes larger, λ → ∞, the penalty component rises, and so estimates start shrinking and become 0 (it doesn't happen in the case of ridge regression; rather, it would just be close to 0).

Visually different images
CONTINUE READING
83
Tech Concepts
36
Programming languages
73
Tech Tools
Icon Unlimited access to the largest independent learning library in tech of over 8,000 expert-authored tech books and videos.
Icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Icon 50+ new titles added per month and exclusive early access to books as they are being written.
Machine Learning Quick Reference
You have been reading a chapter from
Machine Learning Quick Reference
Published in: Jan 2019
Publisher: Packt
ISBN-13: 9781788830577
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Modal Close icon
Modal Close icon