You're reading from Machine Learning with R - Third Edition

Product type: Book

Published: Apr 2019

Publisher: Packt

ISBN-13: 9781788295864

Pages: 458

Edition: 3rd Edition

Concepts: Machine Learning

Author: Brett Lantz

Table of Contents (18) Chapters

Machine Learning with R - Third Edition

Contributors

Preface

Other Books You May Enjoy

Leave a review - let other readers know what you think

Introducing Machine Learning

Managing and Understanding Data

Lazy Learning – Classification Using Nearest Neighbors

Probabilistic Learning – Classification Using Naive Bayes

Divide and Conquer – Classification Using Decision Trees and Rules

Forecasting Numeric Data – Regression Methods

Black Box Methods – Neural Networks and Support Vector Machines

Finding Patterns – Market Basket Analysis Using Association Rules

Finding Groups of Data – Clustering with k-means

Evaluating Model Performance

Improving Model Performance

Specialized Machine Learning Topics

Index

Chapter 11. Improving Model Performance

When a sports team falls short of meeting its goal—whether the goal is to obtain an Olympic gold medal, a league championship, or a world record time—it must search for possible improvements. Imagine that you're the team's coach. How would you spend your practice sessions? Perhaps you'd direct the athletes to train harder or train differently in order to maximize every bit of their potential. Or, you might emphasize better teamwork, utilizing the athletes' strengths and weaknesses more smartly.

Now imagine that you're training a world champion machine learning algorithm. Perhaps you hope to compete in data mining competitions such as those posted on Kaggle (http://www.kaggle.com/). Maybe you simply need to improve business results. Where do you begin? Although the context differs, the strategies used to improve a sports team's performance can also be used to improve the performance of statistical learners.

As the coach, it is your job to find the combination...

Tuning stock models for better performance

Some learning problems are well suited to the stock models presented in previous chapters. In such cases, it may not be necessary to spend much time iterating and refining the model; it may perform well enough as it is. On the other hand, some problems are inherently more difficult. The underlying concepts to be learned may be extremely complex, requiring an understanding of many subtle relationships, or the problem may be affected by random variation, making it difficult to define the signal within the noise.

Developing models that perform extremely well on difficult problems is every bit as much an art as it is a science. Sometimes a bit of intuition is helpful when trying to identify areas where performance can be improved. In other cases, finding improvements will require a brute-force, trial-and-error approach. Of course, the process of searching through numerous possible improvements can be aided by the use of automated programs.
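To make the brute-force, trial-and-error idea concrete, here is a minimal sketch of an automated search over one tuning parameter. It is written in Python using only the standard library, and it is not the chapter's own code: the toy data, the tiny nearest-neighbor classifier, and the helper names `knn_predict`, `loo_accuracy`, and `tune_k` are all assumptions invented for illustration. Each candidate value of k is scored with leave-one-out accuracy, and the best-scoring candidate is kept.

```python
from collections import Counter

# Hypothetical toy data: (feature, label) pairs invented for this sketch.
# Two of the points (2.5 and 7.4) are deliberately "noisy".
DATA = [(1.0, "a"), (1.4, "a"), (2.1, "a"), (2.5, "b"), (3.2, "a"),
        (6.0, "b"), (6.3, "b"), (7.1, "b"), (7.4, "a"), (8.2, "b")]


def knn_predict(train, x, k):
    """Predict a label for x by majority vote among its k nearest neighbors."""
    neighbors = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def loo_accuracy(data, k):
    """Leave-one-out accuracy: hold out each example once and predict it."""
    correct = 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]  # every example except the i-th
        correct += knn_predict(train, x, k) == label
    return correct / len(data)


def tune_k(data, candidates):
    """Brute-force search: score every candidate k and keep the best one."""
    scores = {k: loo_accuracy(data, k) for k in candidates}
    # max() returns the first best candidate, so ties go to the earlier k.
    best_k = max(candidates, key=lambda k: scores[k])
    return best_k, scores


best_k, scores = tune_k(DATA, [1, 3, 5])
print(best_k, scores)
```

On this toy data, the single-neighbor model is thrown off by the noisy points, while larger values of k smooth them out. The search itself knows nothing about why one setting works better; it simply tries each candidate and compares the scores, which is the kind of work that automated tuning programs take off the practitioner's hands.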

In Chapter 5, Divide and Conquer...

Improving model performance with meta-learning

As an alternative to increasing the performance of a single model, it is possible to combine several models to form a powerful team. Just as the best sports teams have players with complementary rather than overlapping skillsets, some of the best machine learning algorithms utilize teams of complementary models. Since each model brings a unique bias to a learning task, it may readily learn one subset of examples but have trouble with another. Therefore, by intelligently using the talents of several diverse team members, it is possible to create a strong team of multiple weak learners.
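The idea of a strong team built from weak learners can be sketched with a simple majority vote. The example below is in Python with only the standard library and rests on several stated assumptions: the data is invented, the three "models" are hand-written single-feature threshold rules standing in for trained weak learners, and the names `make_stump`, `majority_vote`, and `accuracy` are hypothetical. The data is deliberately contrived so that each rule errs on a different pair of examples.

```python
from collections import Counter

# Hypothetical toy data, invented for this sketch: three numeric features
# per example plus a binary label.
EXAMPLES = [
    ((0.2, 0.9, 0.8), 1),
    ((0.7, 0.1, 0.3), 0),
    ((0.9, 0.4, 0.7), 1),
    ((0.1, 0.6, 0.2), 0),
    ((0.8, 0.9, 0.3), 1),
    ((0.3, 0.2, 0.6), 0),
]


def make_stump(feature_index, threshold=0.5):
    """Build a weak learner that looks at a single feature only."""
    def predict(features):
        return int(features[feature_index] > threshold)
    return predict


def majority_vote(models, features):
    """Combine the models by returning their most common prediction."""
    votes = Counter(model(features) for model in models)
    return votes.most_common(1)[0][0]


def accuracy(predict, examples):
    """Fraction of examples whose label the predictor gets right."""
    hits = sum(predict(features) == label for features, label in examples)
    return hits / len(examples)


stumps = [make_stump(i) for i in range(3)]
solo_scores = [accuracy(stump, EXAMPLES) for stump in stumps]
team_score = accuracy(lambda f: majority_vote(stumps, f), EXAMPLES)
print(solo_scores, team_score)
```

Because no two rules make the same mistake here, every example is outvoted back to the correct label and the team scores higher than any single member. Real models rarely have errors this neatly separated, so the gain from voting depends on how diverse the team members actually are.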

This technique of combining and managing the predictions of multiple models falls into a wider set of meta-learning methods, which are techniques that involve learning how to learn. This includes anything from simple algorithms that gradually improve performance by iterating over design decisions—for instance, the automated parameter tuning used earlier in this chapter...

Summary

After reading this chapter, you should now know the base techniques that are used to win data mining and machine learning competitions. Automated tuning methods can assist with squeezing every bit of performance out of a single model. On the other hand, performance gains are also possible by creating groups of machine learning models that work together.

Although this chapter was designed to help you prepare competition-ready models, note that your fellow competitors have access to the same techniques. You won't be able to get away with stagnancy; therefore, continue to add proprietary methods to your bag of tricks. Perhaps you can bring unique subject-matter expertise to the table, or perhaps your strengths include an eye for detail in data preparation. In any case, practice makes perfect, so take advantage of open competitions to test, evaluate, and improve your own machine learning skillset.

In the next chapter—the last in this book—we'll take a bird's-eye look at ways to apply machine...