Tuning and Optimizing Models

In the last two chapters, we trained deep learning models for classification, regression, and image recognition tasks. In this chapter, we will discuss some important issues in managing deep learning projects. While this chapter may seem somewhat theoretical, any of the issues discussed here can derail your deep learning project if it is not correctly managed. We will look at how to choose evaluation metrics and how to estimate how well a deep learning model will perform before you begin modeling. Next, we will move on to data distribution and the mistakes often made when splitting data into partitions for training. Many machine learning projects fail in production because the data distribution is different from what the model was trained on. We will look at data augmentation, a valuable method to enhance your model's...

Evaluation metrics and evaluating performance

This section will discuss how to set up a deep learning project: how to select evaluation criteria and how to decide when a model is approaching optimal performance. We will also discuss how all deep learning models tend to overfit and how to manage the bias/variance trade-off. This will give guidelines on what to do when models have low accuracy.

Types of evaluation metric

Different evaluation metrics are used for classification and regression tasks. For classification, accuracy is the most commonly used evaluation metric. However, accuracy is only valid if the cost of errors is the same for all classes, which is not always...
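
To see why this matters, here is a minimal sketch in R (illustrative code, not from the book's code files) in which a classifier that always predicts the majority class still scores 95% accuracy, even though it never detects the minority class:

set.seed(42)
actual <- factor(c(rep("negative", 950), rep("positive", 50)))
predicted <- factor(rep("negative", 1000), levels = levels(actual))

# Confusion matrix: all 50 positive cases are missed
table(predicted, actual)

# Accuracy looks excellent despite the model being useless for positives
mean(predicted == actual)  # 0.95

# Recall for the positive class exposes the problem
sum(predicted == "positive" & actual == "positive") / sum(actual == "positive")  # 0

If false negatives are costly, a metric such as recall, precision, or a cost-weighted measure is a better choice than raw accuracy.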

Data preparation

Machine learning is about training a model to generalize from the cases it sees so that it can make predictions on unseen data. Therefore, the data used to train a deep learning model should be similar to the data that the model will see in production. However, at an early product stage, you may have little or no data to train a model, so what can you do? For example, a mobile app could include a machine learning model that predicts the subject of an image taken by the phone's camera. When the app is being written, there may not be enough data to train the model using a deep learning network. One approach would be to augment the dataset with images from other sources to train the deep learning network. However, you need to know how to manage this and how to deal with the uncertainty it introduces. Another approach is transfer learning, which we will cover in Chapter 11...
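
To make the idea of correct partitions concrete, here is a hedged sketch in R of splitting a dataset into train, validation, and test sets; the data frame df and the 70/15/15 ratios are illustrative, not from the book:

set.seed(42)
df <- data.frame(x = rnorm(1000), y = sample(0:1, 1000, replace = TRUE))

n <- nrow(df)
idx <- sample(n)  # shuffle the row indices
train <- df[idx[1:floor(0.70 * n)], ]
valid <- df[idx[(floor(0.70 * n) + 1):floor(0.85 * n)], ]
test  <- df[idx[(floor(0.85 * n) + 1):n], ]

# Sanity check: the class balance should be similar across partitions,
# mirroring the distribution the model will see in production
sapply(list(train = train, valid = valid, test = test), function(d) mean(d$y))

The sanity check at the end reflects the point above: if the test set's distribution does not resemble production data, the evaluation will be misleading.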

Data augmentation

One approach to increasing the accuracy of a model, regardless of the amount of data you have, is to create artificial examples based on existing data. This is called data augmentation. Data augmentation can also be applied at test time to improve prediction accuracy.

Using data augmentation to increase the training data

We are going to apply data augmentation to the MNIST dataset that we used in previous chapters. The code for this section is in Chapter6/explore.Rmd if you want to follow along. In Chapter 5, Image Classification Using Convolutional Neural Networks, we plotted some examples from the MNIST data, so we won't repeat that code here. It is included in the code file, and you can also refer back...
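
As a flavor of what such augmentation can look like, the following is a minimal sketch, not the Chapter6/explore.Rmd code; it assumes images are stored as 28 x 28 grayscale matrices, as MNIST digits are, and creates new training examples by shifting an image a few pixels:

# Shift an image by (dx, dy) pixels, padding the vacated edge with zeros
shift_image <- function(img, dx = 0, dy = 0) {
  out <- matrix(0, nrow = nrow(img), ncol = ncol(img))
  src_rows <- max(1, 1 - dy):min(nrow(img), nrow(img) - dy)
  src_cols <- max(1, 1 - dx):min(ncol(img), ncol(img) - dx)
  out[src_rows + dy, src_cols + dx] <- img[src_rows, src_cols]
  out
}

# Toy "digit": a bright square in the centre of a 28 x 28 image
img <- matrix(0, 28, 28)
img[10:18, 10:18] <- 1

# Four augmented copies, each shifted 2 pixels in a different direction
augmented <- list(shift_image(img, dx = 2), shift_image(img, dx = -2),
                  shift_image(img, dy = 2), shift_image(img, dy = -2))

A shifted digit is still the same digit, so each shift yields a new labeled training example for free.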

Tuning hyperparameters

All machine learning algorithms have hyperparameters, that is, settings that change how they operate. These hyperparameters can improve the accuracy of a model or reduce its training time. We have seen some of these hyperparameters in previous chapters, particularly in Chapter 3, Deep Learning Fundamentals, where we looked at the hyperparameters that can be set in the mx.model.FeedForward.create function. The techniques in this section can help us find better values for the hyperparameters.

Tuning hyperparameters is not a magic bullet; if the raw data quality is poor or there is not enough data to support training, then tuning will only get you so far. In these cases, you may need to acquire additional variables/features that can be used as predictors, additional cases, or both.
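
As an illustration of one such technique, here is a hedged sketch of random search over two hyperparameters. The train_and_evaluate function is a hypothetical placeholder standing in for training a network (for example, with mx.model.FeedForward.create) and scoring it on a validation set:

set.seed(42)
n_trials <- 10
results <- data.frame(learning_rate = numeric(n_trials),
                      batch_size = numeric(n_trials),
                      val_accuracy = numeric(n_trials))

train_and_evaluate <- function(lr, batch_size) {
  # Placeholder: train with these settings and return validation accuracy.
  # A random score is used here purely for illustration.
  runif(1)
}

for (i in seq_len(n_trials)) {
  lr <- 10 ^ runif(1, min = -4, max = -1)  # sample on a log scale
  bs <- sample(c(32, 64, 128, 256), 1)
  results[i, ] <- c(lr, bs, train_and_evaluate(lr, bs))
}

results[which.max(results$val_accuracy), ]  # best settings found

Sampling the learning rate on a log scale is a common choice because its useful values span several orders of magnitude.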

...

Use case—using LIME for interpretability

Deep learning models are known to be difficult to interpret. Some approaches to model interpretability, including LIME, allow us to gain insight into how a model came to its conclusions. Before we demonstrate LIME, I will show how different data distributions and/or data leakage can cause problems when building deep learning models. We will reuse the deep learning churn model from Chapter 4, Training Deep Prediction Models, but we are going to make one change to the data: we will introduce a bad variable that is highly correlated with the y value. We will only include this variable in the data used to train and evaluate the model. A separate test set from the original data will be kept to represent the data the model will see in production; this set will not contain the bad variable. The creation of this bad variable...
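
As a preview of the LIME workflow in R, here is a hedged sketch using the lime package with a stand-in random forest classifier on the iris data; the book's use case applies LIME to the churn model instead, and lime needs custom model_type() and predict_model() methods for frameworks it does not support out of the box:

library(lime)
library(caret)  # lime supports caret models natively

set.seed(42)
model <- train(Species ~ ., data = iris, method = "rf")  # needs randomForest

# Build an explainer from the training data and the fitted model
explainer <- lime(iris[, 1:4], model)

# Explain a few predictions: which features pushed each prediction
# towards its class, and by how much
explanation <- explain(iris[c(1, 51, 101), 1:4], explainer,
                       n_labels = 1, n_features = 4)
plot_features(explanation)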

Summary

This chapter covered topics that are critical to the success of deep learning projects. These included the different types of evaluation metric that can be used to assess a model. We looked at some issues that can come up in data preparation, including what to do when you have only a small amount of data to train on and how to split the data correctly, that is, how to create proper train, test, and validation datasets. We looked at two important issues that can cause a model to perform poorly in production: different data distributions and data leakage. We saw how data augmentation can be used to improve an existing model by creating artificial data, and we looked at tuning hyperparameters to improve the performance of a deep learning model. We closed the chapter by examining a use case where we simulated a problem with different data distributions/data leakage and...

You have been reading a chapter from R Deep Learning Essentials - Second Edition (Packt, August 2018, ISBN-13: 9781788992893).

Authors (2)

Mark Hodnett

Mark Hodnett is a data scientist with over 20 years of industry experience in software development, business intelligence systems, and data science. He has worked in a variety of industries, including CRM systems, retail loyalty, IoT systems, and accountancy. He holds a master's in data science and an MBA. He works in Cork, Ireland, as a senior data scientist with AltViz.

Joshua F. Wiley

Joshua F. Wiley is a lecturer at Monash University, conducting quantitative research on sleep, stress, and health. He earned his Ph.D. from the University of California, Los Angeles and completed postdoctoral training in primary care and prevention. In statistics and data science, Joshua focuses on biostatistics and is interested in reproducible research and graphical displays of data and statistical models. He develops or co-develops a number of R packages including Varian, a package to conduct Bayesian scale-location structural equation models, and MplusAutomation, a popular package that links R to the commercial Mplus software.