Chapter 8: Evaluating and Optimizing Models

It is now time to learn how to evaluate and optimize machine learning models. During the process of modeling, or even after model completion, you might want to understand how your model is performing. Each type of model has its own set of metrics that can be used to evaluate performance, and that is what we are going to study in this chapter.

Apart from model evaluation, as a data scientist, you might also need to improve your model's performance by tuning the hyperparameters of your algorithm. We will take a look at some nuances of this modeling task.

In this chapter, we will cover the following topics:

  • Introducing model evaluation
  • Evaluating classification models
  • Evaluating regression models
  • Model optimization

Alright, let's do it!

Introducing model evaluation

There are several different scenarios in which we might want to evaluate model performance. Some of them are as follows:

  • You are creating a model and testing different approaches and/or algorithms. Therefore, you need to compare these models to select the best one.
  • You have just completed your model and you need to document your work, which includes reporting the performance metrics that the model achieved during the modeling phase.
  • Your model is running in a production environment and you need to track its performance. If you encounter model drift, then you might want to retrain the model.

    Important note

    The term model drift is used to refer to the problem of model deterioration. When you are building a machine learning model, you must use data to train the algorithm. This set of data is known as training data, and it reflects the business rules at a particular point in time. If these business rules change over time, your...

Evaluating classification models

Classification problems are among the most traditional classes of problems that you might face, either during the exam or during your journey as a data scientist. A very important artifact that you might want to generate during classification model evaluation is known as a confusion matrix.

A confusion matrix compares your model predictions against the real values of each class under evaluation. Figure 8.1 shows what a confusion matrix looks like in a binary classification problem:

Figure 8.1 – A confusion matrix

We find the following components in a confusion matrix:

  • TP: This is the number of True Positive cases. Here, we have to count the number of cases that have been predicted as true and are, indeed, true. For example, in a fraud detection system, this would be the number of fraudulent transactions that were correctly predicted as fraud.
  • TN: This is the number of True Negative cases. Here, we have...
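
To make these components concrete, here is a minimal sketch (assuming scikit-learn is installed; the label vectors are hypothetical) that builds a confusion matrix for a binary fraud-detection example and extracts each component from it:

    from sklearn.metrics import confusion_matrix

    # Hypothetical ground-truth labels and model predictions (1 = fraud, 0 = legitimate)
    y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]

    # For binary labels [0, 1], scikit-learn lays the matrix out as
    # [[TN, FP],
    #  [FN, TP]]
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")  # TP=3, TN=5, FP=1, FN=1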

Evaluating regression models

Regression models are quite different from classification models since the outcome of the model is a continuous number. Therefore, the metrics around regression models aim to monitor the difference between real and predicted values.

The simplest way to check the difference between a predicted value (yhat) and its actual value (y) is by performing a simple subtraction operation, where the error of a single prediction is equal to the absolute value of yhat – y.

Since we usually have to evaluate the error of each prediction, i, we take the mean value of these absolute errors. The resulting metric is known as the Mean Absolute Error (MAE). The following formula shows how it can be formally defined, where n is the number of predictions, y_i is the actual value of prediction i, and ŷ_i is the predicted value:
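
    \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{y}_i - y_i \right|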

Sometimes, you might want to penalize bigger errors over smaller errors. To achieve this, you can use another metric, which is known as the Mean Squared Error (MSE). MSE will square each error and return the mean value.
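
Using the same notation, MSE can be formally defined as follows:

    \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2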

By squaring errors,...
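
As a quick illustration of how squaring magnifies large errors, here is a minimal sketch computing both metrics with NumPy (the value arrays are hypothetical):

    import numpy as np

    # Hypothetical actual values and model predictions
    y = np.array([10.0, 12.0, 15.0, 20.0])
    y_hat = np.array([11.0, 12.0, 14.0, 28.0])  # one large error (28 vs 20)

    errors = y_hat - y
    mae = np.mean(np.abs(errors))  # every error contributes proportionally
    mse = np.mean(errors ** 2)     # the large error dominates after squaring

    print(f"MAE = {mae:.2f}")  # (1 + 0 + 1 + 8) / 4 = 2.50
    print(f"MSE = {mse:.2f}")  # (1 + 0 + 1 + 64) / 4 = 16.50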

Model optimization

As you know, understanding evaluation metrics is very important in order to measure your model's performance and document your work. In the same way, when we want to optimize our current models, evaluation metrics also play a very important role in defining the baseline performance that we want to challenge.

The process of model optimization consists of finding the best configuration (also known as hyperparameters) of the machine learning algorithm for a particular data distribution. We don't want to find hyperparameters that overfit the training data in the same way that we don't want to find hyperparameters that underfit the training data.

You learned about overfitting and underfitting in Chapter 1, Machine Learning Fundamentals. In the same chapter, you also learned how to avoid these two types of modeling issues.

In this section, we will learn about some techniques that you can use to find the best configuration for a particular algorithm...
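
For instance, one widely used technique is grid search, which exhaustively evaluates every combination of candidate hyperparameter values using cross-validation. Here is a minimal sketch, assuming scikit-learn and a random forest classifier (the dataset and parameter grid are hypothetical placeholders):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV

    # Hypothetical dataset standing in for your training data
    X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

    # Hypothetical search space; in practice, base it on the algorithm's documentation
    param_grid = {
        "n_estimators": [100, 300],
        "max_depth": [5, 10, None],
    }

    # Cross-validation helps avoid picking hyperparameters that merely
    # overfit a single train/validation split
    search = GridSearchCV(
        RandomForestClassifier(random_state=42),
        param_grid,
        scoring="f1",
        cv=5,
    )
    search.fit(X, y)

    print(search.best_params_)
    print(f"Best cross-validated F1: {search.best_score_:.3f}")

Note that the evaluation metric passed to the search (here, the F1 score) is exactly the baseline metric discussed above, which ties model optimization back to model evaluation.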

Summary

In this chapter, you learned about the main metrics for model evaluation. We first started with the metrics for classification problems and then we moved on to the metrics for regression problems.

In terms of classification metrics, you have been introduced to the well-known confusion matrix, which is probably the most important artifact for evaluating classification models.

Aside from knowing what true positive, true negative, false positive, and false negative are, we have learned how to combine these components to extract other metrics, such as accuracy, precision, recall, the F1 score, and AUC.
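
As a quick reference, these metrics combine the confusion matrix components as follows:

    \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

    \mathrm{Precision} = \frac{TP}{TP + FP}

    \mathrm{Recall} = \frac{TP}{TP + FN}

    \mathrm{F1} = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}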

We went even deeper and learned about ROC curves, as well as precision-recall curves. We learned that we can use ROC curves to evaluate fairly balanced datasets and precision-recall curves for moderately to heavily imbalanced datasets.

By the way, when you are dealing with imbalanced datasets, remember that using accuracy might not be a good idea: on a dataset where 99% of the cases are negative, a model that always predicts the negative class reaches 99% accuracy without identifying a single positive case.

In terms...

Questions

  1. You are working as a data scientist for a pharmaceutical company. You are collaborating with other teammates to create a machine learning model to classify certain types of diseases in image exams. The company wants to prioritize being correct when a positive case is predicted, even at the cost of wrongly returning some false negatives. Which metric would you use to optimize the underlying model?

    a. Recall

    b. Precision

    c. R-squared

    d. RMSE

    Answer

    The correct answer is b, Precision. In this scenario, the company prefers a higher probability of being right on positive predictions, at the cost of wrongly classifying some positive cases as negative. Technically, they prefer to increase precision at the cost of reducing recall.

  2. You are working as a data scientist for a pharmaceutical company. You are collaborating with other teammates to create a machine learning model to classify certain types of diseases on image exams. The company wants to prioritize the capture of positive cases, even if they have to wrongly return...