You're reading from Data Science Projects with Python - Second Edition

Product type Book

Published in Jul 2021

Publisher Packt

ISBN-13 9781800564480

Pages 432 pages

Edition 2nd Edition

Languages Python

Concepts Data Science

Author (1): Stephen Klosterman

Table of Contents (9) Chapters

Preface

1. Data Exploration and Cleaning

2. Introduction to Scikit-Learn and Model Evaluation

3. Details of Logistic Regression and Feature Exploration

4. The Bias-Variance Trade-Off

5. Decision Trees and Random Forests

6. Gradient Boosting, XGBoost, and SHAP Values

7. Test Set Analysis, Financial Insights, and Delivery to the Client

Appendix

7. Test Set Analysis, Financial Insights, and Delivery to the Client

Overview

This chapter presents several techniques for analyzing a model's test set to derive insights into likely future model performance. These techniques include the same model performance metrics we've already calculated, such as the ROC AUC, as well as new kinds of visualizations, such as the sloping of default risk by bins of predicted probability and the calibration of predicted probability. After reading this chapter, you will be able to bridge the gap between the theoretical metrics of machine learning and the financial metrics of the business world. You will be able to identify key insights while estimating the financial impact of a model and provide guidance to the client on how to realize this impact. We close with a discussion of the key elements to consider when delivering and deploying a model, such as the format of delivery and ways to monitor the model as it is being used.

...

Introduction

In the previous chapter, we used XGBoost to push model performance even higher than all our previous efforts and learned how to explain model predictions using SHAP values. Now, we will consider model building to be complete and address the remaining issues that need attention before delivering the model to the client. The key elements of this chapter are analysis of the test set, including financial analysis, and things to consider when delivering a model to a client who wants to use it in the real world.

We look at the test set to get an idea of how well the model will perform in the future. By calculating metrics we already know, like the ROC AUC, but now on the test set, we can gain confidence that our model will be useful for new data. We'll also learn some intuitive ways to visualize the power of the model for grouping customers into different levels of risk of default, such as a decile chart.
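As a rough illustration of these two ideas, the sketch below computes the ROC AUC on held-out data and the observed default rate in each decile of predicted probability. It uses synthetic labels and predictions as a stand-in for the case study's test set, so the variable names (`y_test`, `y_pred_proba`) and the data are assumptions for illustration only.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for test set labels and predicted probabilities
# (hypothetical data, not the case study's).
rng = np.random.default_rng(0)
n = 1000
y_pred_proba = rng.uniform(0, 1, size=n)
# Draw labels so that the true default rate rises with the prediction.
y_test = (rng.uniform(0, 1, size=n) < y_pred_proba).astype(int)

# ROC AUC computed only on data held out from model fitting.
test_auc = roc_auc_score(y_test, y_pred_proba)

# Decile chart data: sort accounts by predicted probability, split them
# into ten equal-sized groups, and take the observed default rate in each.
order = np.argsort(y_pred_proba)
decile_groups = np.array_split(y_test[order], 10)
decile_default_rates = np.array([group.mean() for group in decile_groups])

print(f"Test ROC AUC: {test_auc:.3f}")
for i, rate in enumerate(decile_default_rates, start=1):
    print(f"Decile {i}: observed default rate {rate:.2f}")
```

If the model separates risk well, the observed default rate should climb from the lowest decile to the highest; plotting `decile_default_rates` as a bar or line chart gives the decile chart.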

Your client will likely appreciate the efforts you made in...

Review of Modeling Results

To develop a binary classification model that meets the business requirements of our client, we have now tried several modeling techniques with varying degrees of success. In the end, we'd like to choose the best-performing model for further analysis and presentation to our client. However, it is also good to communicate the other options we explored, demonstrating a thoroughly researched project.

Here, we review the different models that we tried for the case study problem, the hyperparameters that we needed to tune, and the results from cross-validation, or the validation set in the case of XGBoost. We only include the work we did using all possible features, not the earlier exploratory models where we used only one or two features:

Figure 7.1: Summary of modeling activities with case study data

When presenting results to the client, you should be prepared to interpret them for business partners...

Model Performance on the Test Set

We already have some idea of the out-of-sample performance of the XGBoost model, from the validation set. However, the validation set was used in model fitting, via early stopping. The most rigorous estimate of expected future performance we can make should be created with data that was not used at all for model fitting. This was the reason for reserving a test dataset from the model building process.

You may notice that we did examine the test set to some extent already, for example, in the first chapter when assessing data quality and doing data cleaning. The gold standard for predictive modeling is to set aside a test set at the very beginning of a project and not examine it at all until the model is finished. This is the easiest way to make sure that none of the knowledge from the test set has "leaked" into the training set during model development. When this happens, it opens up the possibility that the test set is no longer a realistic...

Financial Analysis

The model performance metrics we have calculated so far were based on abstract measures that could be applied to analyze any classification model: how accurate a model is, how skillful a model is at identifying true positives relative to false positives at different thresholds (ROC AUC), the correctness of positive predictions (precision), or intuitive measures such as sloping risk. These metrics are widely used within the machine learning community and are important for understanding the basic workings of a model. However, for the application of a model to business use cases, we can't always directly use such performance metrics to create a strategy for how to use the model to guide business decisions or figure out how much value a model is expected to create. To go the extra mile and connect the mathematical world of predicted probabilities and thresholds to the business world of costs and benefits, a financial analysis...
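One way to connect thresholds to costs and benefits is to sweep candidate thresholds and compute the net savings at each. The sketch below is a minimal illustration on synthetic data. The cost per intervention, the savings per prevented default, and the intervention success rate are hypothetical numbers chosen for the example, not figures from the case study, and the intervention itself is an assumed business action.

```python
import numpy as np

# Synthetic stand-in for test set labels and predicted probabilities.
rng = np.random.default_rng(1)
n = 2000
y_pred_proba = rng.uniform(0, 1, size=n)
y_test = (rng.uniform(0, 1, size=n) < y_pred_proba).astype(int)

# Hypothetical business assumptions (illustrative values only).
cost_per_intervention = 100.0          # cost of acting on one flagged account
savings_per_prevented_default = 500.0  # loss avoided when a default is prevented
intervention_success_rate = 0.5        # fraction of flagged defaults prevented

thresholds = np.linspace(0, 1, 101)
net_savings = np.empty_like(thresholds)
for i, threshold in enumerate(thresholds):
    # Accounts at or above the threshold are flagged for intervention.
    flagged = y_pred_proba >= threshold
    true_positives = np.sum(flagged & (y_test == 1))
    total_cost = flagged.sum() * cost_per_intervention
    total_savings = (true_positives * intervention_success_rate
                     * savings_per_prevented_default)
    net_savings[i] = total_savings - total_cost

best = int(np.argmax(net_savings))
best_threshold = thresholds[best]
print(f"Best threshold: {best_threshold:.2f}")
print(f"Net savings at best threshold: {net_savings[best]:.0f}")
```

Under these assumptions, flagging an account pays off only when its expected savings exceed the intervention cost, so the best threshold depends directly on the cost and savings figures the client supplies.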

Final Thoughts on Delivering a Predictive Model to the Client

We have now completed the modeling activities and also created a financial analysis to indicate to the client how they can use the model. While we have completed the essential intellectual contributions that are the data scientist's responsibility, it is necessary to agree with the client on the form in which all these contributions will be delivered.

A key contribution is the predictive capability embodied in the trained model. Assuming the client can work with the trained model object we created with XGBoost, this model could be saved to disk as we've done and sent to the client. Then, the client would be able to use it within their workflow. This pathway to model delivery may require the data scientist to work with engineers in the client's organization, to deploy the model within the client's infrastructure.
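A minimal sketch of this save-and-reload pathway is shown below. It uses Python's `pickle` module and a scikit-learn `LogisticRegression` fitted on synthetic data as a stand-in for the trained XGBoost model. The file name `model.pkl` and the temporary directory are assumptions for illustration.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic training data and a simple stand-in model (not the case
# study's XGBoost model).
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)
model = LogisticRegression().fit(X, y)

# Save the trained model object to disk (hypothetical file name).
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# On the receiving side: load the model and use it to make predictions.
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)

# Check that the reloaded model reproduces the original predictions.
same_predictions = np.allclose(
    model.predict_proba(X), loaded_model.predict_proba(X)
)
print(f"Reloaded model matches original: {same_predictions}")
```

A pickled model generally needs compatible versions of Python and the modeling library on the receiving side, and pickle files should only be loaded from trusted sources, so these details are worth agreeing on with the client's engineers.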

Alternatively, it may be necessary to express the model as a mathematical equation...

Summary

In this chapter, you learned several analysis techniques to provide insight into model performance, such as decile and equal-interval charts of default rate by model prediction bin, as well as how to investigate the quality of model calibration. It's good to derive these insights, as well as calculate metrics such as the ROC AUC, using the model test set, since this is intended to represent how the model might perform in the real world on new data.
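To make the calibration check concrete, the sketch below groups predictions into ten equal-interval bins and compares the mean predicted probability with the observed default rate in each bin. It also summarizes the gaps as an expected calibration error, which is one common summary and is an assumption here rather than necessarily the measure used in the chapter. The data is synthetic and well calibrated by construction.

```python
import numpy as np

# Synthetic stand-in for test set labels and predicted probabilities.
rng = np.random.default_rng(3)
n = 5000
y_pred_proba = rng.uniform(0, 1, size=n)
y_test = (rng.uniform(0, 1, size=n) < y_pred_proba).astype(int)

# Ten equal-interval bins on [0, 1]; digitizing against the interior
# edges assigns each prediction a bin index from 0 to 9.
n_bins = 10
bin_edges = np.linspace(0, 1, n_bins + 1)
bin_index = np.digitize(y_pred_proba, bin_edges[1:-1])

expected_calibration_error = 0.0
for b in range(n_bins):
    in_bin = bin_index == b
    if not in_bin.any():
        continue  # skip empty bins
    mean_predicted = y_pred_proba[in_bin].mean()
    observed_rate = y_test[in_bin].mean()
    # Weight each bin's gap by the fraction of samples it contains.
    expected_calibration_error += in_bin.mean() * abs(mean_predicted - observed_rate)
    print(f"Bin {b}: mean predicted {mean_predicted:.2f}, "
          f"observed default rate {observed_rate:.2f}")

print(f"Expected calibration error: {expected_calibration_error:.3f}")
```

For a well-calibrated model, the mean predicted probability and the observed default rate stay close in every bin, so the summary error is small.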

We also saw how to conduct a financial analysis of model performance. While we left this to the end of the book, the costs and savings associated with the decisions to be guided by the model should be understood from the beginning of a typical project. These allow the data scientist to work toward a tangible goal in terms of increased profit or savings. A key step in this process, for binary classification models, is to choose a threshold of predicted probability at which to declare a positive...