Interpretation Methods for Multivariate Forecasting and Sensitivity Analysis

Throughout this book, we have learned about various methods we can use to interpret supervised learning models. They can be quite effective at assessing models while also uncovering their most influential predictors and their hidden interactions. But as the term supervised learning suggests, these methods can only leverage known samples, along with permutations based on those samples' distributions. However, when these samples represent the past, things can get tricky! As the Nobel laureate in physics Niels Bohr famously quipped, "Prediction is very difficult, especially if it's about the future."

Indeed, when you see data points fluctuating in a time series, they may appear to be rhythmically dancing in a predictable pattern – at least in the best-case scenarios. Like a dancer moving to a beat, every repetitive movement (or frequency) can be attributed to seasonal patterns...

Over the last thirteen chapters, we have explored the field of Machine Learning (ML) interpretability. As stated in the preface, it's a broad area of research, much of which hasn't yet left the lab to see widespread use, and this book makes no attempt to cover all of it. Instead, the objective is to present various interpretability tools in sufficient depth to serve as a starting point for beginners and to complement the knowledge of more advanced readers. This chapter will summarize what we've learned in the context of the ecosystem of ML interpretability methods, and then speculate on what's to come next!

These are the main topics we are going to cover in this chapter:

  • Understanding the current landscape of ML interpretability
  • Speculating on the future of ML interpretability

Understanding the current landscape of ML interpretability

First, we will provide some context on how this book relates to the main goals of ML interpretability and how practitioners can start applying its methods to achieve those broad goals. Then, we'll discuss the current growth areas in interpretability research.

Tying everything together!

As discussed in Chapter 1, Interpretation, Interpretability, and Explainability; and Why Does It All Matter?, there are three main themes when talking about ML interpretability: Fairness, Accountability, and Transparency (FAT), and each of these presents a series of concerns (see Figure 14.1). I think we can agree these are all desirable properties for a model! Indeed, these concerns all present opportunities to improve Artificial Intelligence (AI) systems. These improvements start with leveraging model interpretation methods to evaluate models, confirm or dispute assumptions, and find problems.

What your aim is will depend on what...

Speculating on the future of ML interpretability

I'm used to hearing the metaphor of this period being the "Wild West of AI", or worse, an "AI Gold Rush"! It conjures images of unexplored and untamed territory being eagerly conquered, or worse, civilized. Yet, in the 19th century, the western United States was not too different from other regions on the planet, and it had already been inhabited by Native Americans for millennia, so the metaphor doesn't quite work. Predicting with the accuracy and confidence that we can achieve with ML would have spooked our ancestors, and it's not a "natural" position for us humans. It's more akin to flying than to exploring unknown land.

The article Toward the Jet Age of machine learning (linked in the Further reading section at the end of this chapter) presents a much more fitting metaphor of AI being like the dawn of aviation. It's new and exciting, and people still marvel at what we can do from down below...

Assessing time series models with traditional interpretation methods

A time series regression model can be evaluated as you would evaluate any regression model; that is, using metrics derived from the mean squared error or the R-squared score. There are, of course, cases in which you will need a metric based on medians, logs, deviances, or absolute values, but the models in this chapter don't require any of those.
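For instance, here is a minimal sketch of computing such metrics with scikit-learn, where y_test and y_pred are placeholders for the observed and predicted traffic volumes:

    import numpy as np
    from sklearn.metrics import mean_squared_error, r2_score

    # RMSE is derived from the mean squared error and is expressed
    # in the target's own units, which makes it easy to interpret
    rmse = np.sqrt(mean_squared_error(y_test, y_pred))
    # R-squared: the proportion of the target's variance explained
    r2 = r2_score(y_test, y_pred)
    print(f'RMSE: {rmse:,.0f}   R-squared: {r2:.3f}')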

Using standard regression metrics

The evaluate_reg_mdl function can evaluate the model, output some standard regression metrics, and plot them. The parameters for this function are the fitted model (lstm_traffic_mdl), X_train (gen_train), X_test (gen_test), y_train, and y_test.

Optionally, we can specify a y_scaler so that the model is evaluated with the labels inverse-transformed, which makes the plot and root mean square error (RMSE) much easier to interpret. Another optional parameter that is nevertheless necessary in this case is y_truncate=True because our y_train...
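Putting those parameters together, a call might look like the following sketch. The mldatasets import path refers to the book's companion utility package, and the exact keyword names are assumptions based on the parameters described above:

    import mldatasets  # the book's companion utility package

    # Evaluates the fitted LSTM: prints standard regression metrics and
    # plots predicted vs. observed values. y_scaler inverse-transforms the
    # labels so the plot and RMSE are in original units; y_truncate=True
    # trims y_train/y_test to match the generators' shorter output.
    mldatasets.evaluate_reg_mdl(
        lstm_traffic_mdl,   # fitted model
        gen_train,          # X_train (training-sequence generator)
        gen_test,           # X_test (test-sequence generator)
        y_train,
        y_test,
        y_scaler=y_scaler,  # optional, but easier to interpret
        y_truncate=True     # optional, but necessary here
    )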

Generating LSTM attributions with integrated gradients

We first learned about integrated gradients (IG) in Chapter 7, Visualizing Convolutional Neural Networks. Unlike the other gradient-based attribution methods studied in that chapter, path-integrated gradients is neither contingent on convolutional layers nor limited to classification problems.

In fact, since it computes the gradients of the output with respect to the inputs, averaged along a path, the input and output could be anything! It is common to use integrated gradients with Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), like the one we are interpreting in this chapter. Admittedly, most of the IG LSTM examples you'll find online involve an embedding layer and an NLP classifier, but IG can be used just as effectively with LSTMs that process sounds or even genetic data!
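To make the path-integral idea concrete, here is a minimal from-scratch sketch of IG for a Keras regression model such as our LSTM. This is not the book's exact code: the all-zeros baseline, the 50-step path, and the single-window input shape are illustrative assumptions:

    import numpy as np
    import tensorflow as tf

    def integrated_gradients(model, x, baseline=None, steps=50):
        # x: one input window of shape (timesteps, features)
        x = tf.cast(x, tf.float32)
        if baseline is None:
            baseline = tf.zeros_like(x)  # a common default baseline
        baseline = tf.cast(baseline, tf.float32)
        # Straight-line path from the baseline to the input
        alphas = tf.reshape(tf.linspace(0.0, 1.0, steps + 1), (-1, 1, 1))
        path = baseline[None, ...] + alphas * (x - baseline)[None, ...]
        with tf.GradientTape() as tape:
            tape.watch(path)
            preds = model(path)  # one prediction per interpolated input
        # Gradients of the output with respect to the inputs, along the path
        grads = tape.gradient(preds, path)
        # Trapezoidal approximation of the average gradient
        avg_grads = tf.reduce_mean((grads[:-1] + grads[1:]) / 2.0, axis=0)
        # Scale by the difference between the input and the baseline
        return ((x - baseline) * avg_grads).numpy()

For a single forecast window, integrated_gradients(lstm_traffic_mdl, X_test[0]) would then yield a (timesteps, features) matrix of attributions.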

The integrated gradient explainer and the explainers that we will use moving forward can access any part of the traffic dataset....

Computing global and local attributions with SHAP’s KernelExplainer

Permutation methods make changes to the input to assess how much difference they make to a model's output. We first discussed this in Chapter 4, Global Model-Agnostic Interpretation Methods, but if you recall, there's a coalitional framework for performing these permutations that produces the average marginal contribution of each feature across different coalitions of features. The outcome of this process is Shapley values, which have essential mathematical properties such as additivity and symmetry. Unfortunately, Shapley values are costly to compute for all but the smallest datasets, so the SHAP library offers approximation methods. One of these is KernelExplainer, which we also explained in Chapter 4 and used in Chapter 5, Local Model-Agnostic Interpretation Methods. It approximates the Shapley values with a weighted local linear regression, just like LIME does.
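As a hedged sketch of what this looks like in practice (the model and data names below are placeholders, not the book's exact code): KernelExplainer only needs a prediction function, but since LSTMs expect 3D input of shape (samples, timesteps, features), we wrap the model with a function that restores flattened 2D rows to 3D first:

    import numpy as np
    import shap

    def predict_fn(X_flat):
        # Reshape flattened rows back into sequence windows for the LSTM
        return model.predict(
            X_flat.reshape((-1, timesteps, n_features))
        ).flatten()

    # A small background sample keeps the permutations tractable
    background = X_flat_train[
        np.random.choice(len(X_flat_train), 50, replace=False)
    ]
    explainer = shap.KernelExplainer(predict_fn, background)
    # Approximate Shapley values for a handful of test windows
    shap_values = explainer.shap_values(X_flat_test[:20], nsamples=100)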

Why use...

Identifying influential features with factor prioritization

The Morris method is one of several global sensitivity analysis methods that range from simple fractional factorial designs to complicated Monte Carlo filtering. Morris sits somewhere in the middle of this spectrum and falls into two categories. It uses one-at-a-time sampling, which means that only one value changes between consecutive simulations. It's also an Elementary Effects (EE) method, which means that it doesn't quantify the exact effect of a factor on the model's output but rather gauges each factor's importance and its relationship with other factors. By the way, factor is just another word for a feature or variable that's commonly used in applied statistics. To be consistent with the related theory, we will use this word in this section and the next.
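As an illustration (not the book's exact code), here's how Morris factor prioritization might look with the SALib library; the problem definition and the predict_traffic wrapper are hypothetical stand-ins for the chapter's actual setup:

    import numpy as np
    from SALib.sample.morris import sample as morris_sample
    from SALib.analyze import morris

    # Hypothetical factor definition; bounds must match the data's ranges,
    # and binary/ordinal factors are treated as continuous for simplicity
    problem = {
        'num_vars': 4,
        'names': ['hr', 'temp', 'is_holiday', 'dow'],
        'bounds': [[0, 23], [-30, 40], [0, 1], [0, 6]]
    }

    # One-at-a-time trajectories: only one factor changes between
    # consecutive simulations
    X = morris_sample(problem, N=100, num_levels=4)
    Y = np.array([predict_traffic(x) for x in X])  # placeholder wrapper

    # mu_star gauges a factor's importance; sigma gauges its
    # interactions and/or non-linearity
    Si = morris.analyze(problem, X, Y, num_levels=4)
    print(Si['mu_star'], Si['sigma'])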

Another property of Morris is that it’s less computationally expensive than the variance-based methods we will study next. It can provide more insights than simpler and less costly...

Quantifying uncertainty and cost sensitivity with factor fixing

With the Morris indices, it became evident that all the factors are non-linear or non-monotonic, and that there's a high degree of interactivity between them, as expected! It should be no surprise that the climate factors (temp, rain_1h, snow_1h, and cloud_coverage) are likely multicollinear with hr. There are also patterns to be found between hr, is_holiday, dow, and the target. Many of these factors most definitely don't have a monotonic relationship with the target; we know this already. For instance, traffic doesn't increase consistently as the hour of the day advances, and the same goes for the days of the week!
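Factor fixing quantifies this with variance-based (Sobol) indices: factors whose total-order index (ST) is negligible can be fixed without meaningfully changing the output's variance. As a hedged sketch with SALib, reusing the hypothetical problem definition and predict_traffic wrapper from the Morris sketch above:

    import numpy as np
    from SALib.sample import saltelli
    from SALib.analyze import sobol

    # Saltelli sampling requires N * (2D + 2) model evaluations
    X = saltelli.sample(problem, 1024)
    Y = np.array([predict_traffic(x) for x in X])

    Si = sobol.analyze(problem, Y)
    print(Si['S1'])  # first-order (main) effects
    print(Si['ST'])  # total effects, including interactions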

However, we didn’t know to what degree is_holiday and temp impacted the model, particularly during the crew’s working hours, which was an important insight. That being said, factor prioritization with Morris indices is usually to be taken as a starting...

Mission accomplished

The mission was to train a traffic prediction model and understand what factors create uncertainty and possibly increase costs for the construction company. We can conclude that a significant portion of the potential $35,000/year in fines can be attributed to the is_holiday factor. Therefore, the construction company should rethink working on holidays. There are only seven or eight holidays between March and November, and working them could cost more in fines than shifting that work to a few Sundays instead. With this caveat, the mission was successful, but there's still a lot of room for improvement.

Of course, these conclusions are for the LSTM_traffic_168_compact1 model, which we can compare with other models. Try replacing model_name at the beginning of the notebook with LSTM_traffic_168_compact2, an equally small but significantly more robust model, or LSTM_traffic_168_optimal, a larger and slightly better-performing model, and re-running the notebook...

Summary

After reading this chapter, you should understand how to assess a time series model's predictive performance, how to perform local interpretations for such models with integrated gradients, and how to produce both local and global attributions with SHAP. You should also know how to leverage the sensitivity analysis methods of factor prioritization and factor fixing with any model.

In the next chapter, we will learn how to reduce the complexity of a model and make it more interpretable with feature selection and engineering.

Further reading

  • Wilson, D.R., and Martinez, T., 1997, Improved Heterogeneous Distance Functions. Journal of Artificial Intelligence Research, 6, pp. 1-34: https://arxiv.org/abs/cs/9701101
  • Morris, M.D., 1991, Factorial sampling plans for preliminary computational experiments. Technometrics, 33(2), pp. 161-174: https://doi.org/10.2307%2F1269043
  • Saltelli, A., Tarantola, S., Campolongo, F., and Ratto, M., 2007, Sensitivity analysis in practice: A guide to assessing scientific models. Chichester: John Wiley & Sons.
  • Sobol, I.M., 2001, Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Mathematics and Computers in Simulation, 55(1-3), pp. 271-280: https://doi.org/10.1016/S0378-4754(00)00270-6
  • Saltelli, A., Annoni, P., Azzini, I., Campolongo, F., Ratto, M., and Tarantola, S., 2010, Variance based sensitivity analysis of model output: Design and estimator for the total sensitivity index. Computer Physics Communications, 181(2), pp. 259-270: https://doi.org/10.1016/j.cpc...

Author

Serg Masís has been at the confluence of the internet, application development, and analytics for the last two decades. Currently, he's a climate and agronomic data scientist at Syngenta, a leading agribusiness company with a mission to improve global food security. Before that role, he co-founded a start-up, incubated by Harvard Innovation Labs, that combined the power of cloud computing and machine learning with principles in decision-making science to expose users to new places and events. Whether it pertains to leisure activities, plant diseases, or customer lifetime value, Serg is passionate about providing the often-missing link between data and decision-making—and machine learning interpretation helps bridge this gap robustly.