
You're reading from Practical Deep Learning at Scale with MLflow

Product type: Book
Published in: Jul 2022
Publisher: Packt
ISBN-13: 9781803241333
Edition: 1st
Author: Yong Liu

Yong Liu has been working in big data science, machine learning, and optimization since his doctoral student years at the University of Illinois at Urbana-Champaign (UIUC) and later as a senior research scientist and principal investigator at the National Center for Supercomputing Applications (NCSA), where he led data science R&D projects funded by the National Science Foundation and Microsoft Research. He then joined Microsoft and AI/ML start-ups in the industry. He has shipped ML and DL models to production and has been a speaker at the Spark/Data+AI summit and NLP summit. He has recently published peer-reviewed papers on deep learning, linked data, and knowledge-infused learning at various ACM/IEEE conferences and journals.

Chapter 10: Implementing DL Explainability with MLflow

The importance of deep learning (DL) explainability is now well established, as we learned in the previous chapter. To implement DL explainability in a real-world project, it is desirable to log the explainer and the explanations as artifacts in the MLflow server, just like other model artifacts, so that we can easily track and reproduce the explanations. The integration of DL explainability tools such as SHAP (https://github.com/slundberg/shap) with MLflow can follow several implementation mechanisms, and it is important to understand how these integrations can be used for our DL explainability scenarios. In this chapter, we will explore several ways to integrate SHAP explanations into MLflow using different MLflow capabilities. As explainability tools and DL models are both rapidly evolving, we will also highlight the current limitations and workarounds when using MLflow for DL explainability implementation...

Technical requirements

The following requirements are necessary to complete this chapter:

Understanding current MLflow explainability integration

MLflow has several ways to support explainability integration. When implementing explainability, we refer to two types of artifacts: explainers and explanations:

  • An explainer is an explainability model. A common choice is a SHAP explainer, which comes in several kinds, such as TreeExplainer, KernelExplainer, and PartitionExplainer (https://shap.readthedocs.io/en/latest/generated/shap.explainers.Partition.html). For computational efficiency, we usually choose PartitionExplainer for DL models.
  • An explanation is an artifact that captures the explainer's output, which could be text, numerical values, or plots. Explanations can be produced offline during training or testing, or online in production. Thus, if we want to know why a model makes certain predictions, we should be able to provide an explainer for offline evaluation, or an explainer endpoint for online queries.

Here...

Implementing a SHAP explanation using the MLflow artifact logging API

MLflow has a generic tracking API that can log any artifact: mlflow.log_artifact. However, the examples given in the MLflow documentation usually use scikit-learn and tabular numerical data for training, testing, and explaining. Here, we want to show how to use mlflow.log_artifact for an NLP sentiment DL model to log relevant artifacts, such as Shapley value arrays and Shapley value bar plots. You can check out the Python VS Code notebook, shap_mlflow_log_artifact.py, in this chapter's GitHub repository (https://github.com/PacktPublishing/Practical-Deep-Learning-at-Scale-with-MLFlow/blob/main/chapter10/notebooks/shap_mlflow_log_artifact.py) to follow along with the steps:

  1. Make sure you have the prerequisites, including a local full-fledged MLflow server and the conda virtual environment, ready. Follow the instructions in the README.md (https://github.com/PacktPublishing/Practical-Deep-Learning-at-Scale...

Implementing a SHAP explainer using the MLflow pyfunc API

As we know from the previous section, a SHAP explainer can be used offline whenever needed by creating a new instance of an explainer using SHAP APIs. However, as the underlying DL models are often logged into the MLflow server, it is desirable to also log the corresponding explainer into the MLflow server, so that we keep track of not only the DL models but also their explainers. In addition, we can use the generic MLflow pyfunc model logging and loading APIs for the explainer, thus unifying access to DL models and their explainers.

In this section, we will learn step-by-step how to implement a SHAP explainer as a generic MLflow pyfunc model and how to use it for offline and online explanation. We will break the process up into three subsections:

  • Creating and logging an MLflow pyfunc explainer
  • Deploying an MLflow pyfunc explainer as an Explanation-as-a-Service (EaaS) endpoint
  • Using an MLflow pyfunc explainer for batch explanation
...

Summary

In this chapter, we first reviewed the existing approaches in the MLflow APIs that could be used for implementing explainability. Two existing MLflow APIs, mlflow.shap and mlflow.evaluate, have limitations and thus cannot be used for the complex DL model and pipeline explainability scenarios we need. We then focused on two main approaches to implementing SHAP explanations and explainers within the MLflow API framework: mlflow.log_artifact for logging explanations and mlflow.pyfunc.PythonModel for logging a SHAP explainer. The log_artifact API allows us to log Shapley values and explanation plots to the MLflow tracking server. Using mlflow.pyfunc.PythonModel allows us to log a SHAP explainer as an MLflow pyfunc model, thus opening the door to deploying a SHAP explainer as a web service to create an EaaS endpoint. It also lets us use SHAP explainers through the MLflow pyfunc load_model or spark_udf API for large-scale offline batch explanation. This enables us to confidently...

