
You're reading from Practical Deep Learning at Scale with MLflow

Product type: Book
Published in: Jul 2022
Publisher: Packt
ISBN-13: 9781803241333
Edition: 1st
Author: Yong Liu

Yong Liu has been working in big data science, machine learning, and optimization since his doctoral student years at the University of Illinois at Urbana-Champaign (UIUC) and later as a senior research scientist and principal investigator at the National Center for Supercomputing Applications (NCSA), where he led data science R&D projects funded by the National Science Foundation and Microsoft Research. He then joined Microsoft and AI/ML start-ups in the industry. He has shipped ML and DL models to production and has been a speaker at the Spark/Data+AI summit and NLP summit. He has recently published peer-reviewed papers on deep learning, linked data, and knowledge-infused learning at various ACM/IEEE conferences and journals.

Chapter 10: Implementing DL Explainability with MLflow

The importance of deep learning (DL) explainability is now well established, as we learned in the previous chapter. To implement DL explainability in a real-world project, it is desirable to log the explainer and the explanations as artifacts in the MLflow server, just like other model artifacts, so that we can easily track and reproduce the explanations. The integration of DL explainability tools such as SHAP (https://github.com/slundberg/shap) with MLflow can follow several implementation mechanisms, and it is important to understand how these integrations can be used for our DL explainability scenarios. In this chapter, we will explore several ways to integrate SHAP explanations into MLflow using different MLflow capabilities. As explainability tools and DL models are both rapidly evolving, we will also highlight the current limitations and workarounds when using MLflow for DL explainability implementation...

Technical requirements

The following requirements are necessary to complete this chapter:

Understanding current MLflow explainability integration

MLflow has several ways to support explainability integration. When implementing explainability, we refer to two types of artifacts: explainers and explanations:

  • An explainer is an explainability model. A common choice is a SHAP explainer, which comes in several kinds, such as TreeExplainer, KernelExplainer, and PartitionExplainer (https://shap.readthedocs.io/en/latest/generated/shap.explainers.Partition.html). For computational efficiency, we usually choose PartitionExplainer for DL models.
  • An explanation is an artifact that captures the explainer's output, which could be text, numerical values, or plots. Explanations can be produced offline during training or testing, or online in production. Thus, if we want to know why a model makes certain predictions, we should be able to provide an explainer for offline evaluation, or an explainer endpoint for online queries.

Here...

Implementing a SHAP explanation using the MLflow artifact logging API

MLflow has a generic tracking API that can log any artifact: mlflow.log_artifact. However, the examples given in the MLflow documentation usually use scikit-learn and tabular numerical data for training, testing, and explaining. Here, we want to show how to use mlflow.log_artifact for an NLP sentiment DL model to log relevant artifacts, such as Shapley value arrays and Shapley value bar plots. You can check out the Python VS Code notebook, shap_mlflow_log_artifact.py, in this chapter's GitHub repository (https://github.com/PacktPublishing/Practical-Deep-Learning-at-Scale-with-MLFlow/blob/main/chapter10/notebooks/shap_mlflow_log_artifact.py) to follow along with the steps:

  1. Make sure you have the prerequisites, including a local full-fledged MLflow server and the conda virtual environment, ready. Follow the instructions in the README.md (https://github.com/PacktPublishing/Practical-Deep-Learning-at-Scale...

Implementing a SHAP explainer using the MLflow pyfunc API

As we know from the previous section, a SHAP explainer can be used offline whenever needed by creating a new instance of an explainer using SHAP APIs. However, as the underlying DL models are often logged into the MLflow server, it is desirable to also log the corresponding explainer into the MLflow server, so that we keep track of not only the DL models but also their explainers. In addition, we can use the generic MLflow pyfunc model logging and loading APIs for the explainer, thus unifying access to DL models and their explainers.

In this section, we will learn step-by-step how to implement a SHAP explainer as a generic MLflow pyfunc model and how to use it for offline and online explanation. We will break the process up into three subsections:

  • Creating and logging an MLflow pyfunc explainer
  • Deploying an MLflow pyfunc explainer as an Explanation-as-a-Service (EaaS) endpoint
  • Using an MLflow pyfunc explainer for batch explanation
...

Summary

In this chapter, we first reviewed the existing approaches in the MLflow APIs that could be used for implementing explainability. Two existing MLflow APIs, mlflow.shap and mlflow.evaluate, have limitations and thus cannot be used for the complex DL model and pipeline explainability scenarios we need. We then focused on two main approaches to implementing SHAP explanations and explainers within the MLflow API framework: mlflow.log_artifact for logging explanations and mlflow.pyfunc.PythonModel for logging a SHAP explainer. The log_artifact API allows us to log Shapley values and explanation plots to the MLflow tracking server. Using mlflow.pyfunc.PythonModel allows us to log a SHAP explainer as an MLflow pyfunc model, thus opening the door to deploying a SHAP explainer as a web service to create an EaaS endpoint. It also lets us use SHAP explainers through the MLflow pyfunc load_model or spark_udf API for large-scale offline batch explanation. This enables us to confidently...

