Chapter 10: Explainable AI - Using LIME and SHAP

Let's play out a quick scenario. You are sitting in the doctor's office after having just received your annual physical. The doctor looks over your results and then casually mentions that you have a 95% chance of having a heart attack in the next month. What is your next question? "Why?" you ask. "I'm not sure," replies the doctor.

What would be your reaction to the doctor's answer in this situation? Is that good enough, or would you like a bit more information? If you are like most people, you might want to know if there was anything you could do to prevent it, or maybe there was a mix-up with your blood work and you have nothing to fear.

The use of AI models is only going to increase in the years to come, with AutoML and other low- and no-code options allowing those with less technical expertise to create models. You see a lot of models, but you need to be able to explain how things were...

Technical requirements

There are just a few things that we'll need in order to go through this chapter.

The first is that the Anaconda distribution is installed. As we know, this includes Python, conda, and Navigator, as well as many other packages used in data science.

We will use the following packages:

  • SHAP
  • LIME
  • pandas
  • NumPy
  • scikit-learn

You should create a new conda environment to install all of these packages. It's recommended to install them all at the beginning, but you can also do so as you get to the relevant part of the chapter.
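
As a sketch of what that setup could look like (the environment name xai-env and the Python version are just placeholders), you could run something like the following in a terminal:

    # Create a fresh environment with the core packages
    conda create -n xai-env python=3.9 pandas numpy scikit-learn

    # Activate it and add SHAP and LIME from the conda-forge channel
    conda activate xai-env
    conda install -c conda-forge shap lime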

After you have these things in place, we can look at why we should care about the interpretation of models in the first place.

Understanding the value of interpretation

Explainable AI will be essential if users are to understand, appropriately trust, and effectively manage this incoming generation of artificially intelligent partners.

This was the outlook of DARPA in their Explainable Artificial Intelligence (XAI) report in 2016 (https://bit.ly/3JU9yql).

Whether or not you agree with AI being used in military endeavors, it is being used in this field. Being able to know why certain outcomes are achieved is critical in this space and many others. AI isn't used just in the more traditional roles the military plays, but also when trying to discern the third- and fourth-order impacts on the military supply chain when resources such as planes are moved from one base to another.

Knowing the difference between interpreting and explaining

There is a lot of value in knowing what features are giving you the results you get, and there is even more in being able to explain why it matters. One...

Understanding models that are interpretable by design

In Chapter 7, Choosing the Best AI Algorithm, we mentioned that more complex algorithm types, such as neural networks, are often used even when they provide very little additional benefit. You should favor keeping it simple as much as possible, following the KISS principle (keep it simple, stupid). Not only are simpler models easier to interpret, but they can provide some fantastic results as well. Simple doesn't mean inferior.

We have looked at many models in this book that come with the ability to understand how the results were achieved without any special techniques. The algorithms we will cover now are as follows:

  • Decision trees
  • Linear/logistic regression
  • KNN

We'll use a medical example, as mentioned earlier in the chapter. The dataset poses a binary classification task: whether or not someone is at risk of heart disease. For this chapter, we'll keep the data preparation and other steps out...
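
To make this concrete, here is a minimal sketch of an interpretable-by-design model. The file name heart_disease.csv and the column names are hypothetical placeholders rather than the chapter's actual dataset:

    # Train a shallow decision tree on a heart disease style dataset
    # (file and column names are hypothetical placeholders)
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier, export_text

    df = pd.read_csv("heart_disease.csv")   # hypothetical file
    X = df.drop(columns=["target"])         # feature columns
    y = df["target"]                        # 1 = at risk, 0 = not at risk

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # A shallow tree stays readable while still performing well on tabular data
    tree = DecisionTreeClassifier(max_depth=3, random_state=42)
    tree.fit(X_train, y_train)

    # The fitted tree can be printed as plain if/else rules -- no extra tooling needed
    print(export_text(tree, feature_names=list(X.columns)))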

Explaining a model's outcome with LIME

Now we are moving on to black box models. They are becoming much more common due to the efficacy they have shown in popular domains such as NLP, vision problems, and various other areas where feeding in vast amounts of data produces amazing results. These domains aren't going anywhere, so we need to find a way to interpret these models after the fact using post-hoc interpretability.

The first approach that we'll look at is Local Interpretable Model-Agnostic Explanations (LIME), which assumes that if you zoom in on even a complex nonlinear relationship, you will find a linear one at the local level. It will then try to learn this local linear relationship by creating synthetic records that are similar to the record we care about. By creating these points/records with slightly altered inputs, it can figure out the impact that each feature has based on the model's output. As the name suggests, it's model-agnostic...
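
As a rough sketch of how this could look in code, continuing from the hypothetical heart disease data above and using a random forest to stand in for a black box model, the lime package's tabular explainer works roughly like this:

    # Explain a single prediction of a black box model with LIME
    # (reuses the hypothetical X, X_train, X_test, y_train from the earlier sketch)
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from lime.lime_tabular import LimeTabularExplainer

    model = RandomForestClassifier(n_estimators=200, random_state=42)
    model.fit(X_train, y_train)

    explainer = LimeTabularExplainer(
        training_data=np.array(X_train),
        feature_names=list(X.columns),
        class_names=["no risk", "at risk"],
        mode="classification",
    )

    # Explain one test record: LIME perturbs it and fits a local linear model
    exp = explainer.explain_instance(
        data_row=np.array(X_test.iloc[0]),
        predict_fn=model.predict_proba,
        num_features=5,
    )
    print(exp.as_list())  # (feature, weight) pairs for this one prediction

The weights returned by as_list() describe only this record's neighborhood, which is exactly the "local" part of LIME's name.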

Explaining a model's outcome with SHAP

Not long after LIME came out, another tool was introduced to help with interpreting AI models: SHapley Additive exPlanations (SHAP). The main idea of SHAP is that by taking permutations of the different input features, you can determine how important each feature is to the outcome. This might sound like LIME, and that's because it took inspiration from it while also introducing some new concepts. There are, however, key differences, which we'll explain.
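
As a rough sketch, and again assuming the hypothetical random forest from the LIME example, SHAP's tree explainer can produce both per-row contributions and a global view:

    # Compute SHAP values for the same hypothetical tree-based model
    import shap

    explainer = shap.TreeExplainer(model)          # TreeSHAP for tree ensembles
    shap_values = explainer.shap_values(X_test)    # one contribution per feature per row

    # Global view: average magnitude of each feature's contribution across the test set
    shap.summary_plot(shap_values, X_test, plot_type="bar")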

Avoid confusion with Shapley values

This approach is based on Shapley values, but it is not the same thing. SHAP is an extension with Shapley game theory at its base. There are various forms of SHAP, such as KernelSHAP and TreeSHAP. SHAP also allows for global interpretations, which again build on what Shapley values alone allow.

Let's look at an example to make it a bit clearer how SHAP is used.

Let's say you are a player in a two-on-two basketball...

Summary

Throughout this chapter, you gained insights into how interpretability and explainability fit into the picture of a healthy model and a robust data science workflow. We saw how they are important not just for creating a great model, but also for business, moral, and legal reasons.

We checked back into the algorithms from earlier chapters, such as decision trees, and saw that they have a great advantage not only in accuracy but also in their ability to be interpreted by the data scientists creating them.

Later, we saw that, despite the suggestion that simpler models should be considered first, black box models are quite common, so we should still be able to interpret models such as random forests. With that in mind, you saw how LIME can be a great tool for turning that black box into a more transparent version of itself by assuming that linear relationships can be found when you zoom in on a small region of the global space.

Finally, we checked out SHAP, which builds on Shapley values...
