As more and more organizations adopt Artificial Intelligence (AI) and Machine Learning (ML) for critical business decision-making, there is an immediate expectation to interpret and demystify black-box algorithms in order to increase their adoption. AI and ML are increasingly used to shape our day-to-day experiences across multiple areas, such as banking, healthcare, education, recruitment, transport, and supply chain. But the integral role played by AI and ML models has led to growing concern among business stakeholders and consumers about their lack of transparency and interpretability, as these black-box algorithms are highly susceptible to human bias. Particularly for high-stakes domains, such as healthcare, finance, legal, and other critical industrial operations, model explainability is a prerequisite.
As the benefits of AI and ML can be significant, the question is, how can we increase their adoption despite the growing concerns? Can we even address these concerns and democratize the use of AI and ML? And how can we make AI more explainable for critical industrial applications in which black-box models are not trusted? Throughout this book, we will try to answer these questions and apply these concepts and ideas to solve practical problems!
In this chapter, you will learn about the foundational concepts of Explainable AI (XAI), so that the terms and concepts used in future chapters are clear and it is easier to follow and implement the advanced explainability techniques discussed later in this book. The chapter focuses on the following main topics:
- Introduction to XAI
- Defining explanation methods and approaches
- Evaluating the quality of explainability methods
Now, let's get started!
Introduction to XAI
XAI is the practice of ensuring that AI and ML solutions are transparent, trustworthy, responsible, and ethical, so that all regulatory requirements on algorithmic transparency, risk mitigation, and fallback planning are addressed efficiently. AI and ML explainability techniques provide the necessary visibility into how these algorithms operate at every stage of the solution life cycle, allowing end users to understand why and how their queries are related to the outcomes of AI and ML models.
Understanding the key terms
Usually, for ML models, we use the term interpretability when addressing the how questions and explainability when addressing the why questions. In this book, the terms model interpretability and model explainability are used interchangeably. However, to provide human-friendly, holistic explanations of ML model outcomes, we need to make ML algorithms both interpretable and explainable, allowing end users to easily comprehend the decision-making process of these models.
In most scenarios, ML models are treated as black boxes: we feed in training data and expect the model to predict on new, unseen data. Unlike conventional programming, where we write specific instructions, an ML model automatically tries to learn these instructions from the data. As illustrated in Figure 1.1, when we try to find out the rationale behind a model's prediction, we do not get enough information!
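To make this concrete, the following is a minimal sketch of the black-box problem, assuming scikit-learn and a synthetic dataset (both are illustrative choices, not requirements):

```python
# A minimal sketch of the black-box problem, using scikit-learn
# and a synthetic dataset (both are assumptions for illustration).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=42)
model = RandomForestClassifier(random_state=42).fit(X, y)

# The model happily returns a prediction for a new, unseen record...
print(model.predict(X[:1]))          # e.g., [1]
print(model.predict_proba(X[:1]))    # e.g., [[0.08 0.92]]

# ...but nothing in the standard prediction API tells us *why*
# this record was assigned to that class.
```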
Now, let's understand the impact of incorrect predictions and inaccurate ML models.
Consequences of poor predictions
Traditionally, all ML models were believed to be magical black boxes that can automatically decipher interesting patterns and insights from the data and provide silver-bullet outcomes! Compared to conventional rule-based computer programs, which are limited by the intelligence of the programmer, well-trained ML algorithms are considered to provide rich insights and accurate predictions even in complex situations. But the fact is, all ML models suffer from bias, which can be due to the inductive bias of the algorithm itself or the presence of bias in the dataset used for training the model. In practice, there can be other reasons, such as data drift, concept drift, and overfitted or underfitted models, for which model predictions can go wrong. As the famous British statistician George E.P. Box once said, "All models are wrong, but some are useful"; all statistical, scientific, and ML models can give incorrect outcomes if the assumptions on which they are built do not hold. Therefore, it is important for us to know why an ML model predicted a specific outcome, what can be done if it is wrong, and how the predictions can be improved.
Figure 1.2 illustrates a collection of news headlines highlighting the failure of AI algorithms to produce fair and unbiased outcomes.
Before you completely agree with me on the necessity of model explainability, let me give some practical examples from low-stakes and high-stakes domains to illustrate the consequences of poor predictions. Weather forecasting is a classical forecasting problem that is extremely challenging (as it depends on multiple dynamic factors) and in which ML is extensively used; the ability of ML algorithms to consider multiple parameters of different types makes them more efficient than standard statistical models at predicting the weather. Despite highly accurate forecast models, there are times when a weather forecasting algorithm misses predicting rainfall, even though it starts raining a few minutes later! But the consequences of such a poor prediction are usually not severe, and moreover, most people do not blindly rely on automated weather predictions, making weather forecasting a low-stakes problem.
Similarly, for another low-stakes domain, such as a content recommendation system, even if an ML algorithm provides an irrelevant recommendation, at most the end users might spend more time explicitly searching for relevant content. While the overall experience of the end user might be impacted, there is no severe loss of life or livelihood. Hence, the need for model explainability is not critical for low-stakes domains, but providing explainability for model predictions does make automated intelligent systems more trustworthy and reliable for end users, thus increasing AI adoption by enhancing the end user experience.
Now, let me give an example where the consequences of poor predictions led to a severe loss of reputation and valuation for a company, impacting many lives! In November 2021, an American online real estate marketplace company called Zillow (https://www.zillow.com/) reported having lost over 40% of its stock value, and its home-buying division, Zillow Offers, lost over $300 million, because of its failure to detect the unpredictability of its home price forecasting algorithms (for more information, please refer to the sources mentioned in the References section). To compensate for the loss, Zillow had to take the drastic measure of cutting down its workforce, and several thousand families were impacted.
Similarly, multiple technology companies have been accused of using highly biased AI algorithms that could result in social unrest due to racial or gender discrimination. One such incident happened in 2015, when Google Photos made a massive racist blunder by automatically tagging an African-American couple as gorillas (please look into the sources mentioned in the References section for more information). Although these blunders were unintentional and mostly due to biased datasets or non-generalized ML models, the consequences of such incidents can create massive social, economic, and political havoc. Bias in ML models in other high-stakes domains, such as healthcare, credit lending, and recruitment, continually reminds us of the need for more transparent, explainable AI solutions on which end users can rely.
As illustrated in Figure 1.3, the consequences of poor predictions highlight the importance of XAI, which can provide early indicators to prevent loss of reputation, money, life, or livelihood due to the failure of AI algorithms:
Now, let's try to summarize the need for model explainability in the next section.
Summarizing the need for model explainability
In the previous section, we learned that the consequences of poor predictions can impact many lives in high-stakes domains, and that even in low-stakes domains the end user's experience can be affected. Samek and Binder's work in Tutorial on Interpretable Machine Learning, MICCAI'18, highlights the main reasons why model explainability is necessary. Let me try to summarize the key reasons why model explainability is essential:
- Verifying and debugging ML systems: As we have seen some examples where wrong model decisions can be costly and dangerous, model explainability techniques help us to verify and validate ML systems. Having an interpretation for incorrect predictions helps us to debug the root cause and provides a direction to fix the problem. We will discuss the different stages of an explainable ML system in more detail in Chapter 10, XAI Industry Best Practices.
- Using user-centric approaches to improve ML models: XAI provides a mechanism to include human experience and intuition to improve ML systems. Traditionally, ML models are evaluated based on prediction error. Using such evaluation approaches to improve ML models doesn't add any transparency and may not be robust and efficient. However, using explainability approaches, we can use human experience to verify predictions and understand whether model-centric or data-centric approaches are further needed to improve the ML model. Figure 1.4 compares a classical ML system with an explainable ML system:
- Learning new insights: ML is considered to automatically unravel interesting insights and patterns from data that are not obvious to human beings. Explainable ML provides us with a mechanism to understand the rationale behind the insights and patterns automatically picked up by the model and allows us to study these patterns in detail to make new discoveries.
- Compliance with legislation: Regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) reflect severe concerns about the lack of explainability in AI. Growing global AI regulation has empowered individuals with the right to demand an explanation of automated decision-making systems that can affect them. Model explainability techniques try to ensure that ML models are compliant with proposed regulatory laws, thereby promoting fairness, accountability, and transparency.
Defining explanation methods and approaches
In this section, let's try to understand some key concepts required for understanding and applying various explainability techniques and approaches.
Dimensions of explainability
Adding to the concepts presented by Samek and Binder in their Tutorial on Interpretable Machine Learning at MICCAI'18, when we talk about the problem of demystifying black-box algorithms, there are four different dimensions through which we can address this problem, as can be seen in the following diagram:
Now, let's learn about these dimensions in detail:
- Data: This dimension of explainability revolves around the underlying data that is being modeled. Understanding the data, identifying its limitations and relevant components, and forming certain hypotheses are crucial to setting up the correct expectations. A robust data curation process, analysis of data purity, and assessment of the impact of adversarial effects on the data are other key exercises for obtaining explainable outcomes.
- Model: Model-based interpretability techniques often help us to understand how the input data is mapped to the output predictions and make us aware of the limitations and assumptions of the ML algorithms used. For example, the Naïve Bayes algorithm used for ML classification assumes that the presence of a certain feature is completely independent of and unrelated to the presence of any other feature. Knowing about these inductive biases of ML algorithms helps us to understand and anticipate prediction errors or limitations of ML models (a short sketch illustrating this assumption follows this list).
- Outcomes: This dimension of explainability is about understanding why a certain prediction or decision is made by an ML model. Although data and model interpretability are quite crucial, most ML experts and end users focus on making the final model predictions interpretable.
- End users: The final dimension of explainability is all about creating the right level of abstraction and including the right amount of details for the final consumers of the ML models so that the outcomes are reliable and trustworthy for any non-technical end-user and empower them to understand the decision-making process of black-box algorithms.
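As a quick illustration of the model dimension, here is a small sketch (with synthetic data and scikit-learn as assumed tooling) of the Naïve Bayes independence assumption mentioned previously: duplicating a feature is counted as two independent pieces of evidence, which pushes the predicted probabilities toward the extremes:

```python
# Demonstrating the Naive Bayes independence assumption on synthetic data:
# a duplicated (perfectly correlated) feature is treated as independent
# evidence, making the second model systematically more (over)confident.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=500, n_features=5, n_informative=3,
                           shuffle=False, random_state=0)
X_dup = np.hstack([X, X[:, [0]]])  # duplicate the first (informative) feature

p1 = GaussianNB().fit(X, y).predict_proba(X)[:, 1]
p2 = GaussianNB().fit(X_dup, y).predict_proba(X_dup)[:, 1]

# Average distance of the predicted probabilities from 0.5:
# larger for the model with the duplicated feature.
print(np.mean(np.abs(p1 - 0.5)), np.mean(np.abs(p2 - 0.5)))
```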
Explainability for AI/ML algorithms is provided with respect to one or more of these dimensions. Next, let's discuss how to address the key questions of explainability.
Addressing key questions of explainability
Now that we understand the different dimensions of explainability, let's discuss what is needed to make ML models explainable. In order to make ML algorithms explainable, the following are the key questions that we should try to address:
- What do we understand from the data?
The very first step is all about the data. Before even proceeding with AI and ML modeling, we should spend enough time analyzing and exploring the data. The goal is to look for gaps, inconsistencies, and potential biases, and to form hypotheses about anything that might create challenges while modeling the data and generating predictions. This helps us to know what to expect and how certain aspects of the data can contribute toward solving the business problem.
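As a hypothetical illustration, a minimal exploration pass with pandas might look like the following (the file name and column names are made up for this sketch):

```python
# A minimal, hypothetical data-exploration pass with pandas
# (the file name and column names are assumptions for illustration).
import pandas as pd

df = pd.read_csv("loan_applications.csv")

print(df.describe(include="all"))        # ranges, outliers, odd values
print(df.isna().mean().sort_values())    # per-column missing-value rate
print(df["approved"].value_counts(normalize=True))  # class imbalance
print(df.groupby("gender")["approved"].mean())      # a quick bias check
```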
- How is the model created?
We need to understand how transparent the algorithm is and what kinds of relationships it can capture when modeling the data. This is the step where we try to understand the inductive bias of the algorithm and relate it to the initial hypotheses or observations obtained while exploring the data. For example, linear models will not model the data efficiently if the data has quadratic or cyclic patterns, as observed using visualization-based data exploration methods; the prediction error is expected to be higher. So, if it is unclear how an algorithm builds a model from the training data, the algorithm is less transparent and, hence, less interpretable.
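Here is a quick sketch of this point, using synthetic data: a plain linear model cannot capture a quadratic pattern, and its error makes the misspecification visible, while encoding the hypothesis from data exploration (a squared term) restores the fit:

```python
# A linear model fails on a quadratic pattern; adding the squared
# term (the hypothesis from data exploration) fixes it.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(0, 0.1, size=200)   # quadratic ground truth

linear = LinearRegression().fit(X, y)
print(r2_score(y, linear.predict(X)))   # close to 0: model is misspecified

X_quad = np.hstack([X, X ** 2])         # encode the quadratic hypothesis
quad = LinearRegression().fit(X_quad, y)
print(r2_score(y, quad.predict(X_quad)))  # close to 1, still interpretable
```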
- What do we know about the global interpretability of the trained model?
Understanding global model interpretability is always challenging. It is about getting a holistic view of the underlying features used, knowing which features are important, how sensitive the model is toward changes in the key feature values, and what kind of complex interactions are happening inside the model. This is especially hard to achieve in practice for complex deep learning models that have millions of parameters to learn and several hundred layers.
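One common way to probe global interpretability is permutation feature importance, available in scikit-learn; the following sketch uses a synthetic dataset for illustration:

```python
# Permutation feature importance: a model-agnostic, global technique.
# Shuffle each feature in turn and measure how much the score drops;
# a large drop means the model relies heavily on that feature.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance

X, y = make_regression(n_samples=500, n_features=8, n_informative=3,
                       random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i in result.importances_mean.argsort()[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:.3f}")
```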
- What is the influence of different parts of the model on the final prediction?
Different parts of an ML model might impact the final prediction in different ways. Especially for deep neural network models, each layer tries to learn different types of features. When model predictions are incorrect, understanding how different parts of a model affect or control the final outcome is very important. So, explainability techniques can unravel insights from different parts of a model and help debug and observe the algorithm's robustness for different data points.
- Why did the model make a specific prediction for a single record and a batch of records?
The most important aspect of explainability is understanding why the model is making a specific prediction and not something else. So, certain local and global explanation techniques are applied, which consider either the impact of individual features or the collective impact of multiple features on the outcome. Usually, these explainability techniques are applied both to single instances of the data and to batches of data instances, to understand whether the observations are consistent.
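As a sketch of a local explanation for a single record, the following assumes the third-party shap library (explainability frameworks such as SHAP are discussed in detail later in this book):

```python
# A sketch of a local explanation for one record, assuming the
# third-party shap library is installed (pip install shap).
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=6, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = shap.Explainer(model, X)
shap_values = explainer(X[:1])   # local: explain one single record
print(shap_values.values)        # per-feature contributions to this prediction

# Aggregating the same values over a batch of records gives a global
# view and lets us check that the local observations are consistent.
```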
- Does the outcome match the expectation of the end user?
The final step is always providing user-centric explanations. This means explainability is all about comparing the outcome with the end user's expectation, based on common sense and human intuition. If the model's forecast matches the user's expectation, a reasonable explanation justifies the dominant factors behind the specific outcome. But suppose the model's forecast does not match the user's expectation. In that case, a good explanation tries to justify what changes in the input observations could have produced a different outcome.
For example, considering usual weekday traffic congestion, let's say it takes me 30 minutes to get from the office to home. But if it is raining, I would expect the vehicles on the road to move more slowly and traffic congestion to be higher, and hence I might expect it to take longer to reach home. Now, if an AI application still predicts the time to get home as 30 minutes, I might not trust this prediction, as it is counter-intuitive.
Now, let's say that the algorithm was accurate in its forecast, but the justification provided to me was about the movement of the vehicles on my route, and the AI app just mentioned that the vehicles are moving at the same speed as on other days. Does this explanation really help me to understand the model's prediction? No, it doesn't. But suppose the application mentions that there are fewer vehicles on the route than on typical days. In that case, I would easily understand that the number of vehicles is lower due to the rain, and hence the time to destination is still the same as on usual weekdays.
My own recommendation is that, after training and validating an ML model, you should always try to seek answers to these questions as an initial step in interpreting the workings of black-box models.
Understanding different types of explanation methods
In the previous section, we discussed some key questions to address when designing and using robust explainability methods. In this section, we will discuss various types of explanation methods, considering the four dimensions of explainability used in ML:
- Local explainability and global explainability: ML model explainability can be provided for single, local instances of the data, to understand how a certain range of values or a specific categorical value relates to the final prediction; this is called local explainability. Global explainability is used to explain the behavior of the entire model, or of certain important features as a whole that contribute toward a specific set of model outcomes.
- Intrinsic explainability and extrinsic explainability: Some ML models, such as linear models, simple decision trees, and heuristic algorithms, are intrinsically explainable as we clearly know the logic or the mathematical mapping of the input and output that the algorithm applies, whereas extrinsic or post hoc explainability is about first training an ML model on given data and then using certain model explainability techniques separately to understand and interpret the model's outcome.
- Model-specific explainability and model-agnostic explainability: When we use explainability methods that are applicable only to a specific algorithm, these are model-specific approaches. For example, visualization of the tree structure in decision tree models is specific to the decision tree algorithm and hence comes under the model-specific explainability methods. Model-agnostic methods are used to provide explanations for any ML model, irrespective of the algorithm being used. Mostly, these are post hoc analysis methods, used after the trained ML model is obtained, and usually, these methods are not aware of the internal model structure and weights (a compact sketch contrasting the two follows this list). In this book, we will mostly focus on model-agnostic explainability methods, which are not dependent on any particular algorithm.
- Model-centric explainability and data-centric explainability: Conventionally, the majority of explanation methods are model-centric, as these methods try to interpret how the input features and target values are modeled by the algorithm and how the specific outcomes are obtained. But with the latest advancements in the space of data-centric AI, ML experts and researchers are also investigating explanation methods built around the data used for training the models, an approach known as data-centric explainability. Data-centric methods are used to understand whether the data is consistent, well curated, and well suited for solving the underlying problem. Data profiling, detection of data and concept drift, and adversarial robustness are specific data-centric explainability approaches that we will discuss in more detail in Chapter 3, Data-Centric Approaches.
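The following compact sketch contrasts a model-specific explanation (plotting a decision tree's own structure) with a model-agnostic one (permutation importance, which works for any estimator); the data is synthetic:

```python
# Model-specific vs model-agnostic explanations on synthetic data.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, plot_tree

X, y = make_classification(n_samples=300, n_features=4, random_state=0)

# Model-specific: only a decision tree has a tree structure to draw.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
plot_tree(tree)
plt.show()

# Model-agnostic: the same post hoc method applies to a tree, an SVM,
# or any other trained model, without looking at its internals.
svm = SVC().fit(X, y)
print(permutation_importance(svm, X, y, n_repeats=5,
                             random_state=0).importances_mean)
```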
We will discuss all these types of explainability methods in later chapters of the book.
Understanding the accuracy-interpretability trade-off
In an ideal scenario, we would want our ML models to be highly accurate and highly interpretable, so that any non-technical business stakeholder or end user can understand the rationale behind the model predictions. But in practice, achieving highly accurate and interpretable models is extremely difficult, and there is always a trade-off between accuracy and interpretability.
For example, to perform radiographic image classification, intrinsically interpretable ML algorithms, such as decision trees, might not be able to give efficient and generalized results, whereas more complex deep convolutional neural networks, such as DenseNet, might be more efficient and robust for modeling radiographic image data. But DenseNet is not intrinsically interpretable, and explaining the algorithm's working to any non-technical end user can be pretty complicated and challenging. So, highly accurate models, such as deep neural networks, are non-linear and more complex and can capture complex relationships and patterns from the data, but achieving interpretability is difficult for these models. Highly interpretable models, such as linear regression and decision trees, are primarily linear and less complex, but these are limited to learning only linear or less-complex patterns from the data.
Now, the question is, is it better to go with highly accurate models or highly interpretable models? I would say that the correct answer is, it depends! It depends on the problem being solved and on the consumers of the model. For high-stakes domains, where the consequences of poor predictions are severe, I would recommend going for more interpretable models, even if accuracy is sacrificed. Any rule-based heuristic model that is highly interpretable can be very effective in such situations. But if the problem is well studied and getting the lowest prediction error is the main goal (such as in an academic use case or an ML competition), such that the consequences of a poor prediction will not create any significant damage, then highly accurate models can be preferred. In most industrial problems, it is essential to keep the right balance of model accuracy and interpretability to promote AI adoption.
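The following rough sketch shows the trade-off on a synthetic dataset: a small, interpretable tree versus a less transparent ensemble. Exact numbers will vary, but the ensemble typically wins on accuracy:

```python
# A rough illustration of the accuracy-interpretability trade-off:
# a shallow, readable tree vs a 200-tree forest on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           random_state=0)

interpretable = DecisionTreeClassifier(max_depth=3, random_state=0)
accurate = RandomForestClassifier(n_estimators=200, random_state=0)

print("shallow tree :", cross_val_score(interpretable, X, y).mean())
print("random forest:", cross_val_score(accurate, X, y).mean())
```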
Figure 1.10 illustrates the accuracy-interpretability trade-off of popular ML algorithms:
Evaluating the quality of explainability methods
Explainability is subjective and may vary from person to person. The key question is, how do we determine whether one approach is better than another? So, in this section, let's discuss certain criteria to consider when evaluating explainability techniques for ML systems.
Criteria for good explainable ML systems
As explainability for ML systems is a very subjective topic, first let's try to understand some key criteria for good, human-friendly explanations. In his book Interpretable Machine Learning, Christoph Molnar also emphasizes the importance of good, human-friendly explanations after thorough research, which I will try to summarize in a condensed form, considering modern, industrial, explainable ML systems:
- Coherence with a priori knowledge: Consistency with the prior beliefs of end users is an important criterion for explainable ML systems. If an explanation contradicts the a priori knowledge of human beings, humans tend to have less trust in it. Although it is challenging to introduce human prior knowledge into ML models, human-friendly explainable ML systems should try to provide explanations built around features that have direct and less complex relationships with the outcome, such that the relationship is coherent with the prior beliefs of the end users.
For example, for predicting the presence of diabetes, the measure of blood glucose level has a direct relationship that is consistent with prior human beliefs. If the blood glucose level is higher than usual, this might indicate that the patient has diabetes, although diabetes can also be due to certain genetic factors or other reasons. Similarly, a high blood glucose level can be momentary and doesn't always mean that the patient has diabetes. But as the explanation is consistent with a priori knowledge, end users will have more trust in such explanations.
- Fidelity: Another key factor for providing a holistic explanation is the truthfulness of the explanation, which is also termed the fidelity of ML models. Explanations with high fidelity can approximate the holistic prediction of a black-box model, whereas low-fidelity explanations can interpret a local data instance or a specific subset of the data. For example, in sales forecasting, providing explanations based only on the trend of historical data doesn't give a complete picture, as other factors, such as production capacity, market competition, and customer demand, might influence the outcome of the model. Fidelity plays a key role, especially in detailed root cause analysis, but too many details may not be useful for common users unless requested.
- Abstraction: Good human-friendly explanations are always expected in a concise and abstract format. Too many complicated details can also impact the experience of end users. For example, for weather forecasting, if the model predicts a high probability of rain, the concise and abstract explanation can be that it is cloudy now and raining within 5 kilometers of the current location, so there is a high probability of rain.
But if the model includes details related to precipitation level, humidity, and wind speed, which might also be important for the prediction of rainfall, these additional details are complex and difficult to comprehend, and hence not human-friendly. So, good, human-friendly explanations include the appropriate amount of abstraction to simplify the understanding for the end users. End users mostly prefer concise explanations, but detailed explanations might be needed when doing root cause analysis for model predictions.
- Contrastive explanations: Good, human-friendly explanations are not about understanding the inner workings of the models but mostly about comparing what-if scenarios. Suppose the outcome is continuous numerical data, as in the case of regression problems. In that case, a good explanation for a prediction includes comparing it with another instance's prediction that is significantly higher or lower. Similarly, a good explanation for classification problems is about comparing the current prediction with other possible outcomes. Contrastive explanations are application-dependent, as they require a point of comparison, although understanding the what-if scenarios helps us to understand the importance of certain key features and how these features are related to the target variable.
For example, in an employee churn prediction use case, if the model predicts that an employee is likely to leave the organization, then contrastive explanations try to justify the model's decision by comparing it with a prediction for an instance where the employee is expected to stay, comparing the values of the key features used to model the data. So, the explanation method might convey that since the salary of the employee who is likely to leave is much lower than that of employees who are likely to stay, the model predicted that the employee is expected to leave the organization (a brute-force sketch of this idea follows this list).
- Focusing on the abnormality: This may sound counter-intuitive, but human beings seek explanations for events that are unexpected and not obvious. Suppose there is an abnormal observation in the data, such as a rare categorical value or an anomalous continuous value, that can influence the outcome. In that case, it should be included in the explanation. Even if other, normal and consistent features have the same influence on the model outcome, including the abnormality holds higher importance in terms of a human-friendly explanation.
For example, say we are predicting the price of cars based on their configuration, and a car's mode of operation is electric, which is a rare observation compared to gasoline. Both of these categories might have the same influence on the final model prediction. Still, the model explanation should include the rare observation, as end users are more interested in abnormal observations.
- Social aspect: The social aspect of model explainability determines the abstraction level and the content of explanations. It depends on the level of understanding of the specific target audience and might be difficult to generalize and build into a model explainability method. For example, suppose a stock-predicting ML model designed to prescribe actions to users suggests shorting a particular stock. End users outside the finance domain may find this difficult to understand. But if, instead, the model suggests selling a stock at the current price without possessing it and buying it back after 1 month, when the price is expected to fall, any non-technical user might comprehend the model's suggestion easily. So, good explanations consider the social aspect, and often, user-centric design principles from Human-Computer Interaction (HCI) are utilized to design good, explainable ML systems that consider the social aspect.
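The following is a naive, brute-force sketch of the contrastive idea from this list: increase a single (salary-like) feature until the prediction flips. The model and features are synthetic stand-ins, and dedicated counterfactual libraries are far more principled:

```python
# A brute-force "what-if" search on synthetic data: perturb one feature
# (a stand-in for something like salary) until the prediction flips.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

record = X[0].copy()
original = model.predict([record])[0]

for delta in np.linspace(0, 3, 100):     # perturb feature 0 upward
    candidate = record.copy()
    candidate[0] += delta
    if model.predict([candidate])[0] != original:
        print(f"Increasing feature 0 by {delta:.2f} flips the prediction "
              f"from {original} to {model.predict([candidate])[0]}")
        break
else:
    print("No flip found in this range; try a larger perturbation.")
```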
Auxiliary criteria of XAI for ML systems
Good explanations are not limited to the key criteria discussed previously, but there are a few auxiliary criteria of XAI as discussed by Doshi-Velez and Kim in their work Towards A Rigorous Science of Interpretable Machine Learning:
- Unbiasedness: Model explainability techniques should also look for the presence of any form of bias in the data or the model. So, one of the key goals of XAI is to make ML models unbiased and fair. For example, for predicting credit card fraud, the explainability approach should investigate the importance of demographic information related to the gender of the customer for the model's decision-making process. If the importance of gender information is high, that means that the model is biased toward a particular gender.
- Privacy: Explainability methods should comply with data privacy measures, and hence any sensitive information should not be used for the model explanations. Mainly for providing personalized explanations, ensuring compliance with data privacy can be very important.
- Causality: Model explainability approaches should try to look for causal relationships, so that end users are aware of how perturbations to the inputs can change model predictions in production systems.
- Robustness: Methods such as sensitivity analysis help us to understand how robust and consistent a model's predictions are with respect to its feature values. If small changes in input features lead to a significant shift in model predictions, the model is not robust or stable (see the sensitivity sketch after this list).
- Trust: One of the key goals of XAI is to increase AI adoption by increasing the trust of the end users. So, all explainability methods should make black-box ML models more transparent and interpretable so that the end users can trust and rely on them. If explanation methods don't meet the criteria of good explanations, as discussed in the Criteria for good explainable ML systems section, it might not help to increase the trust of its consumers.
- Usable: XAI methods should try to make AI models more usable. Hence, they should provide information that helps users accomplish their task. For example, counterfactual explanations might suggest that a loan applicant pay their credit card bills on time for the next 2 months and clear off their previous debts before applying for a new loan, so that their loan application is not rejected.
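As a minimal sketch of the robustness criterion above, the following perturbs the inputs with small random noise and checks how far the predictions move (the model and noise scale are illustrative assumptions):

```python
# A minimal sensitivity-analysis sketch: add ~1% random noise to the
# inputs and measure how far the predictions shift.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=5, noise=5.0, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)

rng = np.random.default_rng(0)
X_perturbed = X + rng.normal(0, 0.01 * X.std(axis=0), size=X.shape)

shift = np.abs(model.predict(X) - model.predict(X_perturbed))
print(f"mean prediction shift under ~1% input noise: {shift.mean():.3f}")
# A large shift for tiny perturbations suggests the model is not robust.
```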
Next, we will need to understand various levels of evaluating explainable ML systems.
Taxonomy of evaluation levels for explainable ML systems
Now that we have discussed the key criteria for designing and evaluating good explainable ML systems, let's discuss the taxonomy of evaluation methods for judging the quality of explanations. In their work Towards A Rigorous Science of Interpretable Machine Learning, Doshi-Velez and Kim mentioned three major types of evaluation approaches that we will try to understand in this section. Since explainable ML systems are to be designed with user-centric design principles of HCI, human beings evaluating real tasks play a central role in assessing the quality of explanation.
But human evaluation mechanisms can have their own challenges, such as different types of human bias, a greater consumption of time and other resources, and other compounding factors that can lead to inconsistent evaluation. Hence, human evaluation experiments should be well designed and used only when needed. Now, let's look at the three major types of evaluation approaches:
- Application-grounded evaluation: This evaluation method involves including the explainability techniques in a real product or application, thus allowing human subject experiments in which real end users perform certain tasks. Although the experiment setup is costly and time-consuming, building an almost-finished product and then allowing domain experts to test it has its benefits. It enables the researcher to evaluate the quality of the explanation with respect to the end task of the system, thus providing ways to quickly identify errors or limitations of the explainability methods. This evaluation principle is consistent with the evaluation methods used in HCI, and explainability is infused within the entire system responsible for solving the user's key problem and helping the user meet their end goal.
For example, to evaluate the quality of the explanation methods of AI software for automated skin cancer detection from images, dermatologists can be approached to directly test the objective for which the AI software is built. If the explanation methods are successful, then such a solution can be scaled up easily. From an industrial perspective, since getting a perfectly finished product can be time-consuming, the better approach is to build a robust prototype or a Minimum Viable Product (MVP), so that the domain experts testing the system get a better idea of how the finished product will look.
- Human-grounded evaluation: This evaluation method involves conducting human subject experiments with non-expert, novice users on more straightforward tasks, rather than with domain experts. Since getting domain experts can be time-consuming and expensive, human-grounded evaluation experiments are easier and less costly to set up. The tasks are sometimes simplified, and these methods are very helpful for use cases where generalization is important. A/B testing, counterfactual simulations, and forward simulations are popular methods used for human-grounded evaluation.
In XAI, A/B testing presents different types of explanations to the user, who is asked to select the one with the higher quality of explanation. Then, based on the final aggregated votes, and using other metrics such as click-through rate, screen hovering time, and time to task completion, the best method is decided.
In counterfactual simulation methods, human subjects are presented with the input and output of the model, along with the model explanations, for a certain number of data samples, and are asked to propose changes to the input features that would shift the model's final outcome to a specific range of values or a specific category. In the forward simulation method, human subjects are provided with the model inputs and their corresponding explanations, and are then asked to simulate the model prediction without looking at the ground-truth values. The error metric measuring the difference between the human-predicted outcomes and the ground-truth labels can then be used as a quantitative way to evaluate the quality of the explanation (a tiny scoring sketch follows this list).
- Functionality-grounded evaluation: This evaluation method doesn't involve any human subject experiments; instead, proxy tasks are used to evaluate the quality of explanation. These experiments are more feasible and less expensive to set up than the other two, and especially for use cases where human subject experiments are restricted or unethical, this is an alternative approach. This approach works well when the type of algorithm used has already been tested in a human-level evaluation.
For example, linear regression models are easily interpretable and end users can efficiently understand the working of the model. So, using a linear regression model for use cases such as sales forecasting can help us to understand the overall trend of the historical data and how the forecasted values are related to the trend.
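As a tiny sketch of scoring a forward-simulation experiment, the following compares the outcomes that human subjects predicted (from inputs plus explanations) against the ground-truth labels; the arrays here are made-up examples:

```python
# Scoring a forward-simulation experiment: higher agreement suggests
# the explanations let humans reproduce the model's reasoning.
from sklearn.metrics import accuracy_score

ground_truth = [1, 0, 1, 1, 0, 1, 0, 0]     # actual model outcomes
human_predicted = [1, 0, 1, 0, 0, 1, 1, 0]  # what subjects simulated

print(f"human-model agreement: "
      f"{accuracy_score(ground_truth, human_predicted):.2f}")
```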
Figure 1.9 summarizes the taxonomy of the evaluation level for explainable ML systems:
Apart from the methods discussed here, other metrics, such as the description length of the explanation, the complexity of the features used in the explanation, and the cognitive processing time required to understand the provided explanation, are often used to determine the quality of explanations. We will discuss them in more detail in Chapter 11, End User-Centered Artificial Intelligence.
In this chapter, so far you have come across many new concepts for ML explainability techniques. The mind-map diagram in Figure 1.10 gives a nice summary of the various terms and concepts discussed in this chapter:
I strongly recommend making yourself familiar with the jargon used in this mind-map diagram, as we will be using it throughout the book! Let's summarize what we have discussed in the summary section.
After reading this chapter, you should now understand what XAI is and why it is so important. You have learned about the various terms and concepts related to explainability techniques, which we will frequently use throughout this book. You have also learned about certain key criteria of human-friendly explainable ML systems and different approaches to evaluating the quality of the explainability techniques. In the next chapter, we will focus on various types of model explainability methods for structured and unstructured data.
Please refer to the following resources to gain additional information:
- The $300m flip flop: how real-estate site Zillow's side hustle went badly wrong, theguardian.com: https://www.theguardian.com/business/2021/nov/04/zillow-homes-buying-selling-flip-flop
- Zillow says it's closing homebuying business, cutting 25% of workforce; earnings miss estimates, cnbc.com: https://www.cnbc.com/2021/11/02/zillow-shares-plunge-after-announcing-it-will-close-home-buying-business.html
- Google Photos Tags Two African-Americans As Gorillas Through Facial Recognition Software, forbes.com: https://www.forbes.com/sites/mzhang/2015/07/01/google-photos-tags-two-african-americans-as-gorillas-through-facial-recognition-software/
- Google apologises for Photos app's racist blunder, bbc.com: https://www.bbc.com/news/technology-33347866
- Tutorial on Interpretable Machine Learning, MICCAI'18, by Samek and Binder: http://www.heatmapping.org/slides/2018_MICCAI.pdf
- Chapter 1 – Interpretation, Interpretability, and Explainability and Why Does It All Matter? by Serg Masis, Packt Publishing Ltd.: https://www.amazon.com/Interpretable-Machine-Learning-Python-hands/dp/180020390X
- Interpretable Machine Learning, Christoph Molnar: https://christophm.github.io/interpretable-ml-book/
- Towards A Rigorous Science of Interpretable Machine Learning by Doshi-Velez and Kim: https://arxiv.org/abs/1702.08608