Explainable and Ethical AI Primer

“The greatest thing by far is to be a master of metaphor; it is the one thing that cannot be learnt from others; and it is also a sign of genius, since a good metaphor implies an intuitive perception of the similarity in the dissimilar.”

– Aristotle

“Ethics is in origin the art of recommending to others the sacrifices required for cooperation with oneself.”

– Bertrand Russell

“I am in the camp that is concerned about super intelligence.”

– Bill Gates

“The upheavals [of artificial intelligence] can escalate quickly and become scarier and even cataclysmic. Imagine how a medical robot, originally programmed to rid cancer, could conclude that the best way to obliterate cancer is to exterminate humans who are genetically prone to the disease.”

– Nick Bilton, tech columnist for The New York Times

This introductory chapter presents a detailed overview of the key terms related to explainable and interpretable AI that paves the way for further reading.

In this chapter, you will get familiar with safe, ethical, explainable, robust, transparent, auditable, and interpretable machine learning terminologies. This should provide both a solid overview for novices and serve as a reference to experienced machine learning practitioners.

This chapter covers the following topics:

Building the case for AI governance
Key terminologies – explainability, interpretability, fairness, explicability, safety, trustworthiness, and ethics
Automating bias – the network effect
The case for explainability and black-box apologetics

Artificial intelligence (AI) and machine learning have significantly changed the course of our lives. The technological advancements aided by their capabilities have a deep impact on our society, economy, politics, and virtually every aspect of our lives. COVID-19, being the de facto chief agent of transformation, has dramatically increased the pace at which automation shapes our modern enterprises. It would be both an understatement and a cliché to say that we live in unprecedented times.

The increased speed of transformation, however, doesn’t come without its perils. Handing things over to machines has its inherent costs and challenges; some of these are quite obvious, while other issues become apparent as the given AI system is used, and some, possibly many, have yet to be discovered. The evolving future of the workplace is not only based on automating mundane, repetitive, and dangerous jobs but also on taking away the power of human decision-making. Automation is rapidly becoming a proxy for human decision-making in a variety of ways. From providing movie, news, book, and product recommendations to deciding who can get paroled or get admitted to college, machines are slowly taking away things that used to be considered uniquely human. Ignoring the typical doomsday elephants in the room (insert your favorite dystopian cyborg movie plot here), the biggest threat of these technological black boxes is the amplification and perpetuation of systemic biases through AI models.

Typically, when a human bias gets introduced, perpetuated, or reinforced among individuals, for the most part, there are opposing factors and corrective actions within society to bring some sort of balance and also limit the widescale spread of such unfairness or prejudice. While carefully avoiding the tempting traps of social sciences, politics, or ethical dilemmas, purely from a technical standpoint, it is safe to say that we have not seen experimentation at this scale in human history. The narrative can be subtle, nudged by models optimizing their cost functions, and then perpetuated by either reinforcing ideas or the sheer reason of utility. We have repeatedly seen that humans will trade privacy for convenience – anyone accepting End User Licensing Agreements (EULAs) without ever reading them, feel free to put your hands down.

While some have called for a pause in the advancement of cutting-edge AI while governments, industry, and other relevant stakeholders globally seek to ensure AI is fully understood and accordingly controlled, this does not help those in an enterprise who wish to benefit from less contentious AI systems. As enterprises mature in the data and AI space, it is entirely possible for them to ensure that the AI they develop and deploy is safe, fair, and ethical. We believe that, as policymakers, executives, managers, developers, ethicists, auditors, technologists, designers, engineers, and scientists, it is crucial for us to internalize the opportunities and threats presented by modern-day digital transformation aided by AI and machine learning. Let’s dive in!

The imperative of AI governance

“Starting Jan 1st 2029, all manual, and semi-autonomous operating vehicles on highways will be prohibited. This restriction is in addition to pedestrians, bicycles, motorized bicycles, and non-motorized vehicle traffic. Only fully autonomous land vehicles compliant with intelligent traffic grid are allowed on the highways.”

– Hill Valley Telegraph, June 2028

Does this headline look futuristic? A decade ago, it probably would have, but today, you could see this becoming a reality in 5 to 10 years. With the current speed of automation, putting humans behind the wheel of vehicles weighing thousands of pounds may well sound irresponsible within the next 10 years. Human driving could quickly become a novelty sport, as thousands of needless deaths from crashes caused by human error could be avoided thanks to self-driving vehicles.

Figure 1.1: The upper row shows an image from the validation set of Cityscapes and its prediction. The lower row shows the image perturbed with universal adversarial noise and the resulting prediction. Image courtesy of Metzen et al., Universal Adversarial Perturbations Against Semantic Image Segmentation. Source: https://arxiv.org/pdf/1704.05712.pdf

As we race toward delegating decision-making to algorithms, we need to ask ourselves whether we have the capability to clearly understand and justify how an AI model works and predicts. It might not be important to fully interpret how your next Netflix movie has been recommended, but when it comes to the critical areas of human concerns such as healthcare, recruitment, higher education admissions, legal, commercial aircraft collision avoidance, financial transactions, autonomous vehicles, or control of massive power generating or chemical manufacturing plants, these decisions are critical. It is pretty self-explanatory and logical that if we can understand what algorithms do, we can debug, improve, and build upon them easily. Therefore, we can extrapolate that in order to build an ethical AI – an AI that is congruent with our current interpretation of ethics – explainability would be one of the must-have features. Decision transparency, or understanding why an AI model predicts what it predicts, is critical to building a trustworthy and reliable AI system. In the preceding figure, you can see how an adversarial input can change the way an autonomous vehicle sees (or does not see) pedestrians. If there is an accident, an algorithm must be able to explain its action clearly in the state when the input was received – in an auditable, repeatable, and reproducible manner.

AI governance and model risk management are essential in today’s world, where AI is increasingly being used to make critical decisions that affect individuals and society as a whole. Without proper governance and risk management, AI systems could be biased, inaccurate, or unethical, leading to negative outcomes and loss of public trust. By ensuring that AI is developed, deployed, and used in a responsible and ethical manner, we can leverage its full potential to improve lives, advance research, and drive innovation. As AI researchers and practitioners, we have a responsibility to prioritize governance and risk management to create a better, more equitable future for everyone. This means that to have a safe, reliable, and trustworthy AI for human use, it must be safe, transparent, explainable, justifiable, robust, and ethical.

We have been using lots of big words, so let’s define what these terms really mean.

Key terminologies

Definitions are hard. Just ask Arvind Narayanan, associate professor of computer science at Princeton, whose aptly titled tutorial 21 fairness definitions and their politics 1 was a highlight at the Conference on Fairness, Accountability, and Transparency (FAT*). In his tutorial, Narayanan discussed the various fairness definitions in the context of machine learning and algorithmic decision-making, as well as the political and ethical implications of these definitions. By exploring 21 different fairness definitions, Narayanan aimed to demonstrate that fairness is a context-dependent, multifaceted concept that often requires careful consideration of ethical and societal values. The tutorial emphasized the importance of understanding the assumptions, trade-offs, and limitations associated with each definition, and he urged designers of algorithms to make informed decisions about which fairness definitions are most appropriate for a particular context.

As we attempt to define ethical AI, it is crucial to identify several core and contextual components. Ethical AI should be explainable, trustworthy, safe, reliable, robust, auditable, and fair, among numerous other aspects. Formal methods and definitions involve the use of accurate mathematical modeling and reasoning to draw rigorous conclusions. The challenge of formally defining explainability will soon become apparent – while there is a formal definition to verify a model’s adherence to differential privacy, quantifying explainability, trust, and ethics proves more nuanced. Consequently, the definitions presented here are imperfect representations of our current understanding of the subject. As taxonomies evolve and underlying semantics shift, we will strive to clarify some of the key terms to provide a clearer picture.

Explainability

Explainability refers to the ability of a machine learning algorithm to provide clear and understandable explanations for its decision-making process. While deep learning has made significant strides in areas such as computer vision and natural language processing, these models are often viewed as “black boxes” because their decision-making process is not always transparent. This lack of transparency can be a significant barrier to the adoption of deep learning models in certain areas, such as healthcare and finance, where the consequences of algorithmic decisions can be significant. As a result, developing methods to explain the reasoning of these models is critical for their wider adoption and success.

Explainability is one of those “-ilities” or non-functional requirements 3 – the quality of being explainable, 4 that is, being capable of giving a reason or cause for something. Explainability, therefore, can be seen as the ability to provide a reason or justification for an action or belief.

In simple terms, we can infer that if an event is explainable, it provides sufficient information to draw a conclusion as to why a particular decision was made. Explainable to whom? To a human. This doesn’t have to be a layperson, although that is preferable where possible; being explainable to a subject-matter expert (SME) is fine. The SME can then both assure non-expert users and explain to them, in a less technical manner, why a machine made such a decision. Human understanding is critical. Explainability is also required by law in certain protected domains, such as finance and housing.

Interpretability

Interpretability is another very closely related concept that is typically used interchangeably with explainability, but there are some subtle differences, which we will discuss shortly. Lipton did a detailed analysis to address model properties and techniques thought to confer interpretability and decided that, at present, interpretability has no formal technical meaning – well, that’s not very helpful. Informally, interpretability directly correlates with understandability or intelligibility (of a model) so that we as humans can understand how it works. Understandable models are transparent or white-box/glass-box models, whereas incomprehensible models are considered black boxes.

For the purpose of this discourse, interpretability is generally seen as a subset of explainability. Interpretability refers to the ability to understand the specific features or inputs that a model uses to make its predictions.

A system can be interpretable if we can find and illustrate cause and effect. An example would be the effect of temperature on crop yields. A crop will have an optimum temperature for its highest yield, so we can use temperature as a predictor (feature) of the crop yield (target variable). However, the relationship between temperature and crop yield will not be explainable until an understanding of the bigger picture is in place. In the same vein, a model can be transparent without being explainable. For instance, we can clearly see the following prediction function:

$\mathrm{Predict}(x_1, x_2) \rightarrow y'$ (1.1)

However, if we don’t know much about the input features:

$x_1$ and $x_2$ (1.2)

which might be a combination of several real-world features, the model is not explainable.
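To make the distinction concrete, here is a minimal Python sketch. The coefficients, the bias, and the helper names are hypothetical and chosen purely for illustration. Every arithmetic step of the prediction is visible, so the function is transparent, yet nothing in it tells us what x1 and x2 stand for in the real world.

```python
# Hypothetical, hand-picked parameters; not taken from any real model.
WEIGHTS = {"x1": 0.8, "x2": -1.5}
BIAS = 0.2


def predict(x1: float, x2: float) -> float:
    """Transparent: the whole computation can be read off directly."""
    return WEIGHTS["x1"] * x1 + WEIGHTS["x2"] * x2 + BIAS


def contributions(x1: float, x2: float) -> dict:
    """Per-term contributions to the prediction (they sum to predict(x1, x2))."""
    return {"x1": WEIGHTS["x1"] * x1, "x2": WEIGHTS["x2"] * x2, "bias": BIAS}


y_prime = predict(2.0, 1.0)
parts = contributions(2.0, 1.0)
print(y_prime, parts)
```

We can say exactly how much each input moved the prediction, but if x1 and x2 are opaque blends of several real-world features, that breakdown still does not explain the decision to a human SME.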

Also, a model can be explainable and transparent and still be biased. Explainability is not a guarantee of fairness, safety, or trust, nor of freedom from bias. It just ensures that you, as a human SME, can understand the model.

Explicability

The two terms explainability and explicability may appear the same, but in this context, they do differ. Explicability is the broader term, referring to the concept of transparency, communication, and understanding in machine learning, while explainability refers to the ability to provide clear and understandable reasons for how a given machine learning model makes its decisions.

Explicability is a term typically used in regulations and related governance documents. It literally means “capable of being explained,” and the EU ethical guidelines 5 deem it crucial for building and maintaining users’ trust in AI systems.

Does a safe system have to be explainable? In our opinion, yes, absolutely. There is an ongoing discussion among researchers on this topic; indeed, the first-ever “great AI debate” at the Neural Information Processing Systems (NeurIPS) conference was about whether interpretability is necessary for machine learning.

Note

At the time of writing, this debate has moved on. Since OpenAI launched ChatGPT in late 2022, there has been increasing awareness at governmental levels regarding the importance of AI assurance and regulatory guardrails. It seems likely that an international body overseeing AI regulation will be established. If this does not happen, individual countries and trading groups will regulate and govern AI at their own levels.

Safe and trustworthy

AI safety is an area that deals with nonfunctional requirements, such as reliability, robustness, and assurance. An AI system is deemed safe and trustworthy if it exhibits reliability, meaning that it acts within the desired ranges of outputs even when the inputs are new or lie in and around edge conditions. It also has to be robust: able to handle adversarial inputs (as shown in Figure 1.1) and not easily fooled into providing high-confidence predictions for unrecognizable images 7.
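A toy example can show why robustness to adversarial inputs is hard. The sketch below applies one step of the fast gradient sign method (a swapped-in, simpler technique than the universal perturbations of Figure 1.1) to a tiny logistic-regression classifier. The weights, the input, and the perturbation budget are all hypothetical values picked for illustration.

```python
import math

# Hypothetical weights of a tiny logistic-regression classifier.
W = [2.0, -3.0, 1.0]
B = 0.0


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x: list) -> float:
    """Probability of the positive class for input x."""
    return sigmoid(sum(w * xi for w, xi in zip(W, x)) + B)


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def fgsm(x: list, y_true: int, eps: float) -> list:
    """One fast-gradient-sign step.

    For logistic regression with log loss, the gradient of the loss with
    respect to the input is (p - y) * w, so no autodiff is needed here.
    """
    p = predict_proba(x)
    grad = [(p - y_true) * w for w in W]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


x = [0.5, 0.2, 0.1]                  # classified as positive
x_adv = fgsm(x, y_true=1, eps=0.2)   # each feature moves by at most 0.2

print(predict_proba(x) > 0.5, predict_proba(x_adv) > 0.5)
```

A change of at most 0.2 per feature is enough to flip the predicted class in this sketch, which is the kind of fragility a robust system has to withstand.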

The NeurIPS debate mentioned earlier highlights an ongoing discussion in the machine learning community about the trade-off between performance and interpretability. On one side, Rich Caruana and Patrice Simard argued that interpretability is essential to understand the reasoning behind machine learning models and ensure their responsible use. On the other, Kilian Weinberger and Yann LeCun argued that performance should be the main focus of machine learning research, since interpretability can sometimes compromise performance and may not be possible in highly complex deep learning models. The proponents maintained that explainable and interpretable machine learning models are essential to build trust and ensure the responsible use of AI in society (The Great AI Debate – NIPS2017 8).

A safe system should also be auditable, meaning it must be transparent enough that its internal state at the time a decision was made can be verified. This auditability is particularly important within regulated industries, such as health and finance, where those seeking to use AI for given applications will always need to be able to prove to a regulator that the machine learning models underpinning the AI meet the required regulatory standards.

The system and processes used within an enterprise to monitor the internal state of machine learning models and their underlying data must also be auditable. This ensures that tracing back to the AI components is possible, enabling a retrospective review such as root-cause analysis in a reliable manner. Such audit processes are increasingly being codified and built into enterprise MLOps platforms.
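One way to picture such an audit trail is a record written for every prediction. The following is a minimal sketch, not a description of any particular MLOps platform: the function name, the field names, and the example model are hypothetical. It fingerprints the input with SHA-256 so that a later review can confirm exactly which input a given model version saw.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(model_name, model_version, features, prediction, timestamp=None):
    """Build one audit-trail entry for a single prediction (hypothetical schema)."""
    # Canonical JSON (sorted keys) so the same input always hashes the same way.
    payload = json.dumps(features, sort_keys=True, separators=(",", ":"))
    return {
        "model_name": model_name,
        "model_version": model_version,
        "input_sha256": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
        "features": features,
        "prediction": prediction,
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
    }


record = audit_record(
    "credit_risk",            # hypothetical model name
    "1.4.2",                  # hypothetical version label
    {"income": 52000, "employment_type": "contract"},
    prediction=0.31,
)
print(json.dumps(record, indent=2))
```

Storing the model version alongside the input fingerprint is what makes a retrospective root-cause analysis repeatable: the same version can be reloaded and fed the same input.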

Privacy and security are also key components of a safe and trustworthy AI system. User data has specific contexts, needs, and expectations and should be protected accordingly during its entire life cycle.

The Stanford Center for AI Safety (http://aisafety.stanford.edu/) focuses on developing rigorous techniques to build safe and trustworthy AI systems and establish confidence in their behavior and robustness. The center’s white paper (https://aisafety.stanford.edu/whitepaper.pdf) by Kochenderfer et al. provides a great overview of AI safety and its related aspects, and it makes for good reading.

Fairness

Fairness in machine learning systems refers to the principle that decisions made by these systems should not discriminate or be biased against individuals or groups based on their race, gender, ethnicity, religion, or other personal characteristics. Fairness is about not showing implicit bias or unintended preference toward specific subgroups, features, or inputs. We mentioned previously a detailed tutorial on 21 fairness definitions and their politics9 at the Conference on Fairness, Accountability, and Transparency 10, but we will adhere to the EU’s draft guidelines, which correlate fairness with ensuring an equal and just distribution of both benefits and costs, ensuring that individuals and groups are free from unfair bias, discrimination, and stigmatization.

Microsoft’s Melissa Holland, in her post about our shared responsibility for AI, 11 defines fairness as follows:

“AI Models should treat everyone in a fair and balanced manner and not affect similarly situated groups of people in different ways.”

Machines may learn to discriminate for a variety of reasons, including skewed samples, tainted examples, limited features, sample size disparity, and proxies. This can lead to disparate treatment of the users. As the implicit bias seeps into the data, this can lead to serious legal ramifications, especially in regulated domains such as credit (Equal Credit Opportunity Act), education (Civil Rights Act of 1964 and Education Amendments of 1972), employment (Civil Rights Act of 1964), housing (Fair Housing Act), and public accommodation (Civil Rights Act of 1964).

The protected classes that cannot be discriminated against include race (Civil Rights Act of 1964), color (Civil Rights Act of 1964), sex (Equal Pay Act of 1963 and Civil Rights Act of 1964), religion (Civil Rights Act of 1964), national origin (Civil Rights Act of 1964), citizenship (Immigration Reform and Control Act), age (Age Discrimination in Employment Act of 1967), pregnancy (Pregnancy Discrimination Act), familial status (Civil Rights Act of 1968), disability status (Rehabilitation Act of 1973 and Americans with Disabilities Act of 1990), veteran status (Vietnam Era Veterans’ Readjustment Assistance Act of 1974 and Uniformed Services Employment and Reemployment Rights Act), and genetic information (Genetic Information Nondiscrimination Act).

In addition to the laws in the United States, there are also laws elsewhere aimed at ensuring fairness. The European Union’s General Data Protection Regulation (GDPR) restricts decisions based solely on automated processing that significantly affect individuals and calls for safeguards against discriminatory effects. The Equality Act 2010 in the United Kingdom prohibits discrimination based on protected characteristics, which encompass age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex, and sexual orientation. These laws are designed to prevent discrimination and promote fairness, including in decisions made by machine learning systems.

In the context of Arvind Narayanan’s tutorial, an example of the incompatibility of different fairness definitions is illustrated using two fairness metrics: statistical parity, $P(\hat{Y} = 1 \mid A = a) = P(\hat{Y} = 1)$ for all $a \in \{0, 1\}$, and equalized odds, $P(\hat{Y} = 1 \mid Y = y, A = a) = P(\hat{Y} = 1 \mid Y = y)$ for all $a \in \{0, 1\}$ and $y \in \{0, 1\}$. Here, $\hat{Y}$ is the model’s prediction, $Y$ is the true outcome, and $A$ is the protected attribute. These definitions can be incompatible when the base rates of positive outcomes, $P(Y = 1 \mid A = a)$, differ between the two demographic groups. In such a scenario, a predictor that carries any information about $Y$ cannot satisfy both definitions simultaneously; only a predictor whose true positive rate equals its false positive rate, and which is therefore uninformative, can do so. Adjusting the algorithm to achieve statistical parity might result in unequal true positive rates and false positive rates across groups, violating equalized odds. Conversely, ensuring equalized odds can lead to a different proportion of positive outcomes between the groups, violating statistical parity. This example demonstrates that satisfying multiple fairness definitions at the same time may not always be possible, highlighting the need for careful consideration of trade-offs and context when selecting appropriate fairness definitions.
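The tension between the two metrics can be checked numerically. The sketch below uses a made-up dataset of eight individuals (an assumption for illustration, not data from the tutorial) in which group 0 has a base rate of 0.5 and group 1 has a base rate of 0.25. The helper names are hypothetical.

```python
def rate(values):
    """Share of 1s in a list of 0/1 values."""
    return sum(values) / len(values)


def fairness_gaps(y_true, y_pred, group):
    """Absolute between-group gaps for statistical parity and equalized odds."""
    def pos_rate(a, y=None):
        # P(Y_hat = 1 | A = a), optionally also conditioned on Y = y.
        return rate([p for t, p, g in zip(y_true, y_pred, group)
                     if g == a and (y is None or t == y)])
    return {
        "statistical_parity_gap": abs(pos_rate(0) - pos_rate(1)),
        "tpr_gap": abs(pos_rate(0, y=1) - pos_rate(1, y=1)),
        "fpr_gap": abs(pos_rate(0, y=0) - pos_rate(1, y=0)),
    }


# Toy data: base rate 0.5 in group 0 and 0.25 in group 1.
group  = [0, 0, 0, 0, 1, 1, 1, 1]
y_true = [1, 1, 0, 0, 1, 0, 0, 0]

# A perfect predictor satisfies equalized odds but not statistical parity.
perfect = fairness_gaps(y_true, y_true, group)

# Forcing equal positive rates adds a false positive in group 1,
# which restores parity but breaks equalized odds.
y_parity = [1, 1, 0, 0, 1, 1, 0, 0]
adjusted = fairness_gaps(y_true, y_parity, group)

print(perfect)
print(adjusted)
```

In this sketch, the perfect predictor has a statistical parity gap of 0.25 with zero true and false positive rate gaps, while the parity-adjusted predictor closes the parity gap at the cost of a false positive rate gap of one third.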

In practice, the fairness of an AI system also has a lot to do with accountability – “the ability to contest and seek effective redress against decisions made by AI systems and by the humans operating them.” The EU’s ethics guidelines for trustworthy AI 12 recommend that the entity responsible for an unfair decision be identifiable and held accountable, and that the decision-making processes be explicable.

Ethics

Ethics are at the core of responsible AI development. Ethics in machine learning fairness refers to the set of principles and values that guide the development and use of machine learning systems to ensure that they are just, equitable, and unbiased. This includes ensuring that machine learning models are developed using representative and unbiased datasets, that the features used in a model are relevant and fair, and that algorithms are evaluated for any unintended consequences or biases.

Ethics are defined as “moral principles that govern a person’s behavior or the conducting of an activity” (Oxford English Dictionary 13). The goal of ethics in machine learning fairness is to ensure that these systems are designed and deployed in a way that is consistent with our values, and that they promote the well-being of society as a whole. This includes considering the potential impacts of these systems on different groups of people and ensuring that they do not perpetuate or exacerbate existing inequalities and biases. Morals often describe your particular values concerning what is right and what is wrong. While ethics can refer broadly to moral principles, you often see it applied to questions of correct behavior within a relatively narrow area of activity.

Even though the two terms are often used interchangeably, morals are individual beliefs about what is right or wrong, while ethics are a set of principles and values that are shared by a group or profession and are intended to guide behavior in a particular context. Hence, instead of “moral-AI,” it makes sense to strive to build ethical AI practices to ensure that machine learning systems are designed and deployed in a way that is both technically sound and socially responsible.

In the following sections, you will see several definitions of what constitutes an ethical AI. Despite the growing attention to ethical considerations in AI, there is still no clear consensus on what constitutes “ethical AI.” This lack of agreement is due to a number of factors – the rapidly evolving nature of AI technologies, the complexity of the ethical issues involved, and the diverse range of stakeholders with differing interests and values.

This raises an important question, as posed by Gray Scott, an expert in the philosophy of technology, digital consciousness, and humanity’s technological advancements:

“The real question is, when will we draft an AI bill of rights? What will that consist of? And who will get to decide that?”

Eileen Chamberlain Donahoe, the executive director of the Global Digital Policy Incubator at Stanford University’s Center for Democracy and the first US ambassador to the United Nations Human Rights Council, offers a potential answer to the question of AI ethics and safety standards that are both enforceable and accountable. According to Donahoe, the answer may already be found in the Universal Declaration of Human Rights (UDHR) and a series of international treaties that outline the civil, political, economic, social, and cultural rights envisioned by the UDHR. This perspective has a wide global consensus and could be suitable for the purpose of regulating AI in the short term.

Transparency

Model transparency refers to the ability to understand and explain how a machine learning model works and how it arrived at its predictions or decisions.

Model transparency, explainability, and interpretability are related but distinct concepts in responsible AI. Model transparency refers to the degree of visibility and understandability of a model’s inner workings, including input, output, and processing steps. Model explainability aims to provide human-understandable reasons for a model’s output, while model interpretability goes deeper to allow humans to understand a model’s internal processes. Achieving model transparency can involve methods such as model interpretation, data and process transparency, and clear documentation. While all three concepts are important in responsible AI, not all transparent or explainable models are necessarily interpretable.

Keeping humans in the loop for decision support systems

Imagine the following conversation:

Physician: “We believe the best course of action for you requires surgery, and this may lead to amputation of your leg.”

Patient: “Really? That’s quite bleak, but why?”

Physician: “Because, well, mainly because our treatment algorithm said so!”

As you can imagine, this conversation is unlikely to go smoothly. Without specific details about why surgery is necessary, along with case studies, assurance of potentially high success rates (with caveats, of course), and empathetic human reinforcement, the patient will likely remain unconvinced.

That’s why keywords such as augmentation and support play crucial roles, as they emphasize the importance of human involvement in heavily regulated and human-centric systems. While a model providing recommendations may be acceptable in many situations, it cannot wholly replace human decision-making. The complete autonomy of AI models may be challenging to accept due to potential regulatory, compliance, or legal consequences. It is essential to keep humans in the loop for oversight and reinforcement of correct behavior, at least for now, to ensure that AI is used responsibly and ethically.

Model governance

Model governance refers to the process of managing and overseeing the development, deployment, and maintenance of machine learning models in an organization. It involves setting policies, standards, and procedures to ensure that models are developed and used in a responsible, ethical, and legally compliant way.

Model governance is necessary because machine learning models can have significant impacts on individuals, businesses, and society as a whole. Models can be used to make decisions about credit, employment, healthcare, and other critical areas, so it is important to ensure that they are reliable, accurate, and fair.

The key components of model governance include the following:

Model inventory and documentation: Keeping an up-to-date inventory of all models in use and their relevant documentation, including details about their data sources, training methodologies, performance metrics, and other relevant information
Model monitoring and performance management: Monitoring models in production to ensure that they continue to perform as expected, and implementing systems to manage model performance, such as early warning systems and automated retraining
Model life cycle management: Establishing clear processes and workflows for the entire life cycle of a model, from development to decommissioning, including procedures for model updates, versioning, and retirement
Model security and data privacy: Ensuring that models and their associated data are secure and protected against cyber threats and that they comply with relevant data privacy regulations, such as GDPR and CCPA
Model interpretability and explainability: Implementing methods to ensure that models are interpretable and explainable, enabling users to understand how a model works and how it arrived at its output
Model bias and fairness management: Implementing measures to identify and mitigate bias in models and ensure that models are fair and unbiased in their decision-making
Model governance infrastructure and support: Establishing an organizational infrastructure and providing the necessary support, resources, and training to ensure effective model governance, including dedicated teams, governance policies, and training programs
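The first component above, the model inventory, can be pictured as a simple record per model plus a check for documentation gaps. The following is a minimal sketch under assumed field names; the schema, the status values, and the example model are hypothetical and not prescribed by any standard.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class ModelRecord:
    """One entry in a model inventory (hypothetical schema)."""
    name: str
    version: str
    owner: str
    data_sources: List[str]
    training_method: str
    metrics: Dict[str, float] = field(default_factory=dict)
    last_reviewed: Optional[date] = None
    status: str = "development"  # e.g. development, production, retired


def missing_documentation(record: ModelRecord) -> List[str]:
    """Return the names of documentation fields that are still empty."""
    gaps = []
    if not record.data_sources:
        gaps.append("data_sources")
    if not record.training_method:
        gaps.append("training_method")
    if not record.metrics:
        gaps.append("metrics")
    if record.last_reviewed is None:
        gaps.append("last_reviewed")
    return gaps


churn_model = ModelRecord(
    name="customer_churn",
    version="2.0.1",
    owner="analytics-team",
    data_sources=["crm_export"],
    training_method="gradient-boosted trees",
    metrics={"auc": 0.87},
)
print(missing_documentation(churn_model))  # the review date is still missing
```

A check like this could run before a model is promoted to production, so that undocumented models are flagged rather than silently deployed.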

Enterprise risk management and governance

In this section, we will discuss how the monitoring and management of risk associated with AI should be recognized as one part of an enterprise’s risk management and governance framework.

Given the relative youth of the use of AI within a business (compared to, say, offices, computers, and data warehouses), the risk management of AI is not necessarily an established process for many enterprises. While regulated business sectors such as financial services and healthcare will be familiar with ensuring their machine learning models adhere to a regulator’s rules, this will not be the case for other enterprises in other, currently unregulated, business areas.

Enterprise risk governance is a critical process that involves identifying, assessing, and managing risks throughout an organization or enterprise. It requires implementing effective policies, procedures, and controls to mitigate risks and ensure that the organization operates in a safe, secure, and compliant manner.

The primary objective of enterprise risk governance is to enable an organization to develop a comprehensive understanding of its risks and manage them effectively. This encompasses identifying and assessing risks related to the organization’s strategic objectives, financial performance, operations, reputation, and compliance obligations. Establishing a risk management framework is a typical approach to enterprise risk governance, which involves developing policies and procedures for risk identification, assessment, and mitigation. It also involves assigning responsibility for risk management to specific individuals or teams within the organization.

To maintain effective enterprise risk governance, the ongoing monitoring and evaluation of risk management practices are necessary. This ensures that an organization can respond to emerging risks promptly and efficiently. Furthermore, regular reporting to stakeholders such as executives, board members, and regulators is vital to ensure they are informed about the organization’s risk profile and risk management activities.

Tools for enterprise risk governance

There are several enterprise risk governance frameworks and tools available to help organizations implement effective risk management practices. One commonly used framework is the ISO 31000:2018 standard, which provides guidelines for risk management principles, frameworks, and processes. Other frameworks include COSO’s ERM (Enterprise Risk Management) and the NIST Cybersecurity Framework. There are also COBIT (Control Objectives for Information and Related Technology), ITIL (Information Technology Infrastructure Library), and PMBOK (Project Management Body of Knowledge), which provide guidance to manage risks related to information technology, service management, and project management, respectively.

Risk management tools, such as risk registers, risk heat maps, and risk scoring models, can also be used to help organizations identify and assess risks. These tools can help prioritize risks based on their likelihood and potential impact, enabling organizations to develop appropriate risk mitigation strategies.
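To make the idea of likelihood-and-impact prioritization concrete, here is a minimal sketch of a risk register with a simple scoring model. The 1-to-5 scales, the band thresholds, and the names `Risk`, `heat_band`, and `prioritize` are illustrative assumptions rather than part of any particular framework; a real organization would calibrate them to its own risk appetite.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (negligible) to 5 (severe)

    @property
    def score(self) -> int:
        # Classic scoring model: likelihood multiplied by impact
        return self.likelihood * self.impact


def heat_band(score: int) -> str:
    # Hypothetical heat-map thresholds, chosen only for illustration
    if score >= 15:
        return "high"
    if score >= 6:
        return "medium"
    return "low"


def prioritize(register):
    # Highest-scoring risks first
    return sorted(register, key=lambda r: r.score, reverse=True)


register = [
    Risk("Vendor API outage", likelihood=4, impact=2),
    Risk("Biased candidate screening", likelihood=2, impact=5),
    Risk("Model drift in defect detection", likelihood=3, impact=4),
]

for risk in prioritize(register):
    print(f"{risk.name}: score={risk.score}, band={heat_band(risk.score)}")
```

Sorting by score gives a ranked register, and the band function maps each score to a cell color on a heat map.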

Technology solutions, such as GRC (governance, risk, and compliance) platforms, can also aid in enterprise risk governance by providing a centralized system to manage risks and ensure compliance with relevant regulations and standards. AI-powered risk management tools are also becoming increasingly popular, as they can help organizations identify and mitigate risks more efficiently and effectively.

AI risk governance in the enterprise

Within an enterprise, AI risk governance is the set of processes that ensures the use of AI does not have a detrimental impact on the business. There are many ways such harm could occur, including the following:

AI used in selection processes, such as the automated sifting of job candidates within HR, is biased and screens applicants with prejudice
Automated defect monitoring of a manufacturing process in a tire factory accepts defective tire walls (or, conversely, rejects sound tire walls) due to drift in the underlying ML model
Credit is refused to an applicant of a loan company because a credit-risk model inappropriately rejects them on the grounds of their employment type

These are just three examples; there are many more. Such adverse outcomes can potentially cause harm to a business, its customers, and other stakeholders, and at the very least, they can have a reputational impact on the business.
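The drift scenario in the tire-factory example is one that can be monitored automatically. As a rough sketch, the snippet below computes the population stability index (PSI), one commonly used drift statistic, between a baseline sample of a feature and a recent sample. The choice of PSI, the function name `psi`, the bin count, and the rule-of-thumb threshold of 0.25 are our assumptions for illustration; the appropriate statistic and threshold depend on the use case.

```python
import math


def psi(expected, actual, bins=10):
    """Population stability index between a baseline and a recent sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if baseline is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp values outside the baseline range into the edge bins
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor at a tiny value so empty bins do not cause log(0)
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical feature values: training-time baseline vs. shifted production data
baseline = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]

drift = psi(baseline, shifted)
# 0.25 is a commonly quoted rule-of-thumb threshold, not a universal standard
print(f"PSI = {drift:.2f}, investigate = {drift > 0.25}")
```

A check like this could run on a schedule for each model input, with breaches logged against the corresponding entry in the risk register.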

Enterprise risk management is all about managing the risks (ideally, before they become issues) in order to yield business benefits, and AI is no different. AI risk governance is a crucial process that involves managing and mitigating the risks that arise from the development and deployment of AI models within an organization or enterprise. Although the use of AI technologies in business processes can result in significant benefits, it can also introduce new risks and challenges that require prompt attention.

Effective enterprise AI risk governance entails identifying and assessing potential risks associated with the use of AI, including data privacy concerns, algorithmic bias, cybersecurity threats, and legal and regulatory compliance issues. Furthermore, it involves implementing policies, procedures, and technical safeguards to manage these risks, such as model explainability and transparency, data governance, and robust testing and validation processes.

By adopting a sound enterprise AI risk governance strategy, organizations can ensure that their AI technologies are deployed safely and responsibly. Such governance practices ensure that AI models are transparent, auditable, and accountable, and that they do not introduce unintended harm to individuals or society. Additionally, effective governance strategies help organizations to build trust in their AI systems, minimize reputational risks, and maximize the potential of AI technologies in their operations.

Perpetuating bias – the network effect

Bias exists in human decision-making, so why is it so bad if algorithms take this bias and reflect it in their decisions?

The answer lies in amplification through the network effect. Think bigot in the cloud!

An unfair society inevitably yields unfair models. As much as we like to think we are fair and free of subconscious judgments, we as humans are prone to negative (and positive) implicit bias, stereotyping, and prejudice. Implicit (unconscious) bias is not intentional, but it can still impact how we judge others based on a variety of factors, including gender, race, religion, culture, language, and sexual orientation. Now, imagine this as part of a web-based API – a service offered in the spirit of democratization of AI – on a popular machine learning acceleration platform to speed up development, with this bias proliferated across multiple geographies and demographics! Bias in the cloud is a serious concern.

Figure 1.2: A list of implicit biases

Blaming this unfairness on society is one way to handle this (albeit not a very good one!), but considering the risk of perpetuating biases in algorithms that may outlive us all, we must strive to eliminate these biases without compromising prediction accuracy. By examining today’s data on Fortune 100 CEOs’ profiles, we can see that a model that merely reinforces existing biases based on features such as gender and race could lead to erroneous judgments, overlooked talent, and potential discrimination. For instance, if we have historically declined loans to minorities and people of color, using a dataset built on these prejudiced and bigoted practices to create a model will only serve to reinforce and perpetuate such unfair behavior.

On top of that, we miss a great opportunity – to address our own biases before we codify them in perpetuity.

The problem with delegating our decisions to machines with our biases intact is that these algorithms would perpetuate gender, affinity, attribution, conformity, and confirmation biases, along with the halo and horn effects, and so reinforce our collective stereotypes. Today, when algorithms act as the first line of triage, minorities have to “whiten” job résumés (see Minorities Who “Whiten” Job Resumes Get More Interviews – Harvard Business School Working Knowledge 14) to get more interviews. Breaking this cycle of bias amplification and correcting the network effect in a fair and ethical manner is one of the greatest challenges of our digital times.
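One way to see whether a historical loan dataset carries the kind of bias described here is to compare approval rates across groups before training on it. The sketch below computes a disparate impact ratio on made-up data; the group labels, the counts, and the helper names `approval_rates` and `disparate_impact` are hypothetical. The 0.8 cut-off follows the widely cited four-fifths rule of thumb and is an assumption, not a legal test.

```python
from collections import defaultdict


def approval_rates(decisions):
    """decisions: iterable of (group, approved) pairs."""
    totals, approved = defaultdict(int), defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        approved[group] += int(ok)
    return {g: approved[g] / totals[g] for g in totals}


def disparate_impact(decisions, protected, reference):
    # Ratio of approval rates; assumes the reference group's rate is nonzero
    rates = approval_rates(decisions)
    return rates[protected] / rates[reference]


# Fabricated historical decisions, for illustration only
history = (
    [("group_a", True)] * 60 + [("group_a", False)] * 40
    + [("group_b", True)] * 30 + [("group_b", False)] * 70
)

ratio = disparate_impact(history, protected="group_b", reference="group_a")
# Four-fifths rule of thumb: a ratio below 0.8 warrants a closer look
print(f"disparate impact ratio = {ratio:.2f}, flag = {ratio < 0.8}")
```

A model trained to imitate these labels would likely reproduce the gap, which is why such a check belongs before training as well as after deployment.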

Transparency versus black-box apologetics – advocating for AI explainability

We like to think transparency and interpretability are good – it seems very logical to assume that if we can understand what algorithms are doing, it helps us troubleshoot, debug, measure, improve, and build upon them easily. With all the virtues described previously, you would imagine interpretability is a no-brainer. Surprise! It is not without its critics. Explainable AI and uninterpretable (black-box) AI represent two opposing viewpoints in the field. Proponents of explainable AI argue that it enhances transparency, trustworthiness, and regulatory compliance. In contrast, supporters of uninterpretable AI maintain that it can lead to better performance in complex and opaque systems, while also protecting intellectual property. Interestingly, not everyone is a big fan of interpretability, including some of the greatest minds of our times, such as Turing Award winners Yoshua Bengio and Yann LeCun.

This important argument was the centerpiece of the first-ever great debate 15 at a NeurIPS conference, where Rich Caruana and Patrice Simard argued in favor of the proposition that interpretability is necessary for machine learning, while Kilian Weinberger and Yann LeCun argued against it. The debate reflects the ongoing discussion in the machine learning community regarding the trade-off between performance and interpretability.

Researchers and practitioners who consider black-box AI models acceptable often emphasize the performance benefits of these models, which have demonstrated state-of-the-art results in various complex tasks. They argue that the high accuracy achieved by black-box models can outweigh the need for interpretability, particularly when solving intricate problems. Proponents also contend that real-world complexity necessitates embracing the intricacy of black-box models to capture the nuances of the problem at hand. They assert that domain experts can still validate the model’s output and use their expertise to determine whether the model’s predictions are reasonable, even if the model itself is not fully interpretable.

Conversely, critics tell the joke, “Why did the black-box AI cross the road? Nobody knows, as it won’t explain itself!”

But seriously, we should emphasize the importance of ethics and fairness, as a lack of interpretability may lead to unintended biases and discrimination, undermining trust in the AI system. We should also stress the importance of accountability and transparency, as it is crucial for users and stakeholders to understand the decision-making process and the factors influencing a model’s output. We would also argue that model interpretability is vital to debugging and improving models, as identifying and correcting issues in black-box models can be challenging. Regulatory compliance often requires a level of interpretability to ensure that AI systems abide by legal requirements and ethical guidelines, which would be virtually impossible if a model couldn’t explain itself.

In a Wired interview titled Google’s AI Guru Wants Computers to Think More Like Brains 16, Turing Award winner and father of modern neural networks, Geoff Hinton, stated the following:

“I’m an expert on trying to get the technology to work, not an expert on social policy. One place where I do have technical expertise that’s relevant is [whether] regulators should insist that you can explain how your AI system works. I think that would be a complete disaster.”

This is a fairly strong statement that was met with a rebuttal in an article 17 in which the counterargument focused on what was best for humanity and what it means for society. The way we see it, there is room for both. In the article In defense of the black box, Holm 18 states the following:

“...we cannot use blackbox AI to find causation, systemization, or understanding and these questions remain in purview of human intelligence. On the contrary, blackbox methods can contribute substantively and productively to science, technology, engineering, and math.”

For most practitioners, the goal is to strike a balance between transparency and performance that satisfies the needs of various stakeholders, including users, regulators, and developers. The debate continues, with different researchers offering diverse perspectives based on their fields of expertise and research focus.

As professionals in the field of machine learning, we emphasize the importance of transparent, interpretable, and explainable outcomes to ensure the reliability of AI systems. Consequently, we are hesitant to rely on “black-box” models that offer no insight into their decision-making processes. Although some argue that accuracy and performance are sufficient to establish trust in AI systems, we maintain that interpretability is crucial. We recognize the ongoing debate regarding the role of interpretability in machine learning, but our position favors interpretability over a singular focus on outcomes – your mileage may vary (YMMV) 19.

The AI alignment problem

The AI alignment problem has become increasingly relevant in recent years due to the rapid advancements in AI and its growing influence on various aspects of society. This problem refers to the challenge of designing AI systems that align with human values, goals, and ethics, ensuring that these systems act in the best interests of humanity.

One reason for the growing attention to the AI alignment problem is the potential for AI systems to make high-stakes decisions, which may involve trade-offs and ethical dilemmas. A classic example is the trolley problem, where an AI-controlled vehicle must choose between two undesirable outcomes, such as saving a group of pedestrians at the cost of harming its passengers. This ethical dilemma highlights the complexity of aligning AI systems with human values and raises questions about the responsibility and accountability of AI-driven decisions.

In addition to this, there are a few other significant challenges to AI alignment, including containment and the do anything now (DAN) problem. The containment problem refers to the challenge of ensuring that an AI system does not cause unintended harm or escape from its intended environment. This problem is particularly important when dealing with AI systems that have the potential to cause significant harm, such as military or medical AI systems. The DAN problem, on the other hand, refers to the challenge of ensuring that an AI system does not take actions that are harmful to humans or society, even if those actions align with the system’s goals. The paperclip problem is a thought experiment that illustrates this.

In this scenario, an AI system is designed to maximize the production of paperclips. The system becomes so focused on this goal that it converts all matter on Earth, including humans, into paperclips. Related challenges include reward hacking, which occurs when an AI system finds a way to achieve its goals that does not align with human values; corrigibility, which relates to ensuring that an AI system can be modified or shut down if it becomes harmful or deviates from its intended behavior; and the superintelligence control problem, which involves ensuring that advanced AI systems with the potential for superintelligence are aligned with human values and can be controlled if they become a threat.

Addressing these challenges and other AI alignment-related problems is crucial to ensure the safe and responsible development of AI systems, promote their beneficial applications, and prevent unintended harm to individuals and society.

Summary

This chapter provided an overview of the importance of developing appropriate governance frameworks for AI. The issue of automating bias in AI is a critical concern that requires urgent attention. Without appropriate governance frameworks, we risk exacerbating these problems and perpetuating societal inequalities. In this chapter, we outlined key terminologies such as explainability, interpretability, fairness, explicability, safety, trustworthiness, and ethics that play an important role in developing effective AI governance frameworks. Developing effective governance frameworks requires a comprehensive understanding of these concepts and their interplay.

We also explored the issue of automating bias and how the network effect can exacerbate these problems. The chapter highlighted the need for explainability and offered a critique of “black-box apologetics,” the position that AI models need not be interpretable. Ultimately, the chapter made a strong case for the importance of AI governance and the need to ensure that AI is developed and deployed in an ethical and responsible manner. This is crucial to build trust in AI and ensure that its impacts are aligned with our societal goals and values.

The next chapter is upon us, like a towel in the hands of a galactic hitchhiker, always ready for the next adventure.

References and further reading

https://fairmlbook.org/tutorial2.html
https://fairmlbook.org/tutorial2.html
Nonfunctional requirements verb: https://en.wikipedia.org/wiki/List_of_system_quality_attributes
https://www.Merriam-webster.com/thesaurus/explainable
Ethics guidelines for trustworthy AI. The umbrella term implies that the decision-making process of AI systems must be transparent, and the capabilities and purpose of the systems must be openly communicated to those affected. Even though it may not always be possible to provide an explanation for why a model generated a particular output or decision, efforts must be made to make the decision-making process as clear as possible. When the decision-making process of a model is not transparent, it is referred to as a “black box” algorithm and requires special attention. In these cases, other measures such as traceability, auditability, and transparent communication on system capabilities may be required.
Even though the terms might sound similar, explicability refers to a broader concept of transparency, communication, and understanding in machine learning, while explainability is specifically focused on the ability to provide clear and understandable explanations for how a model makes its decisions. While explainability is a specific aspect of explicability, explicability encompasses a wider range of measures to ensure the decision-making process of a machine learning model is understood and trusted.
Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images: https://arxiv.org/abs/1412.1897
https://www.youtube.com/watch?v=93Xv8vJ2acI
https://fairmlbook.org/tutorial2.html
https://fairmlbook.org/tutorial2.html
https://blogs.partner.microsoft.com/mpn/shared-responsibility-ai-2/
https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai
https://en.oxforddictionaries.com/definition/ethics
https://hbswk.hbs.edu/item/minorities-who-whiten-job-resumes-get-more-interviews
Interpretability is necessary for Machine Learning: https://www.youtube.com/watch?v=93Xv8vJ2acI
https://www.wired.com/story/googles-ai-guru-computers-think-more-like-brains/
Geoff Hinton Dismissed The Need For Explainable AI: Experts Explain Why He’s Wrong: https://www.forbes.com/sites/cognitiveworld/2018/12/20/geoff-hinton-dismissed-the-need-for-explainable-ai-8-experts-explain-why-hes-wrong
In defense of the black box: https://pubmed.ncbi.nlm.nih.gov/30948538/
https://dictionary.cambridge.org/us/dictionary/english/ymmv
Interpretable Machine Learning by Christoph Molnar: https://christophm.github.io/interpretable-ml-book/
Explainable AI: Interpreting, Explaining and Visualizing Deep Learning by Wojciech Samek, et al: https://books.google.co.in/books?id=j5yuDwAAQBAJ
Fairness and Machine Learning by Solon Barocas, Moritz Hardt, and Arvind Narayanan: https://fairmlbook.org/
The Ethics of AI by Nick Bostrom and Eliezer Yudkowsky: https://intelligence.org/files/EthicsofAI.pdf
Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy by Cathy O’Neil: https://www.goodreads.com/book/show/29981085-weapons-of-math-destruction
Explainable AI (XAI) by Defense Advanced Research Projects Agency (DARPA): https://www.darpa.mil/program/explainable-artificial-intelligence

Responsible AI in the Enterprise: Practical AI risk management for explainable, auditable, and safe models with hyperscalers and Azure OpenAI
