
You're reading from  The Machine Learning Solutions Architect Handbook - Second Edition

Product type: Book
Published in: Apr 2024
Publisher: Packt
ISBN-13: 9781805122500
Edition: 2nd Edition
Author: David Ping

David Ping is an accomplished author and industry expert with over 28 years of experience in the field of data science and technology. He currently serves as the leader of a team of highly skilled data scientists and AI/ML solutions architects at AWS. In this role, he assists organizations worldwide in designing and implementing impactful AI/ML solutions to drive business success. David's extensive expertise spans a range of technical domains, including data science, ML solution and platform design, data management, AI risk, and AI governance. Prior to joining AWS, David held positions at renowned organizations such as JPMorgan, Credit Suisse, and Intel Corporation, where he contributed to the advancement of science and technology through engineering and leadership roles. With his wealth of experience and diverse skill set, David brings a unique perspective and invaluable insights to the field of AI/ML.

Bias, Explainability, Privacy, and Adversarial Attacks

In the previous chapter, we explored AI risk management frameworks and discussed their importance in mitigating the risks associated with AI systems. We covered the core concepts, the importance of identifying and assessing risks, and recommendations for managing those risks. In this chapter, we will take a more in-depth look at several specific risk topics and the technical techniques for mitigating them. We will explore the essential areas of bias, explainability, privacy, and adversarial attacks, and how they relate to AI systems. These are some of the most pertinent areas of responsible AI practice, and it is important for ML practitioners to develop a foundational understanding of these topics and their technical solutions. Specifically, we will examine how bias can lead to unfair and discriminatory outcomes, and how explainability can enhance the transparency and accountability of AI systems. We will also...

Understanding bias

Detecting and mitigating bias is a crucial focus area for AI risk management. The presence of bias in ML models can not only expose an organization to potential legal risks but also lead to negative publicity, causing reputational damage and public relations issues. Specific laws and regulations, such as the Equal Credit Opportunity Act, prohibit discrimination in business transactions, such as credit transactions, based on race, skin color, religion, sex, national origin, marital status, and age. Other examples of anti-discrimination laws include the Civil Rights Act of 1964 and the Age Discrimination in Employment Act of 1967.

ML bias can result from underlying prejudice in the data. Since ML models are trained on data, if the data is biased, the trained model will also exhibit biased behavior. For example, if you build an ML model to predict the loan default rate as part of the loan application review process, and you use race as one of the features...
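One practical way to surface this kind of data bias is to measure it before training. The sketch below, using a small hypothetical loan dataset (not from the book's lab), computes two common pre-training metrics: class imbalance (CI), the relative difference in group sizes, and the difference in positive proportions of labels (DPL), which captures how much more often one group receives the favorable label.

```python
# Pre-training bias metrics on a hypothetical loan-approval dataset.
# Each record is (group, label), where label 1 = loan approved.
records = [
    ("A", 1), ("A", 1), ("A", 1), ("A", 0),   # group A: 3/4 approved
    ("B", 1), ("B", 0), ("B", 0), ("B", 0),   # group B: 1/4 approved
]

n_a = sum(1 for g, _ in records if g == "A")
n_b = sum(1 for g, _ in records if g == "B")
pos_a = sum(y for g, y in records if g == "A")
pos_b = sum(y for g, y in records if g == "B")

# Class imbalance (CI): 0 means the groups are equally represented.
ci = (n_a - n_b) / (n_a + n_b)

# Difference in positive proportions of labels (DPL): how much more
# often group A carries the favorable label than group B.
dpl = pos_a / n_a - pos_b / n_b

print(f"CI = {ci:.2f}, DPL = {dpl:.2f}")  # a large DPL flags label bias
```

Here the groups are equally sized (CI = 0), yet DPL = 0.5, so a model trained on these labels would likely reproduce the disparity.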

Understanding ML explainability

There are two main concepts when it comes to explaining the behaviors of an ML model:

  • Global explainability: This is the overall behavior of a model across all the data points used for model training and/or prediction. It helps in understanding collectively how different input features affect the outcome of model predictions. For example, after training an ML model for credit scoring, you might determine that income is the most important feature in predicting high credit scores across all loan applicants.
  • Local explainability: This is the behavior of a model for a single data point (instance), and which features had the most influence on the prediction for a single data point. For example, when you try to explain which features influenced the decision the most for a single loan applicant, it might turn out that education was the most important feature, even though income was the most important feature at the global level.
  • ...
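For a linear model, both views can be computed directly, which makes the distinction concrete. The sketch below uses hypothetical weights and a hypothetical applicant (not from the book): with standardized features, global importance can be ranked by weight magnitude, while each feature's contribution for a single prediction is its weight times that applicant's feature value.

```python
# Global vs. local explanation for a toy linear credit-scoring model.
# score = w . x; with standardized features, |w_i| ranks global
# importance, and w_i * x_i is feature i's contribution for one applicant.
features = ["income", "education", "debt"]
weights = {"income": 2.0, "education": 0.8, "debt": -1.2}

# Global view: which feature matters most across all applicants?
global_rank = sorted(features, key=lambda f: abs(weights[f]), reverse=True)
print("global importance:", global_rank)   # income ranks first

# Local view: an applicant with low income but high education.
applicant = {"income": 0.1, "education": 2.5, "debt": 0.2}
contrib = {f: weights[f] * applicant[f] for f in features}
local_top = max(contrib, key=lambda f: abs(contrib[f]))
print("local top feature:", local_top)     # education dominates here
```

This mirrors the example in the bullets above: income is the most important feature globally, yet for this one applicant, education drives the prediction.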

Understanding security and privacy-preserving ML

ML models often rely on vast amounts of data, including potentially sensitive information about individuals, such as personal details, financial records, medical histories, or browsing behavior. The improper handling or exposure of this data can lead to serious privacy breaches, putting individuals at risk of discrimination, identity theft, or other harmful consequences. To ensure compliance with data privacy regulations or even internal data privacy controls, ML systems need to provide foundational infrastructure security features such as data encryption, network isolation, compute isolation, and private connectivity. With a SageMaker-based ML platform, you can enable the following key security controls:

  • Private networking: As SageMaker is a fully managed service, it runs in an AWS-owned account. By default, resources in your own AWS account communicate with SageMaker APIs via the public internet. To enable private connectivity...
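Beyond infrastructure controls, privacy can also be protected at the algorithm level. One common building block, used here as an illustration rather than as the book's SageMaker implementation, is the Laplace mechanism from differential privacy: add noise, calibrated to a query's sensitivity and a privacy budget epsilon, to an aggregate statistic so that no single individual's record can be inferred. A minimal sketch with hypothetical values:

```python
import random

def laplace_mechanism(true_value: float, sensitivity: float,
                      epsilon: float) -> float:
    """Return true_value plus Laplace(0, sensitivity/epsilon) noise.

    The difference of two i.i.d. exponential samples with mean `scale`
    is Laplace-distributed, which avoids inverse-CDF edge cases.
    """
    scale = sensitivity / epsilon
    noise = random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)
    return true_value + noise

# Private count of loan defaults: adding or removing one person changes
# the count by at most 1, so sensitivity = 1. Smaller epsilon means more
# noise and therefore stronger privacy.
true_count = 42.0
private_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5)
print(f"true = {true_count}, private ~= {private_count:.1f}")
```

The released value stays useful in aggregate (its expectation equals the true count), while any single record's influence is masked by the noise.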

Understanding adversarial attacks

Adversarial attacks are a type of attack on ML models that exploit their weaknesses and cause them to make incorrect predictions. Imagine you have an ML model that can accurately identify pictures of animals. An adversarial attack might manipulate the input image of an animal in such a way that the model misidentifies it as a different animal.

These attacks work by making small, often imperceptible changes to the input data that the model is processing. These changes are designed to be undetectable by humans but can cause the model to make large errors in its predictions. Adversarial attacks can be used to undermine the performance of ML models in a variety of settings, including image recognition, speech recognition, and natural language processing (NLP). There are two types of adversarial attack objectives: targeted and untargeted. A targeted objective aims to make the ML system predict a specific class chosen by the attacker, and an untargeted...
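A classic illustration of an untargeted attack is the fast gradient sign method (FGSM): perturb each input feature a small step in the direction that increases the model's loss. The sketch below applies it to a hypothetical logistic-regression classifier with made-up weights (a stand-in for a real image model, where the same idea operates on pixels):

```python
import math

# Toy logistic-regression classifier: p(class 1) = sigmoid(w . x + b).
w = [2.0, -3.0, 1.5]
b = 0.1

def predict(x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(x, y, epsilon):
    """Untargeted FGSM step: the cross-entropy loss gradient w.r.t. the
    input is (p - y) * w, so moving epsilon in the direction of its sign
    increases the loss on the true label y."""
    p = predict(x)
    grad = [(p - y) * wi for wi in w]
    return [xi + epsilon * math.copysign(1.0, gi)
            for xi, gi in zip(x, grad)]

x = [0.5, -0.2, 0.3]            # confidently classified as class 1
print(predict(x))               # probability > 0.5
x_adv = fgsm(x, y=1.0, epsilon=0.6)
print(predict(x_adv))           # pushed below 0.5: prediction flips
```

The epsilon here is exaggerated so the flip is visible on three features; on high-dimensional inputs such as images, a much smaller per-pixel epsilon suffices, which is what makes the perturbation imperceptible.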

Hands-on lab – detecting bias, explaining models, training a privacy-preserving model, and simulating an adversarial attack

Building a comprehensive system for ML governance is a complex initiative. In this hands-on lab, you will learn to use some of SageMaker’s built-in functionalities to support certain aspects of ML governance.

Problem statement

As an ML solutions architect, you have been assigned to identify technology solutions to support a project that has regulatory implications. Specifically, you need to determine the technical approaches for data bias detection, model explainability, and privacy-preserving model training. Follow these steps to get started.

Detecting bias in the training dataset

  1. Launch the SageMaker Studio environment:
    1. Launch the same SageMaker Studio environment that you have been using.
    2. Create a new folder called Chapter13. This will be our working directory for this lab. Create a new Jupyter notebook and...

Summary

This chapter delved deeply into various AI risk topics and techniques, including bias, explainability, privacy, and adversarial attacks. Additionally, you should now be familiar with some of the technology capabilities offered by AWS to facilitate model risk management processes, such as detecting bias and model drift. Through the lab section, you gained hands-on experience using SageMaker to implement bias detection, model explainability, and privacy-preserving model training.

In the next chapter, we will shift our focus to the ML adoption journey and how organizations should think about charting a path to achieve ML maturity.

Join our community on Discord

Join our community’s Discord space for discussions with the author and other readers:

https://packt.link/mlsah
