You're reading from  Machine Learning Engineering on AWS

Product type: Book
Published in: Oct 2022
Publisher: Packt
ISBN-13: 9781803247595
Edition: 1st Edition
Author (1)
Joshua Arvin Lat

Joshua Arvin Lat is the Chief Technology Officer (CTO) of NuWorks Interactive Labs, Inc. He previously served as the CTO for three Australian-owned companies and as director of software development and engineering for multiple e-commerce start-ups in the past. Years ago, he and his team won first place in a global cybersecurity competition with their published research paper. He is also an AWS Machine Learning Hero and has shared his knowledge at several international conferences, discussing practical strategies on machine learning, engineering, security, and management.

Security, Governance, and Compliance Strategies

In the first eight chapters of this book, we focused on getting our machine learning (ML) experiments and deployments working in the cloud. Along the way, we analyzed, cleaned, and transformed several sample datasets using a variety of services. For some of the hands-on examples, we used synthetically generated datasets, which are relatively safe to work with from a security standpoint because they do not contain personally identifiable information (PII). We accomplished a lot in the previous chapters, but getting the data engineering and ML engineering workloads running in our AWS account is just the first step! Once we start working on production-level ML requirements, we also have to address the security, governance, and compliance of the ML systems and processes. To solve these challenges, we have to use a variety of solutions...

Managing the security and compliance of ML environments

Data science teams generally spend a large portion of their time processing data, training ML models, and deploying those models to inference endpoints. Because of the amount of work and research required to meet their primary objectives, these teams often deprioritize any “additional work” concerning security and compliance. After a few months of running production-level ML workloads in the cloud, these teams may end up experiencing a variety of security-related issues for the following reasons:

  • A lack of understanding and awareness of the importance of security, governance, and compliance
  • Poor awareness of the relevant compliance regulations and policies
  • The absence of solid security processes and standards
  • Poor internal tracking and reporting mechanisms

To have a better idea of how to properly manage and handle these issues, we will dive deeper into the following topics in...

Preserving data privacy and model privacy

When dealing with ML and ML engineering requirements, we need to make sure that we protect the training data, along with the parameters of the generated model, from attackers. When given the chance, malicious actors will perform a variety of attacks to extract the parameters of the trained model or even recover the data used to train it. This means that PII may be revealed and stolen. If the model parameters are compromised, the attacker may be able to recreate the model that your company took months or years to develop and run inference on their own infrastructure. Scary, right? Here are a few examples of such attacks:

  • Model inversion attack: The attacker attempts to recover the dataset used to train the model.
  • Model extraction attack: The attacker tries to steal the trained model using the prediction output values.
  • Membership inference attack: The attacker attempts to infer if a record...
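To make the model extraction idea concrete, here is a minimal sketch in Python. It is not taken from the book: it assumes a hypothetical victim that serves a plain linear regression model and returns raw prediction values, and the names `make_victim_endpoint` and `extract_linear_model` are made up for illustration. Under those assumptions, an attacker who can only call the prediction function can recover every parameter with one query per feature plus one extra query.

```python
from typing import Callable, List, Tuple

Predictor = Callable[[List[float]], float]


def make_victim_endpoint(weights: List[float], bias: float) -> Predictor:
    """Hypothetical stand-in for a deployed inference endpoint.

    The caller only sees prediction outputs, never the parameters.
    """
    def predict(x: List[float]) -> float:
        return sum(w * xi for w, xi in zip(weights, x)) + bias

    return predict


def extract_linear_model(predict: Predictor, n_features: int) -> Tuple[List[float], float]:
    """Recover the parameters of a linear model from its predictions alone.

    Querying the all-zeros input reveals the bias; querying each unit
    vector reveals one weight (prediction minus bias).
    """
    origin = [0.0] * n_features
    bias = predict(origin)
    weights = []
    for i in range(n_features):
        unit = origin.copy()
        unit[i] = 1.0
        weights.append(predict(unit) - bias)
    return weights, bias


# Parameters the "company" wants to keep private (illustrative values).
secret_w = [0.5, -1.25, 2.0]
secret_b = 0.75
endpoint = make_victim_endpoint(secret_w, secret_b)

# The attacker only interacts with `endpoint`.
stolen_w, stolen_b = extract_linear_model(endpoint, n_features=3)
print("Recovered weights:", stolen_w)
print("Recovered bias:", stolen_b)
```

Real models are rarely this simple, and attacks on nonlinear models typically train a surrogate model on many query and response pairs instead of solving for parameters directly. The sketch still shows why unrestricted access to prediction outputs is a risk worth controlling.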

Establishing ML governance

When working on ML initiatives and requirements, ML governance must be taken into account as early as possible. Companies and teams with poor governance experience both short-term and long-term issues for the following reasons:

  • The absence of clear and accurate inventory tracking of ML models
  • Limitations concerning model explainability and interpretability
  • The existence of bias in the training data
  • Inconsistencies in the training and inference data distributions
  • The absence of automated experiment lineage tracking processes
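As an illustration of the fourth item, an inconsistency between the training and inference data distributions can be quantified with a drift score. The sketch below is an assumption of ours, not the book's method: it uses the population stability index (PSI) over equal-width bins, and the function names, the bin count, and the small `eps` floor are all illustrative choices.

```python
import math
from typing import List, Sequence


def bin_fractions(values: Sequence[float], lo: float, hi: float,
                  n_bins: int, eps: float = 1e-6) -> List[float]:
    """Fraction of values per equal-width bin over [lo, hi].

    Values outside the range are clamped into the first or last bin.
    Empty bins are floored at `eps` so the logarithm below stays defined.
    """
    if hi <= lo:
        raise ValueError("reference data must span a non-empty range")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), n_bins - 1)
        counts[idx] += 1
    total = len(values)
    return [max(c / total, eps) for c in counts]


def population_stability_index(train: Sequence[float],
                               live: Sequence[float],
                               n_bins: int = 5) -> float:
    """PSI between a training feature and the same feature at inference time."""
    lo, hi = min(train), max(train)
    p = bin_fractions(train, lo, hi, n_bins)
    q = bin_fractions(live, lo, hi, n_bins)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Illustrative data: one feature at training time and two inference samples.
train_feature = [i / 10 for i in range(100)]
live_same = list(train_feature)
live_shifted = [v + 5 for v in train_feature]

psi_same = population_stability_index(train_feature, live_same)
psi_shifted = population_stability_index(train_feature, live_shifted)
print("PSI (no drift):", psi_same)
print("PSI (shifted):", psi_shifted)
```

A score near zero suggests the two samples are distributed alike, while larger scores indicate drift. Any alerting threshold is a judgment call that depends on the feature and the use case.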

How do we deal with these issues and challenges? We can solve and manage these issues by establishing ML governance (the right way) and making sure that the following areas are taken into account:

  • Lineage tracking and reproducibility
  • Model inventory
  • Model validation
  • ML explainability
  • Bias detection
  • Model monitoring
  • Data analysis and data quality reporting
  • Data integrity...
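To give a feel for what bias detection can look like in practice, here is a small sketch that computes two common pre-training bias measures on a labeled dataset: class imbalance (CI) and the difference in proportions of labels (DPL). The formulas follow the usual definitions of these metrics as we understand them; the function names, the 0/1 encoding of the facet and label columns, and the sample data are assumptions made for illustration. A managed service would normally compute such metrics for you.

```python
from typing import Sequence


def class_imbalance(facet: Sequence[int]) -> float:
    """CI = (n_a - n_d) / (n_a + n_d).

    `facet` marks each record as group a (0) or group d (1).
    Values near 0 mean the two groups are equally represented.
    """
    n_a = sum(1 for f in facet if f == 0)
    n_d = sum(1 for f in facet if f == 1)
    if n_a + n_d == 0:
        raise ValueError("facet column is empty")
    return (n_a - n_d) / (n_a + n_d)


def difference_in_proportions_of_labels(facet: Sequence[int],
                                        labels: Sequence[int]) -> float:
    """DPL = q_a - q_d, where q is the share of positive labels in each group."""
    n_a = sum(1 for f in facet if f == 0)
    n_d = sum(1 for f in facet if f == 1)
    if n_a == 0 or n_d == 0:
        raise ValueError("both groups need at least one record")
    pos_a = sum(1 for f, y in zip(facet, labels) if f == 0 and y == 1)
    pos_d = sum(1 for f, y in zip(facet, labels) if f == 1 and y == 1)
    return pos_a / n_a - pos_d / n_d


# Illustrative dataset: six records in group a, four in group d.
facet_column = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
label_column = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0]

ci = class_imbalance(facet_column)
dpl = difference_in_proportions_of_labels(facet_column, label_column)
print("Class imbalance:", ci)
print("Difference in proportions of labels:", dpl)
```

In this sample, group a is both larger and more likely to carry a positive label, so both metrics come out positive. Values close to zero on both would suggest a more balanced training set.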

Summary

In this chapter, we discussed a variety of strategies and solutions to manage the overall security, compliance, and governance of ML environments and systems. We started by going through several best practices to improve the security and compliance of ML environments. After that, we discussed relevant techniques on how to preserve data privacy and model privacy. Toward the end of this chapter, we covered different solutions using a variety of AWS services to establish ML governance.

In the next chapter, we will provide a quick introduction to MLOps pipelines and then dive deep into automating ML workflows in AWS using Kubeflow Pipelines.

Further reading

For more information on the topics that were covered in this chapter, feel free to check out the following resources:
