
You're reading from Official Google Cloud Certified Professional Cloud Security Engineer Exam Guide

Product type: Book
Published in: Aug 2023
Publisher: Packt
ISBN-13: 9781835468869
Edition: 1st
Authors (2):

Ankush Chowdhary

With an unwavering focus on technology spanning over two decades, Ankush remains genuinely dedicated to the ever-evolving realm of cybersecurity. Throughout his career, he has consistently upheld a deep commitment to assisting businesses on their journey towards modernization and embracing the digital age. His guidance has empowered numerous enterprises to prioritize and implement essential cybersecurity measures. He has had the privilege of being invited as a speaker at various global cybersecurity events, where he had the opportunity to share his insights and exert influence on key decision-makers concerning cloud security and policy matters. Driven by an authentic passion for education and mentorship, he derives immense satisfaction from guiding, teaching, and mentoring others within the intricate domain of cybersecurity. The intent behind writing this book has been a modest endeavor to achieve the same purpose.

Prashant Kulkarni

In his career, Prashant has worked directly with customers, helping them overcome different security challenges in various product areas. These experiences have made him passionate about continuous learning, especially in the fast-changing security landscape. Since joining Google four years ago, he has expanded his knowledge of Cloud Security. He is thankful for the support of customers, the infosec community, and his peers, who have sharpened his technical skills and improved his ability to explain complex security concepts in a user-friendly way. This book aims to share his experiences and insights, empowering readers to navigate the ever-evolving security landscape with confidence. In his free time, Prashant indulges in his passion for astronomy, marveling at the vastness and beauty of the universe.


10

Cloud Data Loss Prevention

In this chapter, we will look at Google Cloud's data loss prevention products and capabilities. Data Loss Prevention (DLP) is a strategy for detecting and preventing the exposure and exfiltration of sensitive data. Google's DLP strategy involves a layered approach: in addition to a proper organizational hierarchy, network security, IAM access, and VPC Service Controls (VPC-SC), DLP plays a key role in data protection.

Cloud DLP is widely used in data pipelines, especially for data warehouses. Protecting confidential data is one of the critical aspects of data workloads, so Cloud DLP helps customers gain visibility into sensitive data risks across the organization. We will look at several features of Cloud DLP, how to configure the product for inspection and de-identification, and some best practices. There are a few tutorials in this chapter, so try out the examples to get a solid understanding.

In this chapter, we will cover the following...

Overview of Cloud DLP

Cloud DLP offers some key features for Google’s customers:

  • BigQuery-based data warehouses can be profiled to detect sensitive data, allowing for automated sensitive data discovery. The profiler is flexible: you can scan the entire Google Cloud organization or select specific folders or projects.
  • Over 150 built-in infoType detectors are available in Cloud DLP. Because DLP is API-based, it can be used to swiftly scan, discover, and classify data from anywhere.
  • Built-in support for Cloud Storage, Datastore, and BigQuery: Cloud DLP can inspect data in these storage services directly.
  • You can calculate the level of risk to your data privacy: quasi-identifiers are data elements or combinations of data that can be linked to a single individual or a small group of people. Cloud DLP gives you the option to examine statistical features such as k-anonymity and l-diversity, allowing you to assess the risk of data re-identification...
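To make the risk metric concrete, the following sketch (plain Python, not the Cloud DLP API) computes k-anonymity for a toy table over a chosen set of quasi-identifiers; Cloud DLP's risk analysis jobs compute this kind of statistic for you at scale:

```python
from collections import Counter

def k_anonymity(rows, quasi_identifiers):
    """k is the size of the smallest group of rows that share the same
    quasi-identifier values; a lower k means higher re-identification risk."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())

records = [
    {"zip": "94043", "age_band": "30-40", "diagnosis": "flu"},
    {"zip": "94043", "age_band": "30-40", "diagnosis": "cold"},
    {"zip": "10001", "age_band": "20-30", "diagnosis": "flu"},
]

# The third record's (zip, age_band) pair is unique, so k = 1:
print(k_anonymity(records, ["zip", "age_band"]))  # prints 1
```

A dataset with k = 1 contains at least one individual who is uniquely identifiable from the quasi-identifiers alone, which is exactly the situation de-identification transformations aim to eliminate.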

DLP architecture options

Cloud DLP is typically seen in three architecture patterns: the content (streaming) method, the storage method, and a hybrid architecture that combines the two.

Content methods

In this architecture option, the data is streamed to the Cloud DLP APIs for inspection/classification or de-identification/transformation. A synchronous API response is received from Cloud DLP. In this case, the client application is expected to process the response. This architecture is typically seen in data pipelines or call center applications where real-time response is needed.

Figure 10.1 – Content method architecture

As shown in Figure 10.1, using content inspection, you stream small payloads of data to Cloud DLP along with instructions about what to inspect for. Cloud DLP then inspects the data for sensitive content and personally identifiable information (PII) and returns the results of its scan back to you.
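The payload streamed in Figure 10.1 can be sketched as a plain Python dictionary mirroring the shape of a `content.inspect` request body (field names follow the DLP REST API; the project ID in the comment is a placeholder):

```python
# Shape of a content.inspect request: the item to scan plus instructions
# (inspectConfig) describing what to look for.
inspect_request = {
    "item": {"value": "My email is jane.doe@example.com"},
    "inspectConfig": {
        "infoTypes": [{"name": "EMAIL_ADDRESS"}, {"name": "PHONE_NUMBER"}],
        "minLikelihood": "POSSIBLE",   # drop findings below this likelihood
        "includeQuote": True,          # return the matched text in findings
    },
}

# A client would POST this body to
#   https://dlp.googleapis.com/v2/projects/PROJECT_ID/content:inspect
# and receive the findings synchronously in the response, as described above.
print(sorted(inspect_request.keys()))
```

Because the response is synchronous, the calling application (a data pipeline stage or a call center frontend, for example) is responsible for acting on the findings it gets back.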

Storage methods

In...

Cloud DLP terminology

Before we jump into defining Cloud DLP inspection templates, let us go over some important terminology that you will see in the templates.

DLP infoTypes

Information types, also known as infoTypes, are types of sensitive data that Cloud DLP is preconfigured to scan for and identify: for instance, US Social Security numbers, credit card numbers, phone numbers, zip codes, and names. Both built-in and custom infoTypes are supported by Cloud DLP.

There is a detector for each infoType defined in Cloud DLP. To identify what to look for and how to transform findings, Cloud DLP employs infoType detectors in its scan configuration. When showing or reporting scan findings, infoType names are also used. Cloud DLP releases new infoType detectors and groups regularly. Call the Cloud DLP REST API’s infoTypes.list method to receive the most up-to-date list of built-in infoTypes.

Please keep in mind that the built-in infoType detectors aren’t always reliable...
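Conceptually, an infoType detector pairs a name with matching logic. The toy detector below uses a simple regex for a US SSN-like pattern purely to illustrate the idea; it is not how Cloud DLP works internally, as the built-in detectors combine patterns with context clues, validation, and likelihood scoring:

```python
import re

# Hypothetical stand-in for a detector: an infoType name plus a pattern.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def find_ssn_like(text):
    """Return (infoType name, quoted match, byte offset) for each finding,
    loosely mirroring the shape of a DLP finding."""
    return [("US_SOCIAL_SECURITY_NUMBER", m.group(), m.start())
            for m in SSN_PATTERN.finditer(text)]

findings = find_ssn_like("Call me. SSN on file: 123-45-6789.")
print(findings)
```

The unreliability noted above follows directly from this model: a pattern like the one here would also match a random nine-digit string, which is why likelihood thresholds and context matter when interpreting findings.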

Creating a Cloud DLP inspection template

The first step in using classification capabilities is to create an inspection template. The inspection template will store all the data classification requirements:

  1. In the Cloud console, open Cloud DLP.
  2. From the CREATE menu, choose Template.

Figure 10.4 – Creating a DLP inspection template

  3. Alternatively, click the Create new template button.

This page contains the following sections:

  • Define template
  • Configure detection

Defining the template

Under Define template, enter an identifier for the inspection template. This is how you’ll refer to the template when you run a job, create a job trigger, and so on. You can use letters, numbers, and hyphens. If you want, you can also enter a more human-friendly display name, as well as a description to better remember what the template does.

Configuring detection

Next, you configure what Cloud...

Best practices for inspecting sensitive data

There are several things that you need to consider before starting an inspection. We will go over them now:

  • Identify and prioritize scanning: It’s important to identify your resources and specify which have the highest priority for scanning. When just getting started, you may have a large backlog of data that needs classification, and it’ll be impossible to scan it all immediately. Choose data initially that poses the highest risk—for example, data that is frequently accessed, widely accessible, or unknown.
  • Reduce latency: Latency is affected by several factors: the amount of data to scan, the storage repository being scanned, and the type and number of infoTypes that are enabled. To help reduce job latency, you can try the following:
    • Enable sampling.
    • Avoid enabling infoTypes you don’t need. While useful in certain scenarios, some infoTypes—including PERSON_NAME, FEMALE_NAME, MALE_NAME, FIRST_NAME...
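The sampling and infoType-selection advice above shows up as a handful of fields in a storage inspection job. The sketch below uses plain dicts mirroring the REST API field names (`BigQueryOptions`, `InspectConfig`); the project, dataset, and table identifiers are placeholders:

```python
# Sampling knobs on a BigQuery inspection job (field names follow the
# DLP REST API; identifiers are placeholders, not real resources).
storage_config = {
    "bigQueryOptions": {
        "tableReference": {
            "projectId": "PROJECT_ID",
            "datasetId": "DATASET_ID",
            "tableId": "TABLE_ID",
        },
        "rowsLimitPercent": 10,         # scan only 10% of rows
        "sampleMethod": "RANDOM_START", # sample from a random offset
    }
}

inspect_config = {
    # Only the infoTypes you actually need: fewer detectors, lower latency.
    "infoTypes": [{"name": "EMAIL_ADDRESS"}, {"name": "CREDIT_CARD_NUMBER"}],
    "limits": {"maxFindingsPerRequest": 100},
}
print(storage_config["bigQueryOptions"]["rowsLimitPercent"])
```

Both levers trade completeness for speed and cost: sampling bounds how much data each job reads, and a narrow infoType list bounds how much work is done per byte scanned.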

Inspecting and de-identifying PII data

To de-identify sensitive data, use Cloud DLP’s content.deidentify method.

There are three parts to a de-identification API call:

  • The data to inspect: A string or table structure (ContentItem object) for the API to inspect.
  • What to inspect for: Detection configuration information (InspectConfig) such as what types of data (or infoTypes) to look for, whether to filter findings that are above a certain likelihood threshold, whether to return no more than a certain number of results, and so on. Not specifying at least one infoType in an InspectConfig argument is equivalent to specifying all built-in infoTypes. Doing so is not recommended, as it can cause decreased performance and increased cost.
  • What to do with the inspection findings: Configuration information (DeidentifyConfig) that defines how you want the sensitive data de-identified. This argument is covered in more detail in the following section.
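Putting the three parts together, a `content.deidentify` request body can be sketched as a plain dict (field names follow the DLP REST API; this example masks findings with `*`, which is one of several available transformations):

```python
deidentify_request = {
    # 1. The data to inspect (ContentItem)
    "item": {"value": "Card on file: 4111 1111 1111 1111"},
    # 2. What to inspect for (InspectConfig); at least one infoType is
    #    specified, per the recommendation above
    "inspectConfig": {"infoTypes": [{"name": "CREDIT_CARD_NUMBER"}]},
    # 3. What to do with the findings (DeidentifyConfig): mask with '*'
    "deidentifyConfig": {
        "infoTypeTransformations": {
            "transformations": [{
                "primitiveTransformation": {
                    "characterMaskConfig": {"maskingCharacter": "*"}
                }
            }]
        }
    },
}
print(sorted(deidentify_request.keys()))
```

The three top-level keys correspond one-to-one with the three parts of the call described above, which makes the request shape easy to remember.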

The API returns...

Tutorial: How to de-identify and tokenize sensitive data

Cloud DLP supports both reversible and non-reversible cryptographic methods. In order to re-identify content, you need to choose a reversible method. The cryptographic method described here is called deterministic encryption using Advanced Encryption Standard in Synthetic Initialization Vector mode (AES-SIV). We recommend this among all the reversible cryptographic methods that Cloud DLP supports because it provides the highest level of security.

In this tutorial, we’re going to see how to generate a key to de-identify sensitive text into a cryptographic token. In order to restore (re-identify) that text, you need the cryptographic key that you used during de-identification and the token.

Before you begin, make sure you have the following roles in your Google Cloud project:

  • Service account admin, to be able to create service accounts
  • Service usage admin, to be able to enable services
  • Security admin...
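The key material for deterministic encryption can be prepared with the standard library. The sketch below generates a random 256-bit key and base64-encodes it, the form an unwrapped key takes in DLP's crypto configuration; in production you would wrap the key with Cloud KMS rather than store raw key material (the `surrogateInfoType` name here is an arbitrary example):

```python
import base64
import os

# Generate a random 256-bit (32-byte) key and base64-encode it; the
# unwrapped CryptoKey field expects the key as a base64 string.
raw_key = os.urandom(32)
b64_key = base64.b64encode(raw_key).decode("ascii")

# Sketch of where the key slots into a deterministic-encryption
# transformation (field names follow the DLP REST API):
crypto_config_sketch = {
    "cryptoDeterministicConfig": {
        "cryptoKey": {"unwrapped": {"key": b64_key}},
        "surrogateInfoType": {"name": "TOKEN"},  # labels the output tokens
    }
}
print(len(raw_key), "bytes")  # prints: 32 bytes
```

Keep in mind that this key is what makes the transformation reversible: anyone holding it (plus the token) can re-identify the data, so it must be protected as strictly as the plaintext itself.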

DLP use cases

You have seen how DLP can be used to inspect, de-identify, and re-identify sensitive data in your workloads. Now let us look at various use cases to see where DLP fits:

  • Automatically discover sensitive data: With Cloud DLP, you can automatically understand and manage your data risk across your entire enterprise. Continuous data visibility can assist you in making more informed decisions, managing and reducing data risk, and being compliant. Data profiling is simple to set up on the Cloud console, and there are no jobs or overhead to worry about, so you can focus on the results and your business.
  • Classify data across your enterprise: Cloud DLP can help you categorize your data, whether it’s on or off the cloud, and provide the insights you need to ensure correct governance, management, and compliance. Publish summary findings to other services such as Data Catalog, Security Command Center, Cloud Monitoring, and Pub/Sub or save comprehensive findings to...

Best practices for Cloud DLP

It can be difficult to figure out where Cloud DLP fits in your architecture or to identify requirements for Cloud DLP. Here are some best practices for you to understand how to use Cloud DLP in various scenarios:

  • Use data profiling versus inspection jobs: Data profiling allows you to scan BigQuery tables in a scalable and automated manner without the need for orchestrating jobs. Considering the growth of data and the increasing number of tables, leveraging profiling features is recommended as it takes care of orchestration and running inspection jobs behind the scenes without any overhead. The inspection jobs can complement profilers when deeper investigation scans are needed. For example, if there are around 25,000 tables to be scanned, the recommendation is to scan all the tables with a profiler and then do a deep scan of 500 tables to flag sensitive/unstructured data that needs a more exhaustive investigation.
Figure 10.7 – Decision tree for inspection

Data exfiltration and VPC Service Controls

In the public cloud, there are several threats that organizations need to understand before deploying critical workloads. Here are a few threats that would lead to data exfiltration:

  • Misconfigured IAM policies
  • Malicious insiders copying data to an unauthorized destination
  • Compromised code copying data to an unauthorized destination
  • Access to data from unauthorized clients using a stolen credential

Here are various paths via which data can be exfiltrated in the cloud:

  • Internet <-> service (stolen credentials)
    • Copy to internet
  • Service <-> service (insider threat)
    • Copy from one storage service to another
  • VPC <-> service (compromised VM)
    • Copy to consumer Google services
    • Copy to public GCS buckets/BigQuery dataset/GCR repo

Google Cloud offers strong capabilities to stop the exfiltration of data as part of its data loss prevention portfolio of products. VPC Service Controls extends...

Best practices for VPC Service Controls

Now that you understand the higher-level details of VPC Service Controls perimeters, let us go over some best practices:

  • A single large perimeter is the simplest to implement: it reduces the total number of moving parts and the operational overhead they require, which helps keep your allowlist process from becoming complex.
  • When data sharing is a primary use case for your organization, you can use more than one perimeter. If you produce and share lower-tier data such as de-identified patient health data, you can use a separate perimeter to facilitate sharing with outside entities.
  • When possible, enable all protected services when you create a perimeter, which helps to reduce complexity and reduces potential exfiltration vectors. Make sure that there isn’t a path to the private VIP from any of the VPCs in the perimeter. If you allow a network route to private.googleapis.com, you reduce the VPC Service Controls protection from...

Summary

We have covered a lot in this chapter. We went over various DLP definitions, use cases, and architecture options, and looked in detail at how to use DLP for inspection and de-identification. We also saw examples of how to call the DLP APIs and how to interpret the responses. Last but not least, we covered data exfiltration threats and mitigations such as VPC Service Controls, along with their best practices. In the next chapter, we will go over the features of Secret Manager and how it can be used to store your application secrets securely.

Further reading

For more information on Cloud DLP and VPC Service Controls, refer to the following links:

