
You're reading from Official Google Cloud Certified Professional Cloud Security Engineer Exam Guide

Product type: Book
Published in: Aug 2023
Publisher: Packt
ISBN-13: 9781835468869
Edition: 1st
Authors (2):

Ankush Chowdhary

With an unwavering focus on technology spanning over two decades, Ankush remains genuinely dedicated to the ever-evolving realm of cybersecurity. Throughout his career, he has consistently upheld a deep commitment to assisting businesses on their journey towards modernization and embracing the digital age. His guidance has empowered numerous enterprises to prioritize and implement essential cybersecurity measures. He has had the privilege of being invited as a speaker at various global cybersecurity events, where he had the opportunity to share his insights and exert influence on key decision-makers concerning cloud security and policy matters. Driven by an authentic passion for education and mentorship, he derives immense satisfaction from guiding, teaching, and mentoring others within the intricate domain of cybersecurity. The intent behind writing this book has been a modest endeavor to achieve the same purpose.

Prashant Kulkarni

In his career, Prashant has worked directly with customers, helping them overcome different security challenges in various product areas. These experiences have made him passionate about continuous learning, especially in the fast-changing security landscape. Since joining Google four years ago, he has expanded his knowledge of Cloud Security. He is thankful for the support of customers, the infosec community, and his peers, who have sharpened his technical skills and improved his ability to explain complex security concepts in a user-friendly way. This book aims to share his experiences and insights, empowering readers to navigate the ever-evolving security landscape with confidence. In his free time, Prashant indulges in his passion for astronomy, marveling at the vastness and beauty of the universe.


10

Cloud Data Loss Prevention

In this chapter, we will look at Google Cloud's data loss prevention products and capabilities. Data Loss Prevention (DLP) is a strategy for detecting and preventing the exposure and exfiltration of sensitive data. Google's DLP strategy involves a layered approach: in addition to a proper organizational hierarchy, network security, IAM access, and VPC Service Controls (VPC-SC), DLP plays a key role in data protection.

Cloud DLP is widely used in data pipelines, especially for data warehouses. Protecting confidential data is one of the critical aspects of data workloads, so Cloud DLP helps customers gain visibility into sensitive data risks across the organization. We will look at several features of Cloud DLP, how to configure the product for inspection and de-identification, and some best practices. There are a few tutorials in this chapter, so try out the examples to get a solid understanding.

In this chapter, we will cover the following...

Overview of Cloud DLP

Cloud DLP offers some key features for Google’s customers:

  • BigQuery-based data warehouses can be profiled to detect sensitive data, allowing for automated sensitive data discovery. The profiler is flexible: you can scan the entire Google Cloud organization or select specific folders or projects.
  • Over 150 built-in infoType detectors are available in Cloud DLP. Because DLP is API-based, it can be used to swiftly scan, discover, and classify data from anywhere.
  • Built-in support for Cloud Storage, Datastore, and BigQuery: Cloud DLP can inspect data in these storage services directly.
  • You can calculate the level of risk to your data privacy: quasi-identifiers are data elements or combinations of data that can be linked to a single individual or a small group of people. Cloud DLP gives you the option to examine statistical features such as k-anonymity and l-diversity, allowing you to assess the risk of data re-identification...
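To make the risk metric concrete, the following sketch (plain Python, not the Cloud DLP API) computes k-anonymity for a toy table over a chosen set of quasi-identifiers; Cloud DLP's risk analysis jobs compute this kind of statistic for you at scale:

```python
from collections import Counter

def k_anonymity(rows, quasi_identifiers):
    """k is the size of the smallest group of rows that share the same
    quasi-identifier values; a lower k means higher re-identification risk."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())

records = [
    {"zip": "94043", "age_band": "30-40", "diagnosis": "flu"},
    {"zip": "94043", "age_band": "30-40", "diagnosis": "cold"},
    {"zip": "10001", "age_band": "20-30", "diagnosis": "flu"},
]

# The third record's (zip, age_band) pair is unique, so k = 1:
print(k_anonymity(records, ["zip", "age_band"]))  # prints 1
```

A dataset with k = 1 contains at least one individual who is uniquely identifiable from the quasi-identifiers alone, which is exactly the situation de-identification transformations aim to eliminate.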

DLP architecture options

Cloud DLP is typically seen in three architecture patterns: the content (streaming) method, the storage method, and a hybrid architecture that combines the two.

Content methods

In this architecture option, the data is streamed to the Cloud DLP APIs for inspection/classification or de-identification/transformation. A synchronous API response is received from Cloud DLP. In this case, the client application is expected to process the response. This architecture is typically seen in data pipelines or call center applications where real-time response is needed.

Figure 10.1 – Content method architecture

As shown in Figure 10.1, using content inspection, you stream small payloads of data to Cloud DLP along with instructions about what to inspect for. Cloud DLP then inspects the data for sensitive content and personally identifiable information (PII) and returns the results of its scan back to you.
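The payload streamed in Figure 10.1 can be sketched as a plain Python dictionary mirroring the shape of a `content.inspect` request body (field names follow the DLP REST API; the project ID in the comment is a placeholder):

```python
# Shape of a content.inspect request: the item to scan plus instructions
# (inspectConfig) describing what to look for.
inspect_request = {
    "item": {"value": "My email is jane.doe@example.com"},
    "inspectConfig": {
        "infoTypes": [{"name": "EMAIL_ADDRESS"}, {"name": "PHONE_NUMBER"}],
        "minLikelihood": "POSSIBLE",   # drop findings below this likelihood
        "includeQuote": True,          # return the matched text in findings
    },
}

# A client would POST this body to
#   https://dlp.googleapis.com/v2/projects/PROJECT_ID/content:inspect
# and receive the findings synchronously in the response, as described above.
print(sorted(inspect_request.keys()))
```

Because the response is synchronous, the calling application (a data pipeline stage or a call center frontend, for example) is responsible for acting on the findings it gets back.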

Storage methods

In...

Cloud DLP terminology

Before we jump into defining Cloud DLP inspection templates, let us go over some important terminology that you will see in the templates.

DLP infoTypes

Information types, also known as infoTypes, are types of sensitive data that Cloud DLP is preconfigured to scan for and identify: for instance, US Social Security numbers, credit card numbers, phone numbers, zip codes, and names. Both built-in and custom infoTypes are supported by Cloud DLP.

There is a detector for each infoType defined in Cloud DLP. To identify what to look for and how to transform findings, Cloud DLP employs infoType detectors in its scan configuration. When showing or reporting scan findings, infoType names are also used. Cloud DLP releases new infoType detectors and groups regularly. Call the Cloud DLP REST API’s infoTypes.list method to receive the most up-to-date list of built-in infoTypes.

Please keep in mind that the built-in infoType detectors aren’t always reliable...
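Conceptually, an infoType detector pairs a name with matching logic. The toy detector below uses a simple regex for a US SSN-like pattern purely to illustrate the idea; it is not how Cloud DLP works internally, as the built-in detectors combine patterns with context clues, validation, and likelihood scoring:

```python
import re

# Hypothetical stand-in for a detector: an infoType name plus a pattern.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def find_ssn_like(text):
    """Return (infoType name, quoted match, byte offset) for each finding,
    loosely mirroring the shape of a DLP finding."""
    return [("US_SOCIAL_SECURITY_NUMBER", m.group(), m.start())
            for m in SSN_PATTERN.finditer(text)]

findings = find_ssn_like("Call me. SSN on file: 123-45-6789.")
print(findings)
```

The unreliability noted above follows directly from this model: a pattern like the one here would also match a random nine-digit string, which is why likelihood thresholds and context matter when interpreting findings.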

Creating a Cloud DLP inspection template

The first step in using classification capabilities is to create an inspection template. The inspection template will store all the data classification requirements:

  1. In the Cloud console, open Cloud DLP.
  2. From the CREATE menu, choose Template.

Figure 10.4 – Creating a DLP inspection template

  3. Alternatively, click the Create new template button.

This page contains the following sections:

  • Define template
  • Configure detection

Defining the template

Under Define template, enter an identifier for the inspection template. This is how you’ll refer to the template when you run a job, create a job trigger, and so on. You can use letters, numbers, and hyphens. If you want, you can also enter a more human-friendly display name, as well as a description to better remember what the template does.

Configuring detection

Next, you configure what Cloud...

Best practices for inspecting sensitive data

There are several things that you need to consider before starting an inspection. We will go over them now:

  • Identify and prioritize scanning: It’s important to identify your resources and specify which have the highest priority for scanning. When just getting started, you may have a large backlog of data that needs classification, and it’ll be impossible to scan it all immediately. Choose data initially that poses the highest risk—for example, data that is frequently accessed, widely accessible, or unknown.
  • Reduce latency: Latency is affected by several factors: the amount of data to scan, the storage repository being scanned, and the type and number of infoTypes that are enabled. To help reduce job latency, you can try the following:
    • Enable sampling.
    • Avoid enabling infoTypes you don’t need. While useful in certain scenarios, some infoTypes—including PERSON_NAME, FEMALE_NAME, MALE_NAME, FIRST_NAME...
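The sampling and infoType-selection advice above shows up as a handful of fields in a storage inspection job. The sketch below uses plain dicts mirroring the REST API field names (`BigQueryOptions`, `InspectConfig`); the project, dataset, and table identifiers are placeholders:

```python
# Sampling knobs on a BigQuery inspection job (field names follow the
# DLP REST API; identifiers are placeholders, not real resources).
storage_config = {
    "bigQueryOptions": {
        "tableReference": {
            "projectId": "PROJECT_ID",
            "datasetId": "DATASET_ID",
            "tableId": "TABLE_ID",
        },
        "rowsLimitPercent": 10,         # scan only 10% of rows
        "sampleMethod": "RANDOM_START", # sample from a random offset
    }
}

inspect_config = {
    # Only the infoTypes you actually need: fewer detectors, lower latency.
    "infoTypes": [{"name": "EMAIL_ADDRESS"}, {"name": "CREDIT_CARD_NUMBER"}],
    "limits": {"maxFindingsPerRequest": 100},
}
print(storage_config["bigQueryOptions"]["rowsLimitPercent"])
```

Both levers trade completeness for speed and cost: sampling bounds how much data each job reads, and a narrow infoType list bounds how much work is done per byte scanned.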

Inspecting and de-identifying PII data

To de-identify sensitive data, use Cloud DLP’s content.deidentify method.

There are three parts to a de-identification API call:

  • The data to inspect: A string or table structure (ContentItem object) for the API to inspect.
  • What to inspect for: Detection configuration information (InspectConfig) such as what types of data (or infoTypes) to look for, whether to filter findings that are above a certain likelihood threshold, whether to return no more than a certain number of results, and so on. Not specifying at least one infoType in an InspectConfig argument is equivalent to specifying all built-in infoTypes. Doing so is not recommended, as it can cause decreased performance and increased cost.
  • What to do with the inspection findings: Configuration information (DeidentifyConfig) that defines how you want the sensitive data de-identified. This argument is covered in more detail in the following section.
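Putting the three parts together, a `content.deidentify` request body can be sketched as a plain dict (field names follow the DLP REST API; this example masks findings with `*`, which is one of several available transformations):

```python
deidentify_request = {
    # 1. The data to inspect (ContentItem)
    "item": {"value": "Card on file: 4111 1111 1111 1111"},
    # 2. What to inspect for (InspectConfig); at least one infoType is
    #    specified, per the recommendation above
    "inspectConfig": {"infoTypes": [{"name": "CREDIT_CARD_NUMBER"}]},
    # 3. What to do with the findings (DeidentifyConfig): mask with '*'
    "deidentifyConfig": {
        "infoTypeTransformations": {
            "transformations": [{
                "primitiveTransformation": {
                    "characterMaskConfig": {"maskingCharacter": "*"}
                }
            }]
        }
    },
}
print(sorted(deidentify_request.keys()))
```

The three top-level keys correspond one-to-one with the three parts of the call described above, which makes the request shape easy to remember.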

The API returns...

Tutorial: How to de-identify and tokenize sensitive data

Cloud DLP supports both reversible and non-reversible cryptographic methods. In order to re-identify content, you need to choose a reversible method. The cryptographic method described here is called deterministic encryption using Advanced Encryption Standard in Synthetic Initialization Vector mode (AES-SIV). We recommend this among all the reversible cryptographic methods that Cloud DLP supports because it provides the highest level of security.

In this tutorial, we’re going to see how to generate a key to de-identify sensitive text into a cryptographic token. In order to restore (re-identify) that text, you need the cryptographic key that you used during de-identification and the token.

Before you begin, make sure you have the following roles in your Google Cloud project:

  • Service account admin, to be able to create service accounts
  • Service usage admin, to be able to enable services
  • Security admin...
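The key material for deterministic encryption can be prepared with the standard library. The sketch below generates a random 256-bit key and base64-encodes it, the form an unwrapped key takes in DLP's crypto configuration; in production you would wrap the key with Cloud KMS rather than store raw key material (the `surrogateInfoType` name here is an arbitrary example):

```python
import base64
import os

# Generate a random 256-bit (32-byte) key and base64-encode it; the
# unwrapped CryptoKey field expects the key as a base64 string.
raw_key = os.urandom(32)
b64_key = base64.b64encode(raw_key).decode("ascii")

# Sketch of where the key slots into a deterministic-encryption
# transformation (field names follow the DLP REST API):
crypto_config_sketch = {
    "cryptoDeterministicConfig": {
        "cryptoKey": {"unwrapped": {"key": b64_key}},
        "surrogateInfoType": {"name": "TOKEN"},  # labels the output tokens
    }
}
print(len(raw_key), "bytes")  # prints: 32 bytes
```

Keep in mind that this key is what makes the transformation reversible: anyone holding it (plus the token) can re-identify the data, so it must be protected as strictly as the plaintext itself.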

DLP use cases

You have seen how DLP can be used to inspect, de-identify, and re-identify sensitive data in your workloads. Now let us look at various use cases to see where DLP fits:

  • Automatically discover sensitive data: With Cloud DLP, you can automatically understand and manage your data risk across your entire enterprise. Continuous data visibility can assist you in making more informed decisions, managing and reducing data risk, and being compliant. Data profiling is simple to set up on the Cloud console, and there are no jobs or overhead to worry about, so you can focus on the results and your business.
  • Classify data across your enterprise: Cloud DLP can help you categorize your data, whether it’s on or off the cloud, and provide the insights you need to ensure correct governance, management, and compliance. Publish summary findings to other services such as Data Catalog, Security Command Center, Cloud Monitoring, and Pub/Sub or save comprehensive findings to...

Best practices for Cloud DLP

It can be difficult to figure out where Cloud DLP fits in your architecture or to identify requirements for Cloud DLP. Here are some best practices for you to understand how to use Cloud DLP in various scenarios:

  • Use data profiling versus inspection jobs: Data profiling allows you to scan BigQuery tables in a scalable and automated manner without the need for orchestrating jobs. Considering the growth of data and the increasing number of tables, leveraging profiling features is recommended as it takes care of orchestration and running inspection jobs behind the scenes without any overhead. The inspection jobs can complement profilers when deeper investigation scans are needed. For example, if there are around 25,000 tables to be scanned, the recommendation is to scan all the tables with a profiler and then do a deep scan of 500 tables to flag sensitive/unstructured data that needs a more exhaustive investigation.
Figure 10.7 – Decision tree for inspection

Data exfiltration and VPC Service Controls

In the public cloud, there are several threats that organizations need to understand before deploying critical workloads. Here are a few threats that would lead to data exfiltration:

  • Misconfigured IAM policies
  • Malicious insiders copying data to an unauthorized destination
  • Compromised code copying data to an unauthorized destination
  • Access to data from unauthorized clients using a stolen credential

Here are various paths via which data can be exfiltrated in the cloud:

  • Internet <-> service (stolen credentials)
    • Copy to internet
  • Service <-> service (insider threat)
    • Copy from one storage service to another
  • VPC <-> service (compromised VM)
    • Copy to consumer Google services
    • Copy to public GCS buckets/BigQuery dataset/GCR repo

Google Cloud offers strong capabilities to stop the exfiltration of data as part of its data loss prevention portfolio of products. VPC Service Controls extends...

Best practices for VPC Service Controls

Now that you understand the higher-level details of VPC Service Controls perimeters, let us go over some best practices:

  • A single large perimeter is the simplest to implement: it reduces the total number of moving parts and the operational overhead they require, which helps keep your allowlist process from becoming complex.
  • When data sharing is a primary use case for your organization, you can use more than one perimeter. If you produce and share lower-tier data such as de-identified patient health data, you can use a separate perimeter to facilitate sharing with outside entities.
  • When possible, enable all protected services when you create a perimeter, which helps to reduce complexity and reduces potential exfiltration vectors. Make sure that there isn’t a path to the private VIP from any of the VPCs in the perimeter. If you allow a network route to private.googleapis.com, you reduce the VPC Service Controls protection from...

Summary

We have covered a lot in this chapter. We went over various DLP definitions, use cases, and architecture options, and looked in detail at how to use DLP for inspection and de-identification. We also saw examples of how to call the DLP APIs and how to interpret the responses. Last but not least, we covered data exfiltration threats and mitigations such as VPC Service Controls, along with their best practices. In the next chapter, we will go over the features of Secret Manager and how it can be used to store your application secrets securely.

Further reading

For more information on Cloud DLP and VPC Service Controls, refer to the following links:

