
You're reading from  Data Engineering with Google Cloud Platform - Second Edition

Product type: Book
Published in: Apr 2024
Publisher: Packt
ISBN-13: 9781835080115
Edition: 2nd Edition
Author (1)
Adi Wijaya

Adi Wijaya is a strategic cloud data engineer at Google. He holds a bachelor's degree in computer science from Binus University and co-founded DataLabs in Indonesia. Currently, he dedicates himself to big data and analytics and has spent a good chunk of his career helping global companies in different industries.

Big Data Capabilities on GCP

When people first dive into Google Cloud Platform (GCP), it’s common for them to find themselves amid a wealth of exciting products and services. Yet, this excitement can occasionally give way to a sense of being overwhelmed.

GCP offers a broad range of services for multiple disciplines, such as application development, microservices, security, AI, and of course, big data. But even for big data products, there are multiple options from which you can choose.

As an analogy, GCP is like a supermarket. A supermarket has everything that you need to support your daily life. For example, if you plan to cook pasta and come to a supermarket to buy the ingredients, no one will tell you what ingredients you should buy; or even if you know which ingredients you want to buy, you will still be offered the ingredients by different brands, price tags, and producers. If you fail to make the right decision, you will end up cooking bad pasta. In GCP, it’...

Technical requirements

In this chapter’s exercise, we will start using the GCP console, Cloud Shell, and Cloud Editor. All these tools can be opened using any internet browser.

To use the GCP console, we need to register using a Google account (Gmail). You’ll also require a payment method. Please check out the available payment methods to make sure you are successfully registered.

Understanding what the cloud is

"Renting someone else’s server": this is my favorite definition of the cloud because it captures, quite simply, what the cloud really is.

It means that so long as you don’t need to buy your own machine to store and process data, you are using the cloud.

But as leading cloud providers such as Google Cloud have gained traction and their technology has matured, the term has increasingly come to represent the sets of architectures, managed services, and highly scalable environments that define how we build solutions.

For data engineering, that means building data products using collections of services and APIs. We trust the underlying infrastructure of the cloud provider one hundred percent.

The difference between the cloud and non-cloud era

If we want to compare the cloud with the non-cloud era from a data engineering perspective, we will find that all the data engineering principles are the same. But from...

Getting started with GCP

OK, let’s get started by taking our first steps and trying out GCP.

Chances are you have already registered or created a project in GCP before reading this book. But in case you haven’t, the following steps are mandatory:

  1. Access the GCP console by going to https://console.cloud.google.com in any browser.
  2. Log in with your Google account (for example, Gmail).
  3. Register for GCP using your Google account.

At this point, I won’t write many step-by-step instructions since it’s a straightforward registration process. Check out https://cloud.google.com/docs/get-started if you have doubts about any step.

A common question at this point is, do I have to pay for this? The answer is no, but you need to register with a payment method.

When you initially register for GCP, you will be asked for a payment method, but you won’t be charged for anything at that point.

When will you be charged?

You will be charged...

A quick overview of GCP services for data engineering

As shown in the GCP console’s navigation bar, there are a lot of services in GCP. These services are not limited to data and analytics – there are other areas, such as application development, machine learning, networks, source repositories, and many more. As a data engineer working on GCP, you will face situations where you need to decide which services to use for your organization.

You might be wondering, who in an organization should decide on the services to use? Is it the CTO, IT manager, solution architect, or data engineer? The answer depends on their experience of using GCP. But most of the time, data engineers need to be involved in the decision.

So, how should we decide? In my experience, there are three important decision factors:

  • Choose serverless services
  • Understand the mapping between the service and the data engineering areas
  • If there is more than one option in one...
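
To make the first two decision factors concrete, here is a minimal Python sketch. It is not taken from the book: the Service class, the CATALOG entries, and the shortlist function are hypothetical names, and the area labels and serverless flags are simplified assumptions for illustration. The third factor is cut off in this preview, so it is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    area: str         # data engineering area the service maps to
    serverless: bool  # True if there are no servers or clusters to manage


# Illustrative catalog only; areas and flags are simplified assumptions.
CATALOG = (
    Service("BigQuery", "data warehouse", serverless=True),
    Service("Dataproc", "data processing", serverless=False),
    Service("Dataflow", "data processing", serverless=True),
)


def shortlist(area: str, catalog: tuple[Service, ...] = CATALOG) -> list[str]:
    """Return the services mapped to an area, serverless options first."""
    candidates = [s for s in catalog if s.area == area]
    # sorted() is stable, so services with the same flag keep catalog order.
    ranked = sorted(candidates, key=lambda s: not s.serverless)
    return [s.name for s in ranked]


if __name__ == "__main__":
    print(shortlist("data processing"))
    print(shortlist("data warehouse"))
```

The sketch only ranks candidates within an area. Which service you finally pick still depends on your organization's needs.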

Summary

We’ve learned a lot of new things in this chapter about the cloud and GCP. We started by accessing the GCP console for the first time. Then, we narrowed things down to find our priority GCP services for data engineers to focus on.

We closed this chapter by familiarizing ourselves with notable features and terminologies, such as quotas and service accounts.

In the next chapter, we will start practicing by developing a data warehouse. As we’ve learned in this chapter, the cloud data warehouse in GCP is BigQuery. We will start by learning about this incredibly famous GCP service, which is also the most important one for GCP data engineers.
