
You're reading from The Self-Taught Cloud Computing Engineer

Product type: Book
Published in: Sep 2023
Publisher: Packt
ISBN-13: 9781805123705
Edition: 1st Edition
Author: Dr. Logan Song

Dr. Logan Song is the enterprise cloud director and chief cloud architect at Dito. With 25+ years of professional experience, Dr. Song is highly skilled in enterprise information technologies, specializing in cloud computing and machine learning. He is a Google Cloud-certified professional solution architect and machine learning engineer, an AWS-certified professional solution architect and machine learning specialist, and a Microsoft-certified Azure solution architect expert. Dr. Song holds a Ph.D. in industrial engineering, an MS in computer science, and an ME in management engineering. Currently, he is also an adjunct professor at the University of Texas at Dallas, teaching cloud computing and machine learning courses.

Google Cloud’s Database and Big Data Services

Data plays a central role in modern industry. Just as petroleum has been a primary energy source for almost all industries, data has become a primary digital asset for modern companies. As more of our lives and more business move online, and our interactions with technology generate ever more data, companies have realized the power of data in making informed business decisions, improving products and services, and ultimately adding business value.

As a data company, Google provides a variety of data services on its cloud platform, spanning data ingestion, storage, processing, and visualization. In this chapter, we will focus on the following topics:

  • Google Cloud’s database services, including Cloud SQL, Cloud Spanner, Cloud Firestore, Cloud Bigtable, and Cloud Memorystore
  • Google Cloud’s big data services, including Cloud Pub/Sub, Cloud...

Google Cloud database services

We have discussed Amazon’s cloud database services, mainly Amazon RDS and DynamoDB. Switching to Google Cloud, we will likewise focus on the relational Cloud SQL service and the NoSQL Cloud Firestore service.

Google Cloud SQL

Google Cloud SQL is a fully managed service for creating, managing, and administering relational databases in Google Cloud. It is a MySQL- and PostgreSQL-compatible database service with key managed features such as automated backups, data replication, flexible pricing, and integration with other GCP services. Cloud SQL also provides security measures such as encryption at rest and in transit and role-based access control. To keep things simple, we will demonstrate how to use GCP Cloud Shell to create a MySQL database, connect to it, and use it:

  1. Launching Google Cloud Shell and creating a Cloud SQL instance:
  2. ...
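
The detailed steps are elided in this excerpt. As a rough sketch only, and not the book's exact commands, the following Python helper assembles the kind of `gcloud sql` invocations that such a Cloud Shell session typically involves: creating an instance, creating a database, and connecting. The instance name `demo-instance`, the database name `demo_db`, the region, and the machine tier are all hypothetical placeholders chosen for illustration. The helper only builds and prints the command lines; it does not run them.

```python
import shlex


def cloud_sql_commands(instance, database,
                       region="us-central1", tier="db-f1-micro"):
    """Build gcloud command lines (as argv lists) for a small MySQL setup.

    All names and defaults here are illustrative assumptions, not values
    taken from the book.
    """
    return [
        # Create a MySQL 8.0 Cloud SQL instance.
        ["gcloud", "sql", "instances", "create", instance,
         "--database-version=MYSQL_8_0",
         f"--tier={tier}", f"--region={region}"],
        # Create a database inside that instance.
        ["gcloud", "sql", "databases", "create", database,
         f"--instance={instance}"],
        # Open an interactive mysql session as the root user.
        ["gcloud", "sql", "connect", instance, "--user=root"],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being pasted
    # into Cloud Shell.
    for argv in cloud_sql_commands("demo-instance", "demo_db"):
        print(shlex.join(argv))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; in Cloud Shell, each printed line could be reviewed and then run by hand.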

Google Cloud’s big data services

Google provides a suite of big data analytics tools and services that let users collect, process, and analyze large amounts of data in the cloud. Key services include the following:

  • Pub/Sub: A cloud messaging service that allows applications to exchange messages reliably, quickly, and asynchronously
  • Dataflow: A fully managed, serverless data processing service that enables users to create data pipelines for real-time and batch processing
  • BigQuery: A fully managed, serverless data warehouse that enables users to store and analyze massive amounts of structured and unstructured data
  • Cloud Dataproc: A managed Hadoop and Spark service that enables users to process large-scale datasets in a scalable and cost-effective manner
  • Looker: A data visualization tool that allows users to create and share interactive dashboards and reports
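
To make the asynchronous messaging idea behind Pub/Sub more concrete, here is a toy in-memory model of a topic with subscriptions. It is a sketch of the general publish/subscribe pattern only, not the Pub/Sub client library: the `Topic` and `Subscription` classes and their methods are hypothetical names invented for this illustration. Each subscription receives its own copy of every published message, and a message that is pulled but not acknowledged can be returned to the queue for redelivery.

```python
from collections import deque


class Subscription:
    """Holds the messages delivered to one subscriber (toy model)."""

    def __init__(self, name):
        self.name = name
        self._queue = deque()    # messages waiting to be pulled
        self._outstanding = {}   # pulled but not yet acknowledged
        self._next_id = 0

    def _enqueue(self, data):
        self._queue.append(data)

    def pull(self):
        """Return (ack_id, data) for the next message, or None if empty."""
        if not self._queue:
            return None
        data = self._queue.popleft()
        ack_id = self._next_id
        self._next_id += 1
        self._outstanding[ack_id] = data
        return ack_id, data

    def ack(self, ack_id):
        """Acknowledge a message so it is not delivered again."""
        self._outstanding.pop(ack_id, None)

    def nack(self, ack_id):
        """Give a message back so a later pull can redeliver it."""
        data = self._outstanding.pop(ack_id, None)
        if data is not None:
            self._queue.append(data)


class Topic:
    """Fans each published message out to every subscription."""

    def __init__(self, name):
        self.name = name
        self.subscriptions = []

    def subscribe(self, name):
        sub = Subscription(name)
        self.subscriptions.append(sub)
        return sub

    def publish(self, data):
        # The publisher returns immediately; subscribers pull later.
        for sub in self.subscriptions:
            sub._enqueue(data)
```

In this model the publisher never waits for a consumer, which is the decoupling that lets a downstream processing job fall behind or restart without losing messages that were already published.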

Google Cloud’s big data services are used for a...

Summary

In this chapter, we learned about Google Cloud’s database and big data services. We explored Google Cloud SQL, Cloud Datastore, Pub/Sub, Dataflow, and BigQuery through two sample business use cases and hands-on labs. You should now have the skills to create and manage databases, ingest and process large-scale datasets, and perform data analysis using Google Cloud’s big data services.

In the next chapter, we will examine Google Cloud’s machine learning services.

Practice questions

Questions 1 to 4 are based on the data pipeline shown in Figure 9.22. The pipeline has the default configurations and the following resources:

  • Pub/Sub topic = t1, and subscription = s1
  • Dataflow job = df1, with a GCS bucket called b1
  • BigQuery dataset = ds1, and table = ds1-table

Figure 9.22 – GCP data pipeline

1. Which of the following is not part of df1’s metrics?

A. Latency

B. CPU

C. Memory

D. Storage

2. What machine types will be used by df1’s workers?

A. n1-standard

B. f1-micro

C. e2-medium

D. g1-small

3. When defining BigQuery table names, what’s your recommendation?

A. Use delimited identifiers

B. Use different versions of SQL

C. It doesn’t matter since you can change the table name on the fly

D. Use something related to the pipeline

4. We need to update df1 without losing any existing data. What’s your recommendation?

A. Update...

Answers to the practice questions

  1. D
  2. A
  3. A
  4. A
  5. A
  6. B
  7. C
  8. A