Reader small image

You're reading from  Data Engineering with Google Cloud Platform - Second Edition

Product typeBook
Published inApr 2024
PublisherPackt
ISBN-139781835080115
Edition2nd Edition
Right arrow
Author (1)
Adi Wijaya
Adi Wijaya
author image
Adi Wijaya

Adi Widjaja is a strategic cloud data engineer at Google. He holds a bachelor's degree in computer science from Binus University and co-founded DataLabs in Indonesia. Currently, he dedicates himself to big data and analytics and has spent a good chunk of his career helping global companies in different industries.
Read more about Adi Wijaya

Right arrow

Understanding the concept of an ephemeral cluster

After running the previous exercises, you may notice that Spark is very useful for processing data, but it has little to no dependence on HDFS. It’s very convenient to use data as is from GCS or BigQuery compared to using HDFS.

What does this mean? It means that we may choose not to store any data in the Hadoop cluster (more specifically, in HDFS) and only use the cluster to run jobs. For cost efficiency, we can smartly turn on and off the cluster only when a job is running.

Furthermore, we can destroy the entire Hadoop cluster when the job is finished and create a new one when we submit a new job. This concept is what’s called an ephemeral cluster.

An ephemeral cluster means the cluster is not permanent. A cluster will only exist when it’s running jobs. There are two main advantages to using this approach:

  • Highly efficient infrastructure cost: With this approach, you don’t need to have a...
lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
Data Engineering with Google Cloud Platform - Second Edition
Published in: Apr 2024Publisher: PacktISBN-13: 9781835080115

Author (1)

author image
Adi Wijaya

Adi Widjaja is a strategic cloud data engineer at Google. He holds a bachelor's degree in computer science from Binus University and co-founded DataLabs in Indonesia. Currently, he dedicates himself to big data and analytics and has spent a good chunk of his career helping global companies in different industries.
Read more about Adi Wijaya