Reader small image

You're reading from  Data Engineering with Google Cloud Platform

Product typeBook
Published inMar 2022
Reading LevelBeginner
PublisherPackt
ISBN-139781800561328
Edition1st Edition
Languages
Right arrow
Author (1)
Adi Wijaya
Adi Wijaya
author image
Adi Wijaya

Adi Widjaja is a strategic cloud data engineer at Google. He holds a bachelor's degree in computer science from Binus University and co-founded DataLabs in Indonesia. Currently, he dedicates himself to big data and analytics and has spent a good chunk of his career helping global companies in different industries.
Read more about Adi Wijaya

Right arrow

Building an ephemeral cluster using Dataproc and Cloud Composer

Another option to manage ephemeral clusters is using Cloud Composer. We learned about Airflow in the previous chapter to orchestrate BigQuery data loading. But as we've already learned, Airflow has many operators and one of them is of course Dataproc. 

You should use this approach compared to a workflow template if your jobs are complex, in terms of developing a pipeline that contains many branches, backfilling logic, and dependencies to other services, since workflow templates can't handle these complexities.

In this section, we will use Airflow to create a Dataproc cluster, submit a pyspark job, and delete the cluster when finished.

Check the full code in the GitHub repository:

Link to be updated

To use the Dataproc operators in Airflow, we need to import the operators, like this:

from airflow.providers.google.cloud.operators.dataproc import (
    DataprocCreateClusterOperator...
lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
Data Engineering with Google Cloud Platform
Published in: Mar 2022Publisher: PacktISBN-13: 9781800561328

Author (1)

author image
Adi Wijaya

Adi Widjaja is a strategic cloud data engineer at Google. He holds a bachelor's degree in computer science from Binus University and co-founded DataLabs in Indonesia. Currently, he dedicates himself to big data and analytics and has spent a good chunk of his career helping global companies in different industries.
Read more about Adi Wijaya