You're reading from Azure Data Engineering Cookbook

Product type: Book
Published in: Apr 2021
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800206557
Edition: 1st Edition
Author: Ahmad Osama

Ahmad Osama works for Pitney Bowes Pvt. Ltd. as a technical architect and is a former Microsoft Data Platform MVP. In his day job, he develops and maintains high-performance on-premises and cloud SQL Server OLTP environments, and deploys and automates tasks using PowerShell. When not working, Ahmad blogs at DataPlatformLabs and can be found glued to his Xbox.

Transforming data using Scala

In this recipe, we'll mount the Azure Data Lake Storage Gen2 filesystem on DBFS. We'll then read the orders data from the Data Lake and the customer data from an Azure Synapse SQL pool. We'll apply transformations using Scala, analyze the data using SQL, and then insert the aggregated data into an Azure Synapse SQL pool.
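
As a rough orientation, the transform-and-analyze part of this flow can be sketched with the Spark DataFrame API and Spark SQL. This is only a minimal sketch under stated assumptions, not the recipe's actual code: the in-memory rows stand in for the orders files and the customer table, and the column names (invoiceNo, customerId, quantity, unitPrice, country), the view name, and the object name are all hypothetical. Mounting the Data Lake and reading from or writing to the Synapse SQL pool are left out because they depend on workspace credentials.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical sketch: all data, column names, and view names are assumptions.
object OrdersByCountrySketch {

  def salesByCountry(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Stand-in for the orders data that the recipe reads from the Data Lake.
    val orders = Seq(
      ("536365", 17850, 6, 2.55),
      ("536365", 17850, 8, 3.39),
      ("536366", 13047, 2, 7.65),
      ("536367", 12583, 10, 1.25)
    ).toDF("invoiceNo", "customerId", "quantity", "unitPrice")

    // Stand-in for the customer data that the recipe reads from the Synapse SQL pool.
    val customers = Seq(
      (17850, "United Kingdom"),
      (13047, "United Kingdom"),
      (12583, "France")
    ).toDF("customerId", "country")

    // Transformation in Scala: derive a line amount, then join orders to customers.
    val enriched = orders
      .withColumn("amount", col("quantity") * col("unitPrice"))
      .join(customers, Seq("customerId"), "inner")

    // Analysis in SQL: register a temporary view and aggregate per country.
    enriched.createOrReplaceTempView("enriched_orders")
    spark.sql(
      """SELECT country,
        |       COUNT(DISTINCT invoiceNo) AS orderCount,
        |       ROUND(SUM(amount), 2)     AS totalSales
        |FROM enriched_orders
        |GROUP BY country
        |ORDER BY country""".stripMargin)
  }

  def main(args: Array[String]): Unit = {
    // In a Databricks notebook a session already exists and getOrCreate() returns it;
    // the local master setting only matters when running outside Databricks.
    val spark = SparkSession.builder()
      .appName("orders-by-country-sketch")
      .master("local[*]")
      .getOrCreate()

    salesByCountry(spark).show()
  }
}
```

The resulting DataFrame has one row per country, which is the kind of aggregated output the recipe goes on to insert into the Synapse SQL pool.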

Getting ready

To get started, follow these steps:

  1. Log in to https://portal.azure.com using your Azure credentials.
  2. You will need an existing Azure Databricks workspace and at least one Databricks cluster. You can create these by following the Configuring an Azure Databricks environment recipe.

How to do it…

Let's start by provisioning the source and destination data stores. We'll begin by creating an Azure Data Lake Storage Gen2 account and uploading files to it:

  1. Execute the following command to create an Azure Data Lake Storage Gen2 account and upload the necessary files...