You're reading from Azure Data Engineering Cookbook

Product type: Book
Published in: Apr 2021
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800206557
Edition: 1st Edition
Author (1)
Ahmad Osama

Ahmad Osama works for Pitney Bowes Pvt. Ltd. as a technical architect and is a former Microsoft Data Platform MVP. In his day job, he develops and maintains high-performance on-premises and cloud SQL Server OLTP environments, and deploys and automates tasks using PowerShell. When not working, Ahmad blogs at DataPlatformLabs and can be found glued to his Xbox.

Transforming data using Python

Data transformation at scale is one of the most important uses of Azure Databricks. In this recipe, we'll read product orders from an Azure storage account, read customer information from an Azure SQL Database, join the orders to the customer information, filter the joined data, aggregate the order totals by country and customer, and then insert the output into an Azure SQL Database.
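
Before setting anything up, it may help to see the shape of the transformation on its own. The following is a minimal plain-Python sketch of the join, filter, and aggregate logic, not the recipe's Databricks code. The column names (`customer_id`, `quantity`, `unit_price`, `country`, `name`), the sample rows, the zero-quantity filter, and the function name `total_by_country_and_customer` are all assumptions made for illustration; the recipe's actual schemas and filter conditions are not shown in this excerpt.

```python
from collections import defaultdict

# Hypothetical sample rows; the recipe's real schemas are not shown here.
orders = [
    {"order_id": 1, "customer_id": 10, "quantity": 2, "unit_price": 5.0},
    {"order_id": 2, "customer_id": 11, "quantity": 1, "unit_price": 20.0},
    {"order_id": 3, "customer_id": 10, "quantity": 3, "unit_price": 4.0},
    {"order_id": 4, "customer_id": 12, "quantity": 0, "unit_price": 9.0},
]
customers = [
    {"customer_id": 10, "name": "Asha", "country": "India"},
    {"customer_id": 11, "name": "Ben", "country": "UK"},
]


def total_by_country_and_customer(orders, customers):
    """Inner-join orders to customers, filter, and sum totals per (country, name)."""
    # Index customers by key so each order can look up its match directly.
    customers_by_id = {c["customer_id"]: c for c in customers}
    totals = defaultdict(float)
    for order in orders:
        customer = customers_by_id.get(order["customer_id"])
        if customer is None:
            # Inner join: skip orders with no matching customer.
            continue
        if order["quantity"] <= 0:
            # Assumed filter: ignore orders with no quantity.
            continue
        key = (customer["country"], customer["name"])
        totals[key] += order["quantity"] * order["unit_price"]
    return dict(totals)


result = total_by_country_and_customer(orders, customers)
for (country, name), total in sorted(result.items()):
    print(country, name, total)
```

On a Databricks cluster the same three steps would normally be expressed as Spark DataFrame operations so that they run in parallel across the cluster; the dictionary-based version above only illustrates the logic on a handful of rows.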

Getting ready

To get started, follow these steps:

  1. Log in to https://portal.azure.com using your Azure credentials.
  2. You will need an existing Azure Databricks workspace and at least one Databricks cluster. You can create these by following the "Configuring an Azure Databricks environment" recipe.

How to do it…

Let's get started by creating an Azure storage account and an Azure SQL database:

  1. Execute the following command to create an Azure Storage account and upload the orders files to the orders/datain container:
    ....