Azure Data Factory Cookbook - Second Edition


Product type Book
Published in Feb 2024
Publisher Packt
ISBN-13 9781803246598
Pages 532 pages
Edition 2nd Edition
Authors (4): Dmitry Foshin, Tonya Chernyshova, Dmitry Anoshin, Xenia Ireton

Table of Contents (15 chapters)

  • Preface
  • Getting Started with ADF
  • Orchestration and Control Flow
  • Setting Up Synapse Analytics
  • Working with Data Lake and Spark Pools
  • Working with Big Data and Databricks
  • Data Migration – Azure Data Factory and Other Cloud Services
  • Extending Azure Data Factory with Logic Apps and Azure Functions
  • Microsoft Fabric and Power BI, Azure ML, and Cognitive Services
  • Managing Deployment Processes with Azure DevOps
  • Monitoring and Troubleshooting Data Pipelines
  • Working with Azure Data Explorer
  • The Best Practices of Working with ADF
  • Other Books You May Enjoy
  • Index

Working with Azure Data Explorer

In this chapter, we delve into two key services within the Azure data platform: Azure Data Factory (ADF) and Azure Data Explorer (ADX). ADX is a fast and highly scalable data exploration service for log and telemetry data offered by Microsoft Azure. It provides real-time analysis capabilities, helping organizations rapidly ingest, store, analyze, and visualize vast amounts of data. The use cases for ADX span a range of sectors and applications, from IoT solutions to user behavior analytics and application monitoring, all with the goal of turning raw data into actionable insights.

In the context of ADX, ADF can be used to automate the process of data ingestion from various sources into ADX, perform the necessary transformations, and manage data workflows.

We will cover the following recipes:

  • An introduction to ADX, its architecture, and key features
  • Overview of common use cases for ADX and ADF
  • Setting up a data...

Introduction to ADX

ADX is one of the many managed services (PaaS) offered by Azure. It is a fully managed platform for big data analytics that lets users analyze large data volumes in near real time. With a comprehensive suite of tools, ADX supports end-to-end solutions for data ingestion, querying, visualization, and management.

Since ADX is a cloud product, we don’t need to spend much time deep-diving into the ADX architecture. The following diagram shows the high-level architecture of the product:

Figure 11.1: ADX solution architecture

The ADX platform simplifies the process of extracting critical insights, identifying patterns and trends, and building forecasting models by analyzing structured, semi-structured, and unstructured data, including time series, and by applying machine learning. As a secure, scalable, robust, and enterprise-ready service, ADX is well suited for log analytics...

Creating an ADX cluster and ingesting data

In this recipe, we will create an ADX instance in the Azure portal and ingest the sample Storm Events data using the ADX UI.

Getting ready

In this recipe, we will create an ADX cluster; we will use it in later recipes for Data Factory pipelines.

How to do it...

  1. Go to the Azure portal at https://ms.portal.azure.com/.
  2. Find Azure Data Explorer Clusters.
  3. Click Create.
  4. In the first tab, we need to fill in the Subscription, Resource group, Cluster name, Region, and Workload fields:

    Figure 11.3: Creating an ADX cluster

    Microsoft provides a low-cost workload option for testing and prototyping: Dev/test. Running production workloads in ADX is expensive and can quickly burn through your free credits.

  5. Next, click on Review + create and, after some quick validation, we can click Create.
  6. When the creation is complete, click on Go to Resource and it will open the ADX...
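
Before filling in the first tab, it can help to write the values down and sanity-check them. The following Python sketch is only an illustration: the `ClusterSettings` class and its methods are hypothetical helpers, not part of any Azure SDK. The naming rule (4 to 22 characters, lowercase letters and digits, starting with a letter) and the `https://<cluster>.<region>.kusto.windows.net` address format are assumptions based on how the portal usually behaves, so check them against the portal's own validation messages.

```python
import re
from dataclasses import dataclass

# Assumed naming rule: 4-22 characters, lowercase letters and digits only,
# starting with a letter. The portal's validation is the source of truth.
CLUSTER_NAME_PATTERN = re.compile(r"[a-z][a-z0-9]{3,21}")


@dataclass(frozen=True)
class ClusterSettings:
    """Hypothetical container mirroring the fields on the first tab."""

    subscription: str
    resource_group: str
    cluster_name: str
    region: str
    workload: str = "Dev/test"  # the low-cost option for prototyping

    def validate(self) -> None:
        """Raise ValueError if the cluster name breaks the assumed rule."""
        if not CLUSTER_NAME_PATTERN.fullmatch(self.cluster_name):
            raise ValueError(
                f"Cluster name {self.cluster_name!r} should be 4-22 lowercase "
                "letters or digits and start with a letter"
            )

    def query_uri(self) -> str:
        """Return the cluster address in the assumed URI format."""
        return f"https://{self.cluster_name}.{self.region}.kusto.windows.net"


if __name__ == "__main__":
    # All values below are placeholders, not values used by the recipe.
    settings = ClusterSettings(
        subscription="my-subscription",
        resource_group="adf-cookbook-rg",
        cluster_name="adfcookbookadx",
        region="westeurope",
    )
    settings.validate()
    print(settings.query_uri())
```

The cluster address built here is the kind of value you would later supply when connecting Data Factory to the cluster.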

Orchestrating ADX data with Azure Data Factory

In this recipe, we highlight the orchestration capabilities of ADF in conjunction with ADX. The focus is on the ADX Command activity within ADF, which executes ADX commands directly.

We will begin by querying the Storm Events table in ADX to extract the records where direct deaths exceed 5. The recipe then outlines the steps to ingest this dataset into a new table, DeadlyStorms, within ADX using ADF.
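
To make the idea concrete, here is a minimal Python sketch of what such a command and activity definition might look like. It is not the recipe's exact configuration, which is not shown in this excerpt. It assumes the sample table is named `StormEvents` and has a `DeathsDirect` column, and it uses the `.set-or-append` management command as one possible way to write query results into a table. The function names, the activity name, and the linked service name are hypothetical.

```python
import json


def build_deadly_storms_command(
    source_table: str = "StormEvents",   # assumed name of the sample table
    target_table: str = "DeadlyStorms",
    min_deaths: int = 5,
) -> str:
    """Build a management command that appends matching rows to target_table."""
    return (
        f".set-or-append {target_table} <| "
        f"{source_table} | where DeathsDirect > {min_deaths}"
    )


def build_adx_command_activity(
    name: str, linked_service: str, command: str, timeout: str = "00:20:00"
) -> dict:
    """Build a pipeline activity definition of type AzureDataExplorerCommand."""
    return {
        "name": name,
        "type": "AzureDataExplorerCommand",
        "linkedServiceName": {
            "referenceName": linked_service,  # hypothetical linked service name
            "type": "LinkedServiceReference",
        },
        "typeProperties": {"command": command, "commandTimeout": timeout},
    }


if __name__ == "__main__":
    command = build_deadly_storms_command()
    activity = build_adx_command_activity(
        name="CreateDeadlyStorms",
        linked_service="AzureDataExplorerLS",
        command=command,
    )
    print(json.dumps(activity, indent=2))
```

Keeping the command in a small builder function makes the threshold and table names easy to change without editing the activity definition by hand.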

By the end of this recipe, you will have a clear understanding of how to orchestrate data processes and manage data movement within ADX using Azure Data Factory, enhancing your capability to perform intricate data operations within Azure.

Getting ready

We will continue to use our existing ADX cluster and will need to use the service principal that we created in Chapter 1, Getting Started with ADF – or you can use any other service principal...

Ingesting data from Azure Blob storage to ADX in Azure Data Factory using the Copy activity

In this recipe, we explore the versatility of ADF in its interactions with ADX. ADF can both extract data from and ingest data into ADX, so ADX can act as either a source or a sink in a pipeline. This flexibility keeps data flow and transformation between Azure services efficient.

While there are multiple ways to work with data in this context, our focus in this recipe will be on the Copy activity within ADF. Using this activity, we'll guide you through the process of ingesting data from Azure Blob storage directly into ADX. This method simplifies data migration and makes your data quickly available in ADX for analysis and exploration.
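
As a rough illustration of the shape of such a Copy activity, the Python sketch below builds its JSON definition as a dictionary. It assumes the source file is delimited text (CSV), which this excerpt does not state. The dataset names, the activity name, and the helper function are hypothetical, and the optional ingestion mapping is included only to show where it would go.

```python
import json
from typing import Optional


def build_blob_to_adx_copy_activity(
    name: str,
    source_dataset: str,
    sink_dataset: str,
    mapping_name: Optional[str] = None,
) -> dict:
    """Build a Copy activity definition with Blob storage as source and ADX as sink."""
    sink = {"type": "AzureDataExplorerSink"}
    if mapping_name:
        # Name of an ingestion mapping already defined on the target table.
        sink["ingestionMappingName"] = mapping_name
    return {
        "name": name,
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            # Assumption: the blob is a CSV file exposed via a DelimitedText dataset.
            "source": {"type": "DelimitedTextSource"},
            "sink": sink,
        },
    }


if __name__ == "__main__":
    activity = build_blob_to_adx_copy_activity(
        name="CopyBlobToADX",
        source_dataset="BlobCsvDataset",   # hypothetical dataset name
        sink_dataset="AdxTableDataset",    # hypothetical dataset name
    )
    print(json.dumps(activity, indent=2))
```

The same structure works in the opposite direction by swapping the roles, with an ADX dataset as input and a storage dataset as output, although the source and sink types would differ.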

Getting ready

For this recipe, we need an existing Azure Storage account and an ADX cluster. We will add one more file to the storage account and use it as a source file that we will...
