You're reading from  Limitless Analytics with Azure Synapse

Product type: Book
Published in: Jun 2021
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800205659
Edition: 1st Edition
Author (1)
Prashant Kumar Mishra

Prashant Kumar Mishra is an engineering architect at Microsoft. He has more than 10 years of professional expertise in the Microsoft data and AI segment as a developer, consultant, and architect. He has been focused on Microsoft Azure Cloud technologies for several years now and has helped various customers in their data journey. He prefers to share his knowledge with others to make the data community stronger day by day through his blogs and meetup groups.

Chapter 9: Perform Real-Time Analytics on Streaming Data

Azure Synapse has various built-in features that allow us to perform end-to-end analysis on our data. One of the best features is the integration of Azure Synapse with Azure Cosmos DB via Azure Synapse Link. It removes the pain of bringing data from transactional data stores to analytical data stores using an ETL tool. You can read more about this in Chapter 5, Using Synapse Link with Azure Cosmos DB. In this chapter, we are going to use this feature to learn how to perform real-time analytics on streaming data in Azure Synapse. We are also going to learn how to use Azure Stream Analytics jobs to copy streaming data from Event Hubs to Azure Data Lake Storage Gen2. There is also a brief section in this chapter on Azure Databricks. We will create a simple C# application to generate streaming data that will be ingested in a Cosmos DB transactional store, and finally, we will access this data in Synapse through the analytical store...

Technical requirements

Before you start, make sure you meet the following prerequisites:

  • You should have access to your Azure subscription or any other subscription with contributor-level access.
  • Create a Synapse workspace using your subscription. You can follow the instructions from Chapter 1, Introduction to Azure Synapse, to create your Synapse workspace.
  • Create a Spark pool and SQL pool on Azure Synapse. This has been covered in Chapter 2, Considerations for Your Compute Environment.
  • You should have already created your Azure Cosmos DB account; make sure you have enabled the analytical store on that account. To learn more about this, you can refer to Chapter 5, Using Synapse Link with Azure Cosmos DB.
  • Download Power BI Desktop to your machine and make sure you have access to the Power BI workspace, where you can publish your Power BI Desktop file.

Understanding the architecture and components

Azure provides various data services that can be used to perform real-time analytics in different ways. In this section, we will learn about two different architectures and how different components are stitched together in both of these architectures to deliver the end result.

There are various use cases for real-time analytics, including the following:

  • Anomaly detection: This technique is used to identify unusual behavior or patterns that raise suspicion because they differ significantly from the rest of the data.
  • Supply chain analytics: This process is used to increase operational effectiveness by using data and quantitative methods for decision making.
  • Real-time personalization: This technique is used to gather information about the user visiting your website and engage that user by providing tailored content on the website based on their company, location, digital behavior, and so on.
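
To make the first use case concrete, here is a minimal sketch of anomaly detection over a stream of readings. It is plain Python and only an illustration: the function name detect_anomalies, the window size, the threshold, and the sample readings are all assumptions, and the rolling z-score rule is one simple technique among many rather than anything prescribed by Azure Synapse.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean."""
    recent = deque(maxlen=window)  # sliding window of the latest values
    for index, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            # Flag values more than `threshold` standard deviations away.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield index, value
        recent.append(value)


# Hypothetical sensor readings with one obvious spike.
sample = [20.0, 20.5, 19.8, 20.2, 20.1, 55.0, 20.3, 19.9]
anomalies = list(detect_anomalies(sample))
```

Because the function is a generator, it can consume readings one at a time as they arrive instead of waiting for a complete dataset.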

The architecture...

Bringing data to Azure Synapse

In the Understanding the architecture and components section, we saw how architectures use different Azure services to perform real-time analytics on Azure. In this section, our main focus is to bring data from all data sources to Azure Synapse. We are going to learn about bringing data to Azure Synapse by using Azure Stream Analytics jobs, and later we will see how we can use Azure Databricks to copy data to Azure Synapse.

Using Azure Stream Analytics

Azure Stream Analytics is a real-time analytics engine that is designed to process a large volume of streaming data from various sources to various targets. Within Azure Stream Analytics, you can create an Azure Stream Analytics job that consists of an input, a query, and an output. You can use the Stream Analytics job to ingest data directly from the source to the target as is, or you can perform certain aggregation operations on the input data before sending it to the target.
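
The input, query, and output shape of a job can be sketched in plain Python. This is not how Stream Analytics is programmed (the service uses its own SQL-like query language); the sketch only mimics the structure so the three parts are easy to see. The event fields, the function names, and the 10-second window are assumptions made for illustration.

```python
from collections import defaultdict


def tumbling_window_average(events, window_seconds=10):
    """Group events into fixed, non-overlapping windows and average per device."""
    buckets = defaultdict(list)
    for event in events:
        # Integer division assigns each event to exactly one window.
        window_start = (event["timestamp"] // window_seconds) * window_seconds
        buckets[(window_start, event["device_id"])].append(event["temperature"])
    return [
        {"window_start": start, "device_id": device,
         "avg_temperature": sum(values) / len(values)}
        for (start, device), values in sorted(buckets.items())
    ]


def run_job(source, query, sink):
    """Wire an input, a query, and an output together, as a job definition does."""
    for row in query(source):
        sink.append(row)


# Hypothetical input events (timestamps are seconds since the stream started).
events = [
    {"timestamp": 1, "device_id": "d1", "temperature": 20.0},
    {"timestamp": 4, "device_id": "d1", "temperature": 22.0},
    {"timestamp": 12, "device_id": "d1", "temperature": 30.0},
    {"timestamp": 3, "device_id": "d2", "temperature": 10.0},
]

output = []  # stands in for the target, for example a storage container
run_job(events, tumbling_window_average, output)
```

Passing a query that returns its input unchanged, such as lambda rows: rows, corresponds to the "as is" case described above, while tumbling_window_average corresponds to aggregating before the data reaches the target.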

These are the...

Implementation of real-time analytics on streaming data

In this section, we are going to learn about a step-by-step process for implementing real-time analytics using Azure Synapse. We are taking Figure 9.1 as our reference architecture. There are various stages involved in implementing this architecture, and we will go through all these steps in this section. We will learn how to configure all the required resources according to your environment.

Before jumping to the analytics part, we will learn how to ingest data to an Azure Cosmos DB account.

Ingesting data to Cosmos DB

There are various ways to ingest streaming data to Azure Cosmos DB; however, in this section, we are going to use an online application sample to ingest streaming data to the Azure Cosmos DB account.
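
If you would like to see what such a generator does before running the sample, the following Python sketch produces synthetic JSON documents and hands them to a writer function. It is not the chapter's sample application; the field names, the choice of deviceId as a partition key, and the helper names generate_events and stream_to are all hypothetical.

```python
import json
import random
import uuid
from datetime import datetime, timezone


def generate_events(count, device_ids=("d1", "d2", "d3"), seed=None):
    """Yield `count` synthetic telemetry documents shaped for a document store."""
    rng = random.Random(seed)
    for _ in range(count):
        yield {
            "id": str(uuid.uuid4()),             # unique string identifier for the item
            "deviceId": rng.choice(device_ids),  # assumed partition key
            "temperature": round(rng.uniform(15.0, 35.0), 2),
            "eventTime": datetime.now(timezone.utc).isoformat(),
        }


def stream_to(write_item, events):
    """Send each event to `write_item` and return how many were written."""
    written = 0
    for event in events:
        write_item(event)
        written += 1
    return written


# Stand-in sink: collect JSON strings in memory instead of calling Azure.
sent = []
total = stream_to(lambda doc: sent.append(json.dumps(doc)), generate_events(5, seed=42))
```

The writer is passed in as a function so the generator can be tried without an Azure connection. With the azure-cosmos Python package, a container client's upsert_item method could play the role of write_item; that wiring depends on your own account and is not shown here.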

Follow the instructions to start streaming the data to your Cosmos DB account:

  1. Go to your Azure Cosmos DB account in the Azure portal and click on the Data Explorer tab.
  2. Click on the New...

Summary

In this chapter, we learned how to perform real-time analytics using Azure Synapse. We learned how to bring data to Azure Synapse by using Azure Stream Analytics and Azure Databricks. We also learned how to create a view in a serverless SQL pool and how to use this view to connect to Power BI Desktop for data visualizations. We used a sample application in this chapter to stream data to an Azure Cosmos DB account by using a sample JSON file. You can download and use this application if you want to perform a proof of concept on this particular topic yourself.

In the next chapter, we are going to learn how to use Azure Machine Learning with Azure Synapse. It is important to have prior knowledge of Azure Machine Learning before plunging into the next chapter. You will also learn how to use machine learning models with Azure Synapse SQL and Spark pools.
