You're reading from Learn Microsoft Fabric

Product type: Book
Published in: Feb 2024
Reading level: N/a
Publisher: Packt
ISBN-13: 9781835082287
Edition: 1st Edition
Authors (2):
Arshad Ali

Arshad Ali is a principal product manager at Microsoft, working on the Microsoft Fabric product team in Redmond, WA. He focuses on Spark Runtime, which empowers both data engineering and data science experiences. In his previous role, he helped strategic customers and partners adopt Azure Synapse and Microsoft Fabric. Arshad has more than 20 years of industry experience and has been with Microsoft for over 16 years. He is the co-author of the book Big Data Analytics with Azure HDInsight and the author of over 200 technical articles and blogs on data and analytics. Arshad holds an MBA from the Foster School of Business at the University of Washington and an MCA from India.

Bradley Schacht

Bradley Schacht is a principal program manager on the Microsoft Fabric product team based in Saint Augustine, Florida. Bradley is a former consultant and trainer and has co-authored five books on SQL Server and Power BI. As a member of the Microsoft Fabric product team, Bradley works directly with customers to solve some of their most complex data problems and helps shape the future of Microsoft Fabric. Bradley gives back to the community by speaking at events, such as the PASS Summit, SQL Saturday, Code Camp, and user groups across the country, including locally at the Jacksonville SQL Server User Group (JSSUG). He is a contributor on SQLServerCentral and blogs on his personal site, BradleySchacht.

Monitoring Overview and Monitoring Different Workloads

Although Fabric is designed with built-in auto-optimization for performance, there are times when you will want to inspect currently running jobs (for auditing and control) or monitor, troubleshoot, and optimize them beyond what is provided natively. In this chapter, you will learn how to monitor different Fabric workloads, understand what’s going on under the hood, and troubleshoot your jobs.

The topics covered in this chapter are as follows:

  • Overview of monitoring capabilities in Fabric
  • Monitoring Data Factory pipelines and dataflows
  • Monitoring Spark jobs (data engineering and data science)
  • Monitoring data warehouse activity
  • Monitoring Real-Time Analytics activity
  • Monitoring capacity usage with the Capacity Metrics app

With the help of these topics, you will learn about monitoring different aspects...

Technical requirements

This chapter builds on Chapters 3, 4, 5, and 6. You should have completed those chapters so that you can monitor the execution of the jobs created in them.

Overview of monitoring capabilities in Fabric

Monitoring hub in Fabric is a single, central entry point for monitoring and tracking the progress of jobs from different Fabric workloads. You can launch Monitoring hub from the left pane by clicking on the Monitoring hub icon, as shown in Figure 7.1. By default, it shows logged job execution data (In progress, Completed, Failed, Cancelled, Not started, and so on) for the last seven days; however, you can change the time range based on your needs.

Figure 7.1 – Monitoring hub

On the top-right side of the Monitoring hub page, you will find options to choose the columns you want to see in the table, as shown in Figure 7.2. Some default columns are already included; however, you can add further columns to learn more about what’s going on. Likewise, you can also apply filters to the tracked monitoring information. For example, you can...

Monitoring Data Factory pipelines and dataflows

The Data Factory workload in Fabric brings together the ease of use of Power Query and the scale and power of Azure Data Factory so that you can build the data integration component of your analytics system. It offers pipelines, which are groupings of one or more activities that are executed together (either serially or in parallel, depending on how you have designed them) to perform a specific task, and dataflows, which are the transformation engine of Microsoft Fabric and use Power Query to deliver a low- to no-code data transformation experience. You can monitor the execution of pipelines and dataflows in Monitoring hub. You can scroll through the list of tracked information, or you can use the filter at the top of the Monitoring hub page to show only pipelines or dataflows. As shown in Figure 7.4, you can also use text-based search to narrow down the list and easily locate the pipeline or dataflow execution instance...

Monitoring Spark jobs (data engineering and data science)

Data engineering and data science workloads are powered by Fabric Spark Runtime (based on Apache Spark). With the flexibility Fabric provides, you can use a notebook for interactive development, use a Spark job definition for batch execution, or use REST APIs to submit and execute these jobs programmatically. In all these cases, the jobs are executed by Fabric Spark Runtime, and telemetry is captured for you to look into.
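
As a rough illustration of the REST API option mentioned above, the following minimal sketch builds (but does not send) a request that would start a notebook run on demand. The base URL, the route shape, and the `RunNotebook` job type are assumptions based on the public Fabric REST API job scheduler and should be checked against the current API reference; the helper names `build_run_job_request` and `is_finished`, the placeholder IDs, and the token are hypothetical. The terminal states are taken from the job statuses that Monitoring hub lists, and the exact status strings returned by the API may differ.

```python
import json
import urllib.request

# Assumption: base URL and route shape of the Fabric REST API job scheduler.
# Verify both against the current API reference before relying on them.
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_run_job_request(workspace_id, item_id, job_type, token):
    """Build (but do not send) a POST request that starts an on-demand job."""
    url = (
        f"{FABRIC_API}/workspaces/{workspace_id}/items/{item_id}"
        f"/jobs/instances?jobType={job_type}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps({}).encode("utf-8"),  # empty payload: no job parameters
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def is_finished(status):
    """Return True once a job has reached a terminal state.

    Assumption: the status strings mirror the labels shown in Monitoring hub.
    """
    return status in {"Completed", "Failed", "Cancelled"}


if __name__ == "__main__":
    # Placeholder IDs and token, for illustration only.
    req = build_run_job_request(
        "<workspace-id>", "<notebook-id>", "RunNotebook", "<token>"
    )
    print(req.get_method(), req.full_url)
```

Passing the resulting request to `urllib.request.urlopen` would submit the job. A job submitted this way is executed by Fabric Spark Runtime like any other, so it should appear in Monitoring hub alongside interactive notebook runs.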

You can monitor these jobs in Monitoring hub while they’re still executing or after they have completed. You can scroll through the list of tracked activity, or you can use a filter at the top to search logged information of a specific type. Figure 7.7 shows an example that uses text-based search to find all the logs for notebook executions. You can hover over each row to get its details or you can click on it to get more granular details for the Spark...

Monitoring data warehouse activity

At the time of writing (January 2024), the data warehouse experience does not appear in Fabric's Monitoring hub. Instead, administrators rely on Dynamic Management Views (DMVs) to gain insight into warehouse and SQL endpoint activity. Combined, the DMVs provide a view into server connections (sys.dm_exec_connections), sessions (sys.dm_exec_sessions), and active queries (sys.dm_exec_requests).

The data in these DMVs can answer a few key questions:

  • Who is connected to the warehouse and from where?
  • What queries are executing right now?
  • Who is executing queries right now?
  • How long have queries been executing?

Using a script available in the Fabric toolbox (https://aka.ms/FabricToolbox), these DMVs can be joined together to provide a look at all the current database activity.

Figure 7.13 – An example output from the Fabric SQL DMVs
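
To make the idea of joining the three DMVs concrete, here is a minimal sketch. It is not the Fabric toolbox script: the query is a simplified stand-in, and the column names are assumptions taken from the SQL Server versions of these DMVs, so a Fabric warehouse may expose a different subset. The helper `fetch_activity` is hypothetical and works with any Python DB-API 2.0 cursor (for example, one obtained through an ODBC driver).

```python
# Assumption: column names follow the SQL Server versions of these DMVs;
# a Fabric warehouse may expose a different subset of columns.
ACTIVITY_QUERY = """
SELECT
    s.session_id,
    s.login_name,          -- who is connected
    c.client_net_address,  -- where they are connecting from
    c.connect_time,
    r.request_id,
    r.status,
    r.command,             -- what is executing right now
    r.start_time,
    r.total_elapsed_time   -- how long it has been executing
FROM sys.dm_exec_connections AS c
INNER JOIN sys.dm_exec_sessions AS s
    ON c.session_id = s.session_id
LEFT JOIN sys.dm_exec_requests AS r
    ON s.session_id = r.session_id
ORDER BY r.total_elapsed_time DESC;
"""


def fetch_activity(cursor):
    """Run the activity query and return one dict per row.

    `cursor` is any DB-API 2.0 cursor connected to the warehouse
    or SQL endpoint (hypothetical helper, for illustration only).
    """
    cursor.execute(ACTIVITY_QUERY)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]
```

The left join keeps sessions that are connected but idle, so the result covers both who is connected and what is currently running. Switching it to an inner join would restrict the output to sessions with an active request.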

DMVs provide valuable information but only...

Monitoring Real-Time Analytics activity

With Fabric Real-Time Analytics, there are two main areas that need to be monitored: eventstreams and KQL database activity. While there is no centralized report to see activity across all eventstreams and KQL databases inside a workspace, the relevant information can be found by navigating to each item individually. Because the experience is slightly different for each of these two items, let’s look at them individually.

Monitoring eventstreams

Recall from Chapter 5, Building an End-to-End Analytics System – Real-Time Analytics, that an eventstream captures, transforms, and routes events to destinations such as KQL databases or Fabric lakehouses. There are two options for monitoring the health and performance of an eventstream, and both are available from the eventstream editor.

The first option, Data insights, displays a variety of metrics related to performance and health. This chart allows you to easily determine if...

Monitoring capacity usage with the Microsoft Fabric Capacity Metrics app

Microsoft Fabric is based on a unified business model that uses capacity units (a distinct pool of allocated resources) across all its engines and capabilities, which simplifies how you purchase and use computing resources.

Further, Microsoft Fabric provides a Capacity Metrics app, which you can use to monitor capacity units and their usage across different engines/workloads/capabilities.

Follow the instructions provided at https://learn.microsoft.com/en-us/fabric/enterprise/metrics-app-install to install the Fabric Capacity Metrics app.

Once you have installed the app, click on Apps on the left flyout and then click on Microsoft Fabric Capacity Metrics (the name is followed by the date it was installed), as shown in Figure 7.18, to launch it.

Figure 7.18 – Launching the Fabric Capacity Metrics app

For the first time, you will have to connect...

Summary

Microsoft Fabric provides rich built-in capabilities for monitoring the jobs submitted from any of its workloads, so that you can track their progress over time and their outcome. Monitoring hub provides a single, central entry point to track all submitted jobs in one place, while still giving you the flexibility to monitor an individual item, such as a pipeline or notebook, in context while developing or working on it.

In this chapter, we learned about different ways of monitoring Data Factory, Spark jobs, data warehouse queries, Real-Time Analytics, and capacity usage in Fabric to troubleshoot performance issues and optimize performance. In the next chapter, we will get into the details of Fabric administration.
