Reader small image

You're reading from  The Definitive Guide to Power Query (M)

Product typeBook
Published inMar 2024
Reading LevelBeginner
PublisherPackt
ISBN-139781835089729
Edition1st Edition
Languages
Right arrow
Authors (3):
Gregory Deckler
Gregory Deckler
author image
Gregory Deckler

Greg Deckler is a 7-time Microsoft MVP for Data Platform and an active blogger and Power BI community member, having written over 6,000 solutions to community questions. Greg has authored many books on Power BI, including Learn Power BI 1st and 2nd Editions, DAX Cookbook, Power BI Cookbook 2nd Edition and Mastering Power BI 2nd Edition. Greg has also created several external tools for Power BI and regularly posts video content to his YouTube channels, Microsoft Hates Greg and DAX For Humans.
Read more about Gregory Deckler

Rick de Groot
Rick de Groot
author image
Rick de Groot

Rick de Groot was born in the Netherlands and has been working in BI for more than 14 years. He went freelance in 2016 and now works as an independent Power BI consultant. On his mission to make Power BI more accessible, he started two blogs: BI Gorilla and PowerQuery. how, and a YouTube channel sharing Power Query and Power BI content. His commitment to offering free content through multiple platforms has led him to earning the Microsoft Data Platform MVP award for two consecutive years.
Read more about Rick de Groot

Melissa de Korte
Melissa de Korte
author image
Melissa de Korte

Melissa de Korte's approach to facing challenges is fueled by relentless curiosity. She is a dedicated community member and content creator. Her portfolio includes blogs, tutorials, courses, and webinars, that make Power Query M more accessible and useful for all. Behind her professional persona lies a genuine dedication to empowering others through education and knowledge sharing, and a desire to encourage professionals to embrace the potential of Power Query, M.
Read more about Melissa de Korte

View More author details
Right arrow

Optimizing Performance

While returning accurate data is important, the speed of your queries greatly affects the user experience. Long refresh times for your datasets may result in timeouts, and waiting for the results of your transformations in the Power Query editor can be frustrating. In this chapter, we will delve into strategies to optimize the performance of your queries. We will first look at memory usage when evaluating queries and how using too much memory slows down your queries. We will then look at different strategies to prevent this from happening.

We’ll look at the importance of query folding and how you can ensure that the data source processes as many transformations as possible. Then, we’ll cover the formula firewall, a tool designed to protect your queries from exposing sensitive information. We’ll explain what triggers errors related to this and discuss its effect on your queries. Additionally, we’ll examine how storing data in memory...

Understanding memory usage when evaluating queries

Retrieving data and performing transformations does not come for free. To run queries, Power Query requires memory. Some approaches require more memory than others. Understanding how to improve memory usage therefore helps when optimizing queries. So how does that work?

Queries are executed within a mashup container, a dedicated process responsible for query evaluation. You can see the number of mashup containers used by going to the Task Manager and opening the processes for Power BI Desktop:

Figure 15.1: Mashup containers show up in the Task Manager

The memory given to each container depends on where you’re running Power Query. Common environments include Power BI Desktop, the Power BI service, and the on-premises data gateway. So what happens when container memory runs low?

When a mashup container’s memory needs exceed its allocated limit during query evaluation, Power Query begins paging (transferring...

Query folding

Query folding is one of the most important aspects for optimizing your queries’ performance and is useful for any developer who connects to external data sources. It’s a process where the evaluation of a query is offloaded to the data source itself.

During query folding, Power Query’s mashup engine converts a query’s transformations into the data source’s native language. This approach combines multiple transformation steps into one efficient query that the data source can execute.

Doing so transfers the computational load of these transformations from the local environment, where Power Query operates, directly to the data source, like an SQL database. Any remaining transformations that can’t be folded are performed locally.

The primary goal of query folding is to reduce memory usage in Power Query by using the computational power of the data source to process queries and transformations. For instance...

The formula firewall

When working with queries, combining different data sources is common. It’s likely that after some fiddling you run into an error that prevents you from combining your data. This error is the result of the formula firewall that Power Query has, also known as the data privacy firewall. So, what exactly is the formula firewall, and how can you work with it?

What is the formula firewall?

The formula firewall is a Power Query feature that prevents accidental data transfer between sources. This feature is particularly important when dealing with sensitive information. It was designed because of query folding. Query folding allows Power Query to convert data transformations into a data source query that a data source can execute directly.

Imagine a situation where you’re working with two queries in Power Query. Query1 holds sensitive information, like social security numbers. Using an inner join, you aim to use this data to extract relevant...

Optimizing query performance

Power Query operates within a constrained environment. Specifically, each mashup container has a limited amount of resources. This limitation is an important factor to consider when creating queries. It is important to reduce the data volume right at the beginning of your query for fast queries. By doing so, you not only speed up the query process but also prevent exceeding resource limits that cause paging and can lead to slow performance or failures.

As previously discussed, using the query folding mechanism is one of the most effective strategies for optimizing performance. However, there are scenarios where your data source does not support query folding, or a required transformation breaks the folding process. In such cases, the strategy shifts toward minimizing the memory footprint of your query. So, what are some effective methods to achieve this?

Prioritize filtering rows and removing columns

It’s worth emphasizing here that...

Performance tips

Optimizing performance is a complicated topic, even after learning about all the different perspectives in this chapter. As with all optimization topics, you’ll have to test different approaches for your situation. However, we have several general performance tips:

  • Connect your query to a fast data source. If your source system is slow, consider moving the data to a faster source or storing your data in a dataflow.
  • Maximize the number of steps that your query folding mechanisms can fold. If you’re adept at SQL, consider crafting an efficient SQL query and apply the EnableFolding parameter. If SQL is not your strength, prioritize placing foldable steps early in your query. This approach helps ensure that your data source handles most of the transformations.
  • When your query steps no longer fold, the next focus should be reducing your query’s memory usage. That means filtering rows and removing columns as soon as possible....

Summary

In this chapter, we have learned the multi-faceted approach required to optimize performance in Power Query. We have delved into areas such as the effective use of query folding and strategies for navigating the formula firewall. The difference between buffering and streaming operations, along with the careful use of buffer functions and the importance of reducing memory usage in your queries, were important areas of focus.

Key to this chapter is the understanding that the careful management of memory aspects helps improve query performance. By integrating these concepts and approaches, you are now equipped with a comprehensive understanding of the various elements that contribute to the performance optimization of your Power Query operations.

In the following chapter, we will focus on working with extensions. You’ll learn the fundamental concepts required to create a custom connector and what tools are involved in the process.

Learn more on Discord

...
lock icon
The rest of the chapter is locked
You have been reading a chapter from
The Definitive Guide to Power Query (M)
Published in: Mar 2024Publisher: PacktISBN-13: 9781835089729
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Authors (3)

author image
Gregory Deckler

Greg Deckler is a 7-time Microsoft MVP for Data Platform and an active blogger and Power BI community member, having written over 6,000 solutions to community questions. Greg has authored many books on Power BI, including Learn Power BI 1st and 2nd Editions, DAX Cookbook, Power BI Cookbook 2nd Edition and Mastering Power BI 2nd Edition. Greg has also created several external tools for Power BI and regularly posts video content to his YouTube channels, Microsoft Hates Greg and DAX For Humans.
Read more about Gregory Deckler

author image
Rick de Groot

Rick de Groot was born in the Netherlands and has been working in BI for more than 14 years. He went freelance in 2016 and now works as an independent Power BI consultant. On his mission to make Power BI more accessible, he started two blogs: BI Gorilla and PowerQuery. how, and a YouTube channel sharing Power Query and Power BI content. His commitment to offering free content through multiple platforms has led him to earning the Microsoft Data Platform MVP award for two consecutive years.
Read more about Rick de Groot

author image
Melissa de Korte

Melissa de Korte's approach to facing challenges is fueled by relentless curiosity. She is a dedicated community member and content creator. Her portfolio includes blogs, tutorials, courses, and webinars, that make Power Query M more accessible and useful for all. Behind her professional persona lies a genuine dedication to empowering others through education and knowledge sharing, and a desire to encourage professionals to embrace the potential of Power Query, M.
Read more about Melissa de Korte