M Query Optimization

In the previous chapter, you were introduced to custom functions, how to plan them appropriately, and how to use the power of parameters within them. Across Power Query, there are many opportunities to express solutions in M code.

As with any coding language, there are different ways in which you can write your code, meaning there are often strategies and techniques that can help you achieve the optimum performance from your queries. In this chapter, you will dive deep into four key tips for optimizing your M queries:

  • Filtering and reducing the data you’re working on to improve performance, such as removing unnecessary data early in the query
  • Recognizing when to use functions optimized for specific tasks, reducing the need for custom code
  • Understanding how M’s lazy evaluation interacts with functions such as Table.Buffer and Table.Distinct to optimize memory usage and processing
  • Using parallel query execution, with functions such as Table.Split, to divide large tables into smaller partitions and speed up processing

Technical requirements

To follow the instructions in this chapter, you will need to connect to the data using the following URL: https://raw.githubusercontent.com/PacktPublishing/Data-Cleaning-with-Power-BI/main/Retail%20Store%20Sales%20Data.csv.

Creating custom functions

As we learned in the previous chapter, while native functions are powerful, there are scenarios where custom functions are necessary. Sometimes it might not even make sense to add the custom function through the Queries tab, as it might only be used once. With that in mind, it often makes more sense to build that custom function into the existing code within the advanced editor. In this section, we’ll delve deeper into the process of creating efficient custom functions in M.

As an example, start by building a table from a blank query (a sketch of the finished query follows these steps):

  1. Open Power BI Desktop.
  2. Select Transform data from the Report, Table, or Model View to open the Power Query editor.
  3. Select New source and then select Blank query from the dropdown. Then, select Advanced editor from the Home ribbon.
  4. Enter the following code into the Advanced editor:
    let
    Scorecard = Table.FromList(
       &...
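
To make the pattern concrete, here is a minimal sketch of a scorecard query with an inline custom function; the sample values and the ToGrade helper are assumed for illustration, not taken from the original data:

let
    // Build a small table from a list of delimited text values
    Scorecard = Table.FromList(
        {"Product A,90", "Product B,75", "Product C,60"},
        Splitter.SplitTextByDelimiter(","),
        {"Product", "Score"}
    ),
    // An inline custom function, defined directly in the advanced
    // editor because it is only used once in this query
    ToGrade = (score as number) as text =>
        if score >= 80 then "High"
        else if score >= 70 then "Medium"
        else "Low",
    Graded = Table.AddColumn(Scorecard, "Grade", each ToGrade(Number.From([Score])))
in
    Graded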

Filtering and reducing data

Efficiently handling large datasets is a common challenge in data transformation. Filtering and reducing data early in your query can significantly boost performance.

Let’s walk through an example where we filter out unnecessary data. Consider a dataset with sales information for the past five years. If your analysis only requires data from the last year, filtering out the older records early in the query can save processing time and memory usage (a sketch of the resulting M code follows these steps):

  1. Open Power BI Desktop. We’re going to be connecting to a new dataset.
  2. Click on Get Data | Text/CSV. You will then enter the following URL to access the file for our example: https://raw.githubusercontent.com/PacktPublishing/Data-Cleaning-with-Power-BI/main/Retail%20Store%20Sales%20Data.csv
  3. Select Transform Data instead of Load so that you can explore the data before loading it all into memory.
  4. Rename the query Retail Store Sales Data - Problem Statement 1 using the Properties...
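
While the remaining steps are performed through the Power Query UI, the same early-filtering idea can be expressed directly in M. Here is a rough sketch, assuming the file has an Order Date column (adjust the column name to match the actual dataset):

let
    Source = Csv.Document(
        Web.Contents("https://raw.githubusercontent.com/PacktPublishing/Data-Cleaning-with-Power-BI/main/Retail%20Store%20Sales%20Data.csv"),
        [Delimiter = ",", Encoding = 65001]
    ),
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // Type the date column, then filter early so that downstream
    // steps only ever see the rows you actually need
    Typed = Table.TransformColumnTypes(Promoted, {{"Order Date", type date}}),
    LastYearOnly = Table.SelectRows(
        Typed,
        each [Order Date] >= Date.AddYears(Date.From(DateTime.LocalNow()), -1)
    )
in
    LastYearOnly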

Using native M functions

As highlighted earlier in this book, M, the language behind Power Query, offers a rich library of native functions designed for data transformation. Leveraging these functions can often be more efficient than custom code.

For instance, let’s say you need to standardize product names by converting them to title case. Instead of writing custom code, you can utilize the native Text.Proper function (M’s title-case function), making your query more concise and performant.

Here is an example of doing just this:

let
    Source = ..., // Your data source
    StandardizedData = Table.TransformColumns(Source, {{"ProductName", Text.Proper}})
in
    StandardizedData

In this code, we use the Table.TransformColumns function along with the Text.Proper function to standardize product names. Native functions are highly optimized for their specific tasks, resulting in more efficient and faster queries.
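
To see this end to end, here is a self-contained version of the same transform; the sample rows are assumed for illustration:

let
    // Assumed sample rows for illustration
    Source = Table.FromRecords({
        [ProductName = "wireless mouse"],
        [ProductName = "USB-C CABLE"]
    }),
    // Text.Proper title-cases each word in the text
    StandardizedData = Table.TransformColumns(Source, {{"ProductName", Text.Proper}})
in
    StandardizedData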

The...

Optimizing memory usage

Managing memory usage is vital for query optimization. Let’s consider an example where you’re dealing with a large dataset with repeated data. Instead of creating multiple copies of the same data, we can explore the Table.Buffer function.

This function loads a table into memory once, so that subsequent steps reuse the buffered copy instead of re-evaluating the source. This reduces duplicated work and can lead to improved query speed, especially for large datasets that are referenced repeatedly, resulting in more efficient use of system resources and better overall performance during data transformation and analysis tasks.
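
A minimal sketch of the pattern, with assumed sample data and a Sales column, might look like this:

let
    // Assumed sample data; in practice, Source would be a large query
    Source = Table.FromRecords({
        [Product = "A", Sales = 1500],
        [Product = "B", Sales = 400],
        [Product = "C", Sales = 2200]
    }),
    // Buffer once so the two downstream steps reuse the in-memory
    // copy rather than re-evaluating the source twice
    Buffered = Table.Buffer(Source),
    HighValue = Table.SelectRows(Buffered, each [Sales] > 1000),
    LowValue = Table.SelectRows(Buffered, each [Sales] <= 1000),
    Combined = Table.Combine({HighValue, LowValue})
in
    Combined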

There are, however, some potential drawbacks to be aware of. One significant downside is that using Table.Buffer can actually slow down performance in certain scenarios.

One reason for this is that it loads the entire table into memory at once. For very large datasets, this can consume a significant amount of memory resources, potentially leading to memory pressure and slower...

Parallel query execution

Parallel query execution is a game-changer for performance. Suppose you have a massive dataset and you want to process it more quickly. By splitting the table into smaller parts using Table.Split, you can lay the groundwork for parallel processing and significantly reduce query execution times.

This can be done using code such as the following in your M query:

let
    Source = ..., // Your data source
    // Table.Split takes a page size (rows per partition), not a
    // partition count, so derive the size that yields 4 partitions
    PageSize = Number.RoundUp(Table.RowCount(Source) / 4),
    SplitTable = Table.Split(Source, PageSize) // A list of 4 tables
in
    SplitTable

In this code, we use the Table.Split function to divide a large table into a list of smaller partitions. Note that Table.Split’s second argument is the number of rows per partition rather than the number of partitions, which is why the page size is derived from the row count. Each partition can then be processed independently, which can significantly reduce query execution times. It must be said, though, that using Table.Split on its own won’t inherently reduce query execution times; rather, it’s a step towards enabling parallel processing...
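
As a rough sketch of the full pattern, the partitions can be transformed independently and then reassembled; the sample data and the SalesWithTax transform are assumed for illustration:

let
    // Assumed sample data; in practice, Source would be a large query
    Source = Table.FromRecords({
        [Id = 1, Sales = 100], [Id = 2, Sales = 200],
        [Id = 3, Sales = 300], [Id = 4, Sales = 400]
    }),
    // Two rows per page, so this sample yields two partitions
    Partitions = Table.Split(Source, 2),
    // Transform each partition independently, then recombine
    Transformed = List.Transform(
        Partitions,
        (t) => Table.AddColumn(t, "SalesWithTax", each [Sales] * 1.2)
    ),
    Combined = Table.Combine(Transformed)
in
    Combined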

Summary

In this chapter, we’ve covered several crucial techniques to enhance the performance of your Power Query workflows.

We began by emphasizing the importance of efficient data filtering and reduction, encouraging you to remove unnecessary data early in your query. We explored the use of native M functions, highlighting their efficiency compared to custom code for specific tasks. Custom functions were the next focus, where we learned to write efficient calculations for tasks not covered by native functions. The chapter also touched on the significance of optimizing memory usage, introducing Table.Buffer and other memory-efficient coding practices.

We then delved into the game-changing concept of parallel query execution, showcasing how functions such as Table.Split can drastically reduce query execution times by dividing large tables into smaller partitions and enabling parallel processing. These techniques will empower you to tackle complex data transformation tasks in Power BI...

Questions

  1. What were the four key tips for optimizing M queries discussed in this chapter?
    1. Sorting, filtering, grouping, summarizing
    2. Filtering and reducing data, using native M functions, creating custom functions, optimizing memory usage
    3. Conditional formatting, joins, aggregations, pivoting
    4. Concatenation, union, splitting, transforming
  2. In the custom function for calculating the weighted average, what are the parameters of the function, and how is the weighted average calculated?
    1. Parameters: table, weights, values; the weighted average is calculated by summing the weighted values and dividing by the total weight
    2. Parameters: table, columns; the weighted average is calculated by multiplying values by weights
    3. Parameters: values, weights, total; the weighted average is calculated by summing the values and dividing by the weights
    4. Parameters: rows, weights, values; the weighted average is calculated by summing the values and multiplying by the weights
  3. How can Table.Buffer optimize the...