Reader small image

You're reading from  Mastering Microsoft Power BI – Second Edition - Second Edition

Product typeBook
Published inJun 2022
PublisherPackt
ISBN-139781801811484
Edition2nd Edition
Right arrow
Authors (2):
Gregory Deckler
Gregory Deckler
author image
Gregory Deckler

Greg Deckler is Vice President of the Microsoft Practice at Fusion Alliance and has been a professional technology systems consultant for over 25 years. Internationally recognized as an expert in Power BI, Greg Deckler is a Microsoft MVP for Data Platform and a superuser within the Power BI community with over 100,000 messages read, more than 11,000 replies, over 2,300 answers, and more than 75 entries in the Quick Measures Gallery. Greg founded the Columbus Azure ML and Power BI User Group (CAMLPUG) and presents at numerous conferences and events, including SQL Saturday, DogFood, and the Dynamic Communities User Group/Power Platform Summit.
Read more about Gregory Deckler

Brett Powell
Brett Powell
author image
Brett Powell

Brett Powell is the owner of and business intelligence consultant at Frontline Analytics LLC, a data and analytics research and consulting firm and Microsoft Power BI partner. He has worked with Power BI technologies since they were first introduced as the PowerPivot add-in for Excel 2010 and has been a Power BI architect and lead BI consultant for organizations across the retail, manufacturing, and financial services industries. Additionally, Brett has led Boston's Power BI User Group, delivered presentations at technology events such as Power BI World Tour, and maintains the popular Insight Quest Microsoft BI blog.
Read more about Brett Powell

View More author details
Right arrow

Preparing Data Sources

This chapter follows on from the dataset planning process described in Chapter 1, Planning BI Projects, by providing guidance on how to prepare for connecting to and transforming data using Power Query (M) queries. Power Query queries are written in a data transformation language commonly called “M” or can be generated via the Power Query Editor user interface. These queries access data sources and optionally apply data transformation logic to prep the tables for the Power BI data model.

As mentioned in Chapter 1, Planning BI Projects, to the greatest extent possible data transformation processes should be implemented within data sources such as Azure SQL and Azure Synapse SQL rather than via Power BI’s data transformation capabilities. The presence of significant data transformation logic (for example, joins, filters, and new columns) outside of an organization’s primary data warehouse or “source of truth” makes...

Query folding

Query folding is one of the most powerful and important capabilities of the M language as it translates M expressions into equivalent query statements for the given source system to process. With query folding, Power Query (M) serves as a rich abstraction layer for defining both simple and complex data transformation processes while still leveraging the compute resources of the source system. When implementing any remaining logic or data transformations via M functions, a top priority of the dataset designer is to ensure that these operations are folded to the data source.

In the following M query shown in Figure 2.1, a Table.RemoveColumns() M function is applied against the SQL view for the Internet Sales fact table to exclude three columns that are not needed for the dataset:

Graphical user interface, application  Description automatically generated

Figure 2.1: Power Query Editor: View Native Query

The additional step is translated to a SQL query that simply doesn’t select the three columns. The specific SQL statement...

Query design per dataset mode

As mentioned in Chapter 1, Planning BI Projects, to the greatest extent possible data transformation processes should be implemented within data sources such as Azure SQL and Azure Synapse Analytics rather than via Power BI’s data transformation capabilities.

The presence of significant data transformation logic, such as joins, filters, and new columns, outside of an organization’s primary data warehouse or “source of truth” makes these solutions more difficult to understand and support. In addition, source systems are generally provisioned with more compute resources to handle data transformations and often include secondary data structures, like indexes, that speed up certain operations such as filters and joins.

If resource or time constraints make it necessary to apply data transformations in Power BI rather than source systems, Power Query (M) should generally be favored over DAX calculated columns and tables...

Data sources

Data source connectivity is one of the strengths of Power BI, due to the vast list of standard data source connectors included in Power BI Desktop. In addition, numerous industry standards such as Open Data Protocol (OData), Open Database Connectivity (ODBC), and Object Linking and Embedding Database (OLE DB) are supported. The breadth of data connectivity options is further bolstered by the ability for developers to create custom Power BI data connectors for a specific application, service, or data source.

Although a Power BI dataset can connect to multiple sources ranging from Azure Synapse Analytics to a text file, solution architects and developers should strive to build solutions on a single, well-supported source such as a data warehouse database system that already contains the necessary data integrations as well as reflects data integrity constraints and quality processes. Power BI datasets that connect to several distinct sources, and particularly to less...

SQL views

As described in the Dataset planning section of Chapter 1, Planning Power BI Projects, a set of SQL views should be created within the data source and these SQL views, rather than the database tables, should be accessed by the Power BI dataset. SQL views are essentially virtual tables that provide an abstraction layer from the underlying database tables. SQL views can be used to merge database tables and to limit the number of columns, thus preventing such transformations from occurring within Power Query queries.

Each fact and dimension table required by the Power BI dataset should have its own SQL view and its own M query within the dataset that references this view. The SQL views should preferably be assigned to a dedicated database schema and identify the dimension or fact table represented as shown in Figure 2.11:

Figure 2.11: Views assigned to BI schema in SQL Server

A common practice is to create a database schema specific to the given dataset being...

Summary

In this chapter, we’ve covered a number of important concepts, design principles, data source settings, and source system preparation. This includes query folding, query design considerations per dataset mode, important Power BI Desktop configuration settings, data source privacy levels, and the layer of SQL views within a database source.

Understanding these concepts and properly preparing source systems greatly aids the process of connecting to and transforming data using Power BI.

In the next chapter, we’ll leverage the prepared source systems and design techniques described in this chapter to connect to source systems and transform their data using Power Query (M).

Join our community on Discord

Join our community’s Discord space for discussions with the author and other readers: https://discord.gg/q6BPbHEPXp

lock icon
The rest of the chapter is locked
You have been reading a chapter from
Mastering Microsoft Power BI – Second Edition - Second Edition
Published in: Jun 2022Publisher: PacktISBN-13: 9781801811484
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €14.99/month. Cancel anytime

Authors (2)

author image
Gregory Deckler

Greg Deckler is Vice President of the Microsoft Practice at Fusion Alliance and has been a professional technology systems consultant for over 25 years. Internationally recognized as an expert in Power BI, Greg Deckler is a Microsoft MVP for Data Platform and a superuser within the Power BI community with over 100,000 messages read, more than 11,000 replies, over 2,300 answers, and more than 75 entries in the Quick Measures Gallery. Greg founded the Columbus Azure ML and Power BI User Group (CAMLPUG) and presents at numerous conferences and events, including SQL Saturday, DogFood, and the Dynamic Communities User Group/Power Platform Summit.
Read more about Gregory Deckler

author image
Brett Powell

Brett Powell is the owner of and business intelligence consultant at Frontline Analytics LLC, a data and analytics research and consulting firm and Microsoft Power BI partner. He has worked with Power BI technologies since they were first introduced as the PowerPivot add-in for Excel 2010 and has been a Power BI architect and lead BI consultant for organizations across the retail, manufacturing, and financial services industries. Additionally, Brett has led Boston's Power BI User Group, delivered presentations at technology events such as Power BI World Tour, and maintains the popular Insight Quest Microsoft BI blog.
Read more about Brett Powell