Reader small image

You're reading from  Expert Data Modeling with Power BI - Second Edition

Product typeBook
Published inApr 2023
PublisherPackt
ISBN-139781803246246
Edition2nd Edition
Right arrow
Author (1)
Soheil Bakhshi
Soheil Bakhshi
author image
Soheil Bakhshi

Soheil Bakhshi is the founder of Data Vizioner and is a sought after BI consultant. Working in data and analytics for more than 20 years, Soheil's experience lies in Microsoft BI, Data Warehousing, and Power BI platform. He possesses MSCE, MCSA certifications and is a Microsoft MVP (Most Valuable Professional). He has a passion for sharing knowledge via his website and speaking at conferences and Power BI community events locally and globally. In following his desire for simplicity and efficiency, he is behind Power BI community tools and commercial products such as Power BI Exporter and Power BI Documenter.
Read more about Soheil Bakhshi

Right arrow

Getting Data from Various Sources

In previous chapters, we discussed some aspects of data modeling, including various layers in Power BI, how the data flows between different layers, virtual tables in Data Analysis eXpressions (DAX), and how they relate to data modeling. We also discussed leveraging query parameters and creating custom functions in the Power Query Editor with the Power Query formula language.

This chapter discusses getting data from various sources and describes some known challenges and common pitfalls in getting data from some sources. We also discuss data source certification and different connection modes to various data sources.

In this chapter, you will learn about the following topics:

  • Getting data from common data sources
  • Understanding data source certification
  • Working with connection modes
  • Working with storage modes
  • Understanding dataset storage modes

Getting data from common data sources

With Power BI, we can connect to many different data sources. In this section, we look at some common data sources that we can use in Power BI. We also look at common pitfalls when connecting to those data sources. But before we start, let us take a moment to discuss a common misunderstanding among many Power BI developers and users on what get data means. When we say get data, we refer to connecting to a data source from the Power Query Editor, regardless of the data source type. Then, we create some transformation steps to prepare the data to be imported into the data model.

While working in the Power Query Editor, we have not imported any data into the data model unless we click the Close and Apply button from the Home tab on the ribbon bar (or by clicking the Apply drop-down button). After clicking the Close and Apply button, data gets imported into the data model. At this point, you may ask: what is the data shown in the Data preview...

Understanding data source certification

While data source certification leans toward Power BI governance, it is crucial to understand what it is and how it affects data modelers. The data source certification is more about the quality of data and assessing the level of trust we can build upon the data available in different data sources. With data source certification, we group our data sources into three (or more) categories. Different organizations use different terminology to refer to the quality of their data. This section uses the Bronze, Silver, and Gold/Platinum categories, indicating the quality of data contained in the data sources. You may use different terms in your data source certification.

This section does not explain the steps and processes of data source certification as it is beyond the scope of this book.

Bronze

The Bronze data sources contain uncurated data. The data has never been thoroughly quality-controlled. While the data source may have valuable...

Working with connection modes

When we connect to a data source, the query connection mode falls into one of the following three different categories:

  • Data Import
  • DirectQuery
  • Connect Live

Every query connecting to one or more data sources in the Power Query Editor has one of the preceding connection modes, except Connect Live. When the connection mode is Connect Live, a Power BI model cannot currently connect to more than one instance of SSAS or AAS, or a Power BI dataset. As mentioned earlier, in December 2020, Microsoft released the preview of the new version of the composite models in which we can connect to multiple Power BI Datasets or AAS/SSAS tabular 2022 models when using the DirectQuery mode. We discuss composite models in more detail in Chapter 14, DirectQuery Connections to Power BI Datasets and Analysis Services in Composite Models.

This section looks at the connection modes, their applications, and their limitations.

Data Import

...

Working with storage modes

In the previous section, we discussed various connection modes for the queries from a Power Query perspective. This section looks at different storage modes that apply to tables after the data is loaded into a data model or connected to a data source. Every table in a Power BI data model has a storage mode property that shows if the data is cached or not. There are three types of storage modes, as outlined next:

  • Import: This means the data is cached in the memory. Therefore, all queries over a table with this storage mode get the results from the cached data.
  • DirectQuery: The data is not cached; therefore, all queries fire back to the source system.
  • Dual: The tables in this mode can get data from the cache or the source system. For instance, depending on the level of detail into which the user drills down in an aggregation setup, the query results may come from the cached data or directly from the source. We discuss aggregations in...

Understanding dataset storage modes

As you may have already guessed, dataset storage modes refer to whether the data in a dataset is cached in memory or not. With that in mind, from a dataset perspective, there are three different modes, as outlined next:

  • Import: When the whole data is cached in memory. In this mode, all tables are in the Import storage mode setting.
  • DirectQuery: When the data is not cached in memory. In this mode, all tables are in the DirectQuery storage mode setting.
  • Composite (Mixed): When a portion of data is cached in memory, while the rest is not. In this mode, some tables are in the Import storage mode setting; other tables are in DirectQuery storage mode or the Dual storage mode setting.

To see and edit the dataset storage modes in Power BI Desktop, look at the right side of the status bar, as shown in the following image:

Graphical user interface  Description automatically generated with medium confidence

Figure 4.59: Identifying dataset storage modes in Power BI Desktop

Dataset storage mode is...

Summary

In this chapter, we learned how to work with the most common data sources supported in Power BI, such as folders, CSV, Excel, Power BI datasets, Power BI dataflows, SQL Server SSAS instances, and an OData feed, with some challenging real-world scenarios. We also went through data source certifications and discussed why it is essential to know which data source certification level we are dealing with. We then looked at connection, storage, and dataset modes and how different they are. It is worthwhile emphasizing the importance of understanding different connection modes, storage modes, and dataset modes as they directly affect our data modeling and overall Power BI architecture.

In the next chapter, we look at common data preparation steps in the Power Query Editor and real-world scenarios.

Join us on Discord!

Join The Big Data and Analytics Community on the Packt Discord Server!

Hang out with 558 other members and enjoy free voice and text chat.

https...

lock icon
The rest of the chapter is locked
You have been reading a chapter from
Expert Data Modeling with Power BI - Second Edition
Published in: Apr 2023Publisher: PacktISBN-13: 9781803246246
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €14.99/month. Cancel anytime

Author (1)

author image
Soheil Bakhshi

Soheil Bakhshi is the founder of Data Vizioner and is a sought after BI consultant. Working in data and analytics for more than 20 years, Soheil's experience lies in Microsoft BI, Data Warehousing, and Power BI platform. He possesses MSCE, MCSA certifications and is a Microsoft MVP (Most Valuable Professional). He has a passion for sharing knowledge via his website and speaking at conferences and Power BI community events locally and globally. In following his desire for simplicity and efficiency, he is behind Power BI community tools and commercial products such as Power BI Exporter and Power BI Documenter.
Read more about Soheil Bakhshi