Reader small image

You're reading from  Expert Data Modeling with Power BI - Second Edition

Product typeBook
Published inApr 2023
PublisherPackt
ISBN-139781803246246
Edition2nd Edition
Right arrow
Author (1)
Soheil Bakhshi
Soheil Bakhshi
author image
Soheil Bakhshi

Soheil Bakhshi is the founder of Data Vizioner and is a sought after BI consultant. Working in data and analytics for more than 20 years, Soheil's experience lies in Microsoft BI, Data Warehousing, and Power BI platform. He possesses MSCE, MCSA certifications and is a Microsoft MVP (Most Valuable Professional). He has a passion for sharing knowledge via his website and speaking at conferences and Power BI community events locally and globally. In following his desire for simplicity and efficiency, he is behind Power BI community tools and commercial products such as Power BI Exporter and Power BI Documenter.
Read more about Soheil Bakhshi

Right arrow

Star Schema and Data Modeling Common Best Practices

In the previous chapter, we learned about data modeling components in Power BI Desktop, including table and field properties. We also learned about featured tables and how they make dataset tables accessible across an organization. We then learned how to build summary tables with DAX. We also looked at the relationships in more detail; we learned about different relationship cardinalities, filter propagation, and bidirectional relationships. In this chapter, we look at some star schema and data modeling best practices, including the following:

  • Dealing with many-to-many relationships
  • Avoiding bidirectional relationships
  • Dealing with inactive relationships
  • Using configuration tables
  • Avoiding calculated columns when possible
  • Organizing the model
  • Reducing model size by disabling auto date/time

In this chapter, we use the Chapter 9, Star Schema and Data Modeling Common Best Practices...

Dealing with many-to-many relationships

In the previous chapter, Chapter 8, Data Modeling Components, we discussed different relationship cardinalities. We went through some scenarios to understand one-to-one, one-to-many, and many-to-many relationships. We showed examples of creating a many-to-many relationship between two tables using non-key columns.

While creating a many-to-many relationship may work for smaller and less complex data models, it can cause severe issues and ambiguities in more complex models. In some cases, we may get incorrect results in totals; we might find some missing values or get poor performance in large models; while in other cases, we may find the many-to-many cardinality very useful.

The message here is that, depending on the business case, we may or may not use many-to-many cardinality; it depends on what works best for our model to satisfy the business requirements.

For instance, the many-to-many cardinality works perfectly fine in our...

Avoiding bidirectional relationships

One of the most misunderstood and somehow misused Power BI features in data modeling is setting the Cross filter direction to Both. This is widely known as a bidirectional relationship. There is nothing wrong with setting a relationship to bidirectional if we know what we are doing and are conscious of its effects on the data model. We have seen Power BI developers who have many bidirectional relationships in their model and consequently end up with many issues, such as getting unexpected results in their DAX calculations or being unable to create a new relationship due to ambiguity.

The reason that overusing bidirectional relationships increases the risk of having an ambiguous model is filter propagation. In Chapter 8, Data Modeling Components, we covered the concept of filter propagation and bidirectional relationships. We looked at a scenario where the developer needed to have two slicers on a report page, one for the product category and...

Dealing with inactive relationships

In real-world scenarios, the data models can get very busy, especially when we are creating a data model to support enterprise BI; there are many instances where we have an inactive relationship in our data model. In many cases, there are two reasons that a relationship is inactive, as follows:

  • The table with an inactive relationship is reachable via multiple filter paths.
  • There are multiple direct relationships between two tables.

In both preceding cases, the xVelocity engine does not allow us to activate an inactive relationship to avoid ambiguity across the model.

Reachability via multiple filter paths

A multiple filter path between two tables means that the two tables are related and can be reached via other related tables. Therefore, the filter propagates from one table to another via multiple hops (relationships). The following diagram shows a data model with an inactive relationship:

Figure 9.14...

Using configuration tables

In many cases, a business wants to analyze some of the business metrics in clusters. Some good examples are analyzing sales by unit price range, analyzing sales by product cost range, analyzing customers by their age range, or analyzing customers by commute distance. In all of these examples, the business does not need to analyze constant values; instead, it is more about analyzing a metric (sales, in the preceding examples) by a range of values.

Some other cases are related to data visualization, such as dynamically changing the color of values when they are in a specific range. An example is to change the values’ color to red in all visuals analyzing sales if the sales value for the data points is less than the average sales over time. This is a relatively advanced analysis that can be reused in our reports to keep visualizations’ color consistent.

In the preceding examples, we need to define configuration tables. In the latter example...

Avoiding calculated columns when possible

Creating calculated columns is one of the most essential and powerful features of DAX. Calculated columns, as the name suggests, are computed based on a formula; therefore, the calculated column values are unavailable in the source systems or the Power Query layer. The values of the calculated columns are computed during the data refresh and then stored in memory. It is important to note that the calculated columns reside in memory unless we unload the whole data model from memory, which in Power BI means when we close the file in Power BI Desktop or switch to other content in the Power BI Service. Calculated columns, after creation, are just like any other columns, so we can use them in other calculated columns, measures, calculated tables, or for filtering the visualization layer.

A common approach of developers is to use calculated columns to divide complex equations into smaller chunks. That is why we suggest avoiding the excessive...

Organizing the model

There are usually several roles involved in a Power BI project in real-world enterprise BI scenarios. From a Power BI development perspective, we might have data modelers, report writers, quality assurance specialists, support specialists, and so on. The data modelers are the ones who make the data model available for all other content creators, such as report writers. So, making a model that is as organized as possible is essential. This section looks at several ways to organize our data models.

Hiding insignificant model objects

One way to keep our model tidier is to hide all insignificant objects from the data model. We often have some objects in the data model that are not used elsewhere. However, we cannot remove them from the data model as we may require them in the future. So, the best practice is to hide all those objects unless they are going to serve a business requirement. The following sections discuss the best candidate objects for hiding...

Reducing model size by disabling auto date/time

When the data is loaded into the data model, Power BI automatically creates some Date tables to support calendar hierarchies for all columns in the DateTime datatype. This feature is convenient, especially for beginners who do not know how to create a Date table or create and manage hierarchies. However, it can consume too much storage, potentially leading to severe performance issues. As mentioned earlier, the auto date/time feature forces Power BI Desktop to create Date tables for every single DateTime column within the model. The Date tables have the following columns:

  • Date
  • Year
  • Quarter
  • Month
  • Day

The last four columns create date hierarchies for each DateTime column. The Date column in the created Date table starts from January 1 of the minimum year of the related column in our tables. It ends on December 31 of the maximum number of years for that column. It is a common practice in data...

Summary

This chapter taught us some common best practices for working with Star Schema and data modeling. We learned how to implement many-to-many relationships. We also learned how and when to use bidirectional relationships. Then we looked at disabled relationships and how we can programmatically enable them. We also learned about config tables and how they can help us with our data visualization. We then discussed why and when we should avoid using calculated columns. Next, we looked at some techniques to organize the data model. Finally, we learned how we could reduce the model size by disabling the Auto date/time for new files feature in Power BI Desktop.

In the next chapter, Advanced Modeling Techniques, we will discuss some exciting data modeling techniques that can boost our Power BI model performance while creating complex models. See you there.

Join us on Discord!

Join The Big Data and Analytics Community on the Packt Discord Server!

Hang out with 558 other...

lock icon
The rest of the chapter is locked
You have been reading a chapter from
Expert Data Modeling with Power BI - Second Edition
Published in: Apr 2023Publisher: PacktISBN-13: 9781803246246
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Author (1)

author image
Soheil Bakhshi

Soheil Bakhshi is the founder of Data Vizioner and is a sought after BI consultant. Working in data and analytics for more than 20 years, Soheil's experience lies in Microsoft BI, Data Warehousing, and Power BI platform. He possesses MSCE, MCSA certifications and is a Microsoft MVP (Most Valuable Professional). He has a passion for sharing knowledge via his website and speaking at conferences and Power BI community events locally and globally. In following his desire for simplicity and efficiency, he is behind Power BI community tools and commercial products such as Power BI Exporter and Power BI Documenter.
Read more about Soheil Bakhshi