You're reading from Expert Data Modeling with Power BI - Second Edition

Product typeBook

Published inApr 2023

PublisherPackt

ISBN-139781803246246

Edition2nd Edition

Concepts

Business Intelligence

Author (1)

Soheil Bakhshi

Star Schema and Data Modeling Common Best Practices

In the previous chapter, we learned about data modeling components in Power BI Desktop, including table and field properties. We also learned about featured tables and how they make dataset tables accessible across an organization. We then learned how to build summary tables with DAX. We also looked at the relationships in more detail; we learned about different relationship cardinalities, filter propagation, and bidirectional relationships. In this chapter, we look at some star schema and data modeling best practices, including the following:

Dealing with many-to-many relationships
Avoiding bidirectional relationships
Dealing with inactive relationships
Using configuration tables
Avoiding calculated columns when possible
Organizing the model
Reducing model size by disabling auto date/time

In this chapter, we use the Chapter 9, Star Schema and Data Modeling Common Best Practices...

Dealing with many-to-many relationships

In the previous chapter, Chapter 8, Data Modeling Components, we discussed different relationship cardinalities. We went through some scenarios to understand one-to-one, one-to-many, and many-to-many relationships. We showed examples of creating a many-to-many relationship between two tables using non-key columns.

While creating a many-to-many relationship may work for smaller and less complex data models, it can cause severe issues and ambiguities in more complex models. In some cases, we may get incorrect results in totals; we might find some missing values or get poor performance in large models; while in other cases, we may find the many-to-many cardinality very useful.

The message here is that, depending on the business case, we may or may not use many-to-many cardinality; it depends on what works best for our model to satisfy the business requirements.

For instance, the many-to-many cardinality works perfectly fine in our...

Avoiding bidirectional relationships

One of the most misunderstood and somehow misused Power BI features in data modeling is setting the Cross filter direction to Both. This is widely known as a bidirectional relationship. There is nothing wrong with setting a relationship to bidirectional if we know what we are doing and are conscious of its effects on the data model. We have seen Power BI developers who have many bidirectional relationships in their model and consequently end up with many issues, such as getting unexpected results in their DAX calculations or being unable to create a new relationship due to ambiguity.

The reason that overusing bidirectional relationships increases the risk of having an ambiguous model is filter propagation. In Chapter 8, Data Modeling Components, we covered the concept of filter propagation and bidirectional relationships. We looked at a scenario where the developer needed to have two slicers on a report page, one for the product category and...

Dealing with inactive relationships

In real-world scenarios, the data models can get very busy, especially when we are creating a data model to support enterprise BI; there are many instances where we have an inactive relationship in our data model. In many cases, there are two reasons that a relationship is inactive, as follows:

The table with an inactive relationship is reachable via multiple filter paths.
There are multiple direct relationships between two tables.

In both preceding cases, the xVelocity engine does not allow us to activate an inactive relationship to avoid ambiguity across the model.

Reachability via multiple filter paths

A multiple filter path between two tables means that the two tables are related and can be reached via other related tables. Therefore, the filter propagates from one table to another via multiple hops (relationships). The following diagram shows a data model with an inactive relationship:

Figure 9.14...

Using configuration tables

In many cases, a business wants to analyze some of the business metrics in clusters. Some good examples are analyzing sales by unit price range, analyzing sales by product cost range, analyzing customers by their age range, or analyzing customers by commute distance. In all of these examples, the business does not need to analyze constant values; instead, it is more about analyzing a metric (sales, in the preceding examples) by a range of values.

Some other cases are related to data visualization, such as dynamically changing the color of values when they are in a specific range. An example is to change the values’ color to red in all visuals analyzing sales if the sales value for the data points is less than the average sales over time. This is a relatively advanced analysis that can be reused in our reports to keep visualizations’ color consistent.

In the preceding examples, we need to define configuration tables. In the latter example...

Avoiding calculated columns when possible

Creating calculated columns is one of the most essential and powerful features of DAX. Calculated columns, as the name suggests, are computed based on a formula; therefore, the calculated column values are unavailable in the source systems or the Power Query layer. The values of the calculated columns are computed during the data refresh and then stored in memory. It is important to note that the calculated columns reside in memory unless we unload the whole data model from memory, which in Power BI means when we close the file in Power BI Desktop or switch to other content in the Power BI Service. Calculated columns, after creation, are just like any other columns, so we can use them in other calculated columns, measures, calculated tables, or for filtering the visualization layer.

A common approach of developers is to use calculated columns to divide complex equations into smaller chunks. That is why we suggest avoiding the excessive...

Organizing the model

There are usually several roles involved in a Power BI project in real-world enterprise BI scenarios. From a Power BI development perspective, we might have data modelers, report writers, quality assurance specialists, support specialists, and so on. The data modelers are the ones who make the data model available for all other content creators, such as report writers. So, making a model that is as organized as possible is essential. This section looks at several ways to organize our data models.

Hiding insignificant model objects

One way to keep our model tidier is to hide all insignificant objects from the data model. We often have some objects in the data model that are not used elsewhere. However, we cannot remove them from the data model as we may require them in the future. So, the best practice is to hide all those objects unless they are going to serve a business requirement. The following sections discuss the best candidate objects for hiding...

Reducing model size by disabling auto date/time

When the data is loaded into the data model, Power BI automatically creates some Date tables to support calendar hierarchies for all columns in the DateTime datatype. This feature is convenient, especially for beginners who do not know how to create a Date table or create and manage hierarchies. However, it can consume too much storage, potentially leading to severe performance issues. As mentioned earlier, the auto date/time feature forces Power BI Desktop to create Date tables for every single DateTime column within the model. The Date tables have the following columns:

Date
Year
Quarter
Month
Day

The last four columns create date hierarchies for each DateTime column. The Date column in the created Date table starts from January 1 of the minimum year of the related column in our tables. It ends on December 31 of the maximum number of years for that column. It is a common practice in data...

Summary

This chapter taught us some common best practices for working with Star Schema and data modeling. We learned how to implement many-to-many relationships. We also learned how and when to use bidirectional relationships. Then we looked at disabled relationships and how we can programmatically enable them. We also learned about config tables and how they can help us with our data visualization. We then discussed why and when we should avoid using calculated columns. Next, we looked at some techniques to organize the data model. Finally, we learned how we could reduce the model size by disabling the Auto date/time for new files feature in Power BI Desktop.

In the next chapter, Advanced Modeling Techniques, we will discuss some exciting data modeling techniques that can boost our Power BI model performance while creating complex models. See you there.

Join us on Discord!

Join The Big Data and Analytics Community on the Packt Discord Server!

Hang out with 558 other...

The rest of the chapter is locked

You have been reading a chapter from

Expert Data Modeling with Power BI - Second Edition

Published in: Apr 2023Publisher: PacktISBN-13: 9781803246246

A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.

undefined

Unlock this book and the full library FREE for 7 days

Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of

Start free trial

Renews at $15.99/month. Cancel anytime

Author (1)

Soheil Bakhshi

Soheil Bakhshi is the founder of Data Vizioner and is a sought after BI consultant. Working in data and analytics for more than 20 years, Soheil's experience lies in Microsoft BI, Data Warehousing, and Power BI platform. He possesses MSCE, MCSA certifications and is a Microsoft MVP (Most Valuable Professional). He has a passion for sharing knowledge via his website and speaking at conferences and Power BI community events locally and globally. In following his desire for simplicity and efficiency, he is behind Power BI community tools and commercial products such as Power BI Exporter and Power BI Documenter.
Read more about Soheil Bakhshi

Personalised recommendations for you

Based on your interests and search pattern

Et al.

Ever wonder why speech recognition systems don't understand the Scottish accent, or what would happen if an astronaut only ate mac 'n' cheese, or other spurious reflections you'd have at a bar? We did, then collated those deliberations into absurd research articles with fake figures and methodologies inspired by even more fictionally absurd studies.

BookAug 2023230 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages4

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages1

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Mastering Tableau 2023

This book is a comprehensive resource to mastering your Tableau skills and becoming a BI expert. As you progress, you will learn how to build advanced dashboards and improve your storytelling to derive key business insight, as well as make you well-versed with advanced functionalities of Tableau in the business intelligence domain.

BookAug 2023684 pages

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages5

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages2

Data Engineering with AWS

Embark on a journey to master data engineering pipelines on AWS! Our book offers a hands-on experience of AWS services for ingesting, transforming, and consuming data. Whether you're an absolute beginner or someone with basic data engineering experience, this guide is an indispensable resource.

BookOct 2023636 pages5

Modern Data Architecture on AWS

Every organization wants an agile, performant, and cost-effective data platform that meets all their current and future business needs. Purpose-built AWS analytics services and their features play a big part in building such a modern data platform. This book brings to you all the design and architectural patterns that’ll help you achieve this goal.

BookAug 2023420 pages5

Practical Guide to Applied Conformal Prediction in Python

Discover the power of Conformal Prediction with the "Practical Guide to Applied Conformal Prediction in Python." Master the latest techniques to quantify uncertainty in machine learning and computer vision models, and seamlessly apply them to your industry applications.

BookDec 2023240 pages

TinyML Cookbook

With over 70 project-based recipes, the TinyML Cookbook is a practical guide that will help you to get the most out of your microcontrollers. It provides a comprehensive understanding of the theoretical foundations while giving you hands-on experience training ML models for deployment on Arduino Nano 33 BLE Sense, Raspberry Pi Pico, and SparkFun RedBoard Artemis Nano microcontrollers.

BookNov 2023664 pages