Search icon
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Data Modeling with Microsoft Excel

You're reading from  Data Modeling with Microsoft Excel

Product type Book
Published in Nov 2023
Publisher Packt
ISBN-13 9781803240282
Pages 316 pages
Edition 1st Edition
Languages
Author (1):
Bernard Obeng Boateng Bernard Obeng Boateng
Profile icon Bernard Obeng Boateng

Table of Contents (16) Chapters

Preface 1. Part 1: Overview and Introduction to Data Modeling in Microsoft Excel
2. Chapter 1: Getting Started with Data Modeling – Overview and Importance 3. Chapter 2: Data Structuring for Data Models – What’s the best way to layout your data? 4. Chapter 3: Preparing Your Data for the Data Model – Cleaning and Transforming Your Data Using Power Query 5. Chapter 4: Data Modeling with Power Pivot – Understanding How to Combine and Analyze Multiple Tables Using the Data Model 6. Part 2: Creating Insightful Calculations from your Data Model using DAX and Cube Functions
7. Chapter 5: Creating DAX Calculations from Your Data Model – Introduction to Measures and Calculated Columns 8. Chapter 6: Creating Cube Functions from Your Data Model – a Flexible Alternative to Calculations in Your Data Model 9. Part 3: Putting it all together with a Dashboard
10. Chapter 7: Communicating Insights from Your Data Model Using Dashboards – Overview and Uses 11. Chapter 8: Visualization Elements for Your Dashboard – Slicers, PivotCharts, Conditional Formatting, and Shapes 12. Chapter 9: Choosing the Right Design Themes – Less Is More with Colors 13. Chapter 10: Publication and Deployment – Sharing with Report Users 14. Index 15. Other Books You May Enjoy

Data Structuring for Data Models – What’s the best way to layout your data?

How do you structure your data for the data model? Getting your data in the right shape and form is important in the data modeling process. In this chapter, we will learn about the three golden rules for data structuring. These rules make it easier to model your data and create your calculations and analysis in Microsoft Excel. You will understand the concept of data normalization and the advantages it has over flat tables for analysis. The chapter will round off with an introduction to Power Query, Microsoft Excel’s data transformation tool. When you encounter data layouts that do not conform to the best practices described in the chapter, Power Query plays a critical role in shaping and transforming the data.

The following topics will be covered in this chapter:

  • Data structuring – understanding the three golden rules
  • Denormalized and normalized data
  • Understanding...

Data structuring – understanding the three golden rules

To get the best out of Power Pivot, we need to lay out our dataset in a way that will improve performance and help us get the right insights from our datasets. The following figure summarizes the ideal layout for a single dataset or table:

Figure 2.1 – The three golden rules for laying out your data in a single dataset

Figure 2.1 – The three golden rules for laying out your data in a single dataset

Let’s look at each of these rules in detail:

  • Rule 1: Each column should have a single data type. As we learned in Chapter 1, Getting Started with Data Modeling, Power Pivot manages data using a columnar structure. It is therefore important to commit one data type to each column of your dataset.

    For example, if you have data for a payroll containing the names of staff members, one column should be dedicated to the names of staff with the same format and no other data type. Apart from this, there shouldn’t be any column in your dataset that contains...

Denormalized and normalized data

Denormalized data combines redundant data into one table while normalized data stores related data in separate tables. Each type has its pros and cons when analyzing data. However, for data modeling, normalized data is ideal. We will go into the process of converting denormalized data into normalized data in the next section.

The following table lists the important differences between normalization and denormalization:

Understanding table relationships

In a data model, relationships refer to the connections between different entities or tables within a database. There are three types of relationships in a data model: one-to-many, many-to-many, and one-to-one. Let’s discuss each type in detail along with some examples.

One-to-many relationship

In a one-to-many relationship, one record in a table is associated with multiple records in another table. This is the most common type of relationship in a database.

For example, a customer may place many orders, but each order is associated with only one customer. In this case, the customer table is on the one side of the relationship, and the order table is on many sides. The relationship is established by creating a foreign key in the order table that refers to the primary key in the customer table.

Another example is the relationship between a department table and an employee table. A department may have many employees, but each employee...

Understanding dimension and fact tables

In a relational database, a fact table is a table that stores quantitative information or facts about a business process or activity, such as sales, inventory, or customer transactions. A fact table typically contains numerical values and foreign keys to link to dimension tables.

On the other hand, a dimension table is a table that stores descriptive information about the objects, events, or entities in a business process or activity. Dimension tables are typically used to provide context and structure to the data in a fact table.

To understand this better, let’s consider an example of a retail business that sells products through its online store. In this example, we can identify the following fact and dimension tables.

Fact table – sales

The sales fact table would contain quantitative information about sales transactions, such as sales revenue, quantity sold, and price. The sales fact table would typically contain...

Understanding the role of primary keys and foreign keys

Keys are very important in a relational database because they help create a relationship between two tables and ensure that each record in a table can be identified in a unique way. When we connect the related tables, the version of the connecting columns in the dimension tables is called the primary key while the version in the fact table is called the foreign key.

In a relational database, the primary key helps us identify a single row in a table. All primary keys must be unique in the columns where they sit. Foreign keys help connect our dimension and fact tables and, more importantly, make our fact tables accessible in our calculations. Foreign keys are typically not unique.

Going back to our mock view of our relational database, we can add primary keys and foreign keys, as shown here:

Figure 2.8: Assigning primary keys and foreign keys

Figure 2.8: Assigning primary keys and foreign keys

In our data model, these primary keys will be linked...

Summary

This chapter has introduced you to best practices in structuring your data for data modeling. In the real world, the data produced from most relational database management systems is already normalized. However, there will be instances where you would need to convert denormalized data into normalized data to improve calculations and reduce redundancy. After reading this chapter, you are now able to appreciate the steps you need to take to achieve this. The chapter also covered the key differences between primary keys and foreign keys and the role they play in our data model.

In the next chapter, we will learn about Power Query and how we can use this tool to shape and transform data for our data model.

Key terms in this chapter

  • Data redundancy
  • Normalization
  • Denormalization
  • Primary keys
  • Foreign keys

Questions for discussion

  • What are the key considerations for laying out data properly for the data model?
  • When is it best to consider performing data normalization?
  • List the key advantages normalized data has over denormalized data.
lock icon The rest of the chapter is locked
You have been reading a chapter from
Data Modeling with Microsoft Excel
Published in: Nov 2023 Publisher: Packt ISBN-13: 9781803240282
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €14.99/month. Cancel anytime}

Criteria

Normalized Data

Denormalized Data

Definition

Data is organized in such a way that each piece or dimension of the data is stored in only one place in separate tables.

Data is organized in such a way that multiple pieces of information are stored together in one place.

Duplication

There is minimal duplication...