
Data Modeling with Microsoft Excel

By Bernard Obeng Boateng
About this book
Microsoft Excel's BI solutions have evolved, offering users more flexibility and control over analyzing data directly in Excel. Features like PivotTables, Data Model, Power Query, and Power Pivot empower Excel users to efficiently get, transform, model, aggregate, and visualize data. Data Modeling with Microsoft Excel offers a practical way to demystify the use and application of these tools using real-world examples and simple illustrations. This book will introduce you to the world of data modeling in Excel, as well as definitions and best practices in data structuring for both normalized and denormalized data. The next set of chapters will take you through the useful features of Data Model and Power Pivot, helping you get to grips with the types of schemas (snowflake and star) and create relationships within multiple tables. You’ll also understand how to create powerful and flexible measures using DAX and Cube functions. By the end of this book, you’ll be able to apply the acquired knowledge in real-world scenarios and build an interactive dashboard that will help you make important decisions.
Publication date:
November 2023
Publisher
Packt
Pages
316
ISBN
9781803240282

 

Getting Started with Data Modeling – Overview and Importance

Think of how a business plan lays out the written roadmap for companies to understand and make sense of all the moving parts of their business: the drivers, resources, and processes required to achieve success. This plan often serves as the manual companies consult to understand how all the pieces of the business puzzle fit together.

In the same way, large and complex datasets require a structure or a blueprint that allows data analysts to visualize how different data points can be structured and connected to deliver insights for action or decision making.

This underscores the significance of data modeling in the field of data analytics, and it is precisely where data modeling in Microsoft Excel proves invaluable.

In this first chapter of the book, we will break down the concept of data modeling within and beyond Microsoft Excel. The chapter will cover the advantages of using a data model to manage multiple sources of data. You will go on to understand some practical use cases on how to use the data model to look up and reference related tables and understand the architecture and features of Power Pivot, the engine for data modeling in Microsoft Excel. Throughout the journey, best practices will be highlighted and covered.

At the end of the chapter, you will be in a good position to understand how data modeling can help you connect and manage datasets from multiple sources to deliver insights quickly and efficiently in your data analytics project.

The following topics will be covered in this chapter:

  • Understanding the concept of data modeling
  • The importance of a data model in Microsoft Excel
  • Practical use cases for a data model
  • Introduction to Power Pivot in Excel
  • Best practices with Power Pivot
 

Understanding the concept of data modeling

Data modeling is the process of structuring and organizing data in a way that it can be easily analyzed and reported. Think of it like arranging books in a library. If you just threw all the books into a room, it would be hard to find what you need. But if you categorize them by genre, author, or publication date, it becomes much easier to locate a specific book.

Similarly, data modeling helps in organizing data so that you can easily derive insights from it.

Just as a business plan serves as a blueprint for a company, a data model acts as a blueprint for the relationships between different datasets. The activity of creating and visualizing those relationships is known as data modeling.

It serves as the backbone for your visuals and calculations, allowing for more complex data analysis. A data model gives you a visual or conceptual view of how the datasets you are working with connect to produce the results or insights you need. Getting it right can be the difference between well-optimized data analytics and analytics filled with redundant data that offers little insight.

Microsoft offers the following definitions for a data model in Excel and Power BI:

  • A data model allows you to integrate data from multiple tables, effectively building a relational data source inside an Excel workbook.
  • Data modeling is the process of analyzing and defining all the different data types your business collects and produces, as well as the relationships between those bits of data. By using text, symbols, and diagrams, data modeling concepts create visual representations of data as it’s captured, stored, and used in your business. As your business determines how data is used and when, the data modeling process becomes an exercise in understanding and clarifying your data requirements.

In Excel, a data model can help you connect to one or many tables and summarize the data with PivotTables.

Figure 1.1 – Comparing a one-table analysis to multiple-table analysis

Beyond Excel, the concept also applies to other analytics tools and database management systems, such as Power BI, Access, and Oracle.

With a data model, analyzing your data becomes easier because you can clearly define each dataset, the role it plays, and how it connects to other datasets to give you the results you need.

Comparing a one-table analysis to multiple-table analysis in Microsoft Excel

Often, we store our data in a range of cells in Microsoft Excel. Converting that range into a table makes it easier to reference the dataset for calculations and for further analysis using a PivotTable, because you can refer to the table and its columns by name. This is called structured referencing. With a cell inside the range selected, you can insert a table in Excel by going to Insert > Table in the ribbon or simply pressing Ctrl + T.

When data is stored in a table, simple aggregations such as SUM, AVERAGE, and COUNT can be performed using the table name and the column. For instance, summing sales from a table named Table1 can be simply done using =SUM(Table1[Sales]).

Data in the table can also be used in a PivotTable. This way, when the source data changes with the addition of more rows or columns, the PivotTables automatically update with the new data in the table when it is refreshed. This avoids the need to update the source reference of cells in the PivotTable.

Most Excel users tend to store all their data in one table for their analysis. This can be referred to as One-Table Analysis. There is nothing wrong with this approach. However, if the data you are working with grows and you have a situation where you need to add other tables to your analysis, it can become complex with just one table and a PivotTable.

Creating a data model in Power Pivot in Excel allows you to have access to multiple tables for your analysis without the need for complex lookup formulas. It improves performance and gives you a clear overview of how the tables relate.

Here are some of the key advantages of using a data model in Power Pivot:

  • It gives you a broad overview of your datasets or tables. This ensures that all the tables and datasets you require in your model are accurately captured. Take a look at the following example data model for a sales report.
Figure 1.2 – A Diagram view of an example sales report data model

You’ll realize that even though there are several tables used in the creation of the final dashboard, the data model gives a good overview of how each table connects and contributes to delivering the final results.

  • It is an abstract representation of the real-world situation you are analyzing. With the data model, you are in a good position to generate accurate measures and calculations for the KPIs in your report.
  • The data model helps reduce the occurrence of redundant data, that is, the repetition of the same data at different points in your dataset. This helps maintain performance as your data grows.
  • The data model can also be a good blueprint for developing web or frontend applications for your dataset. For example, PowerApps, AppSheet, Caspio, and Squirrel are some of the applications that can benefit from a well-designed data model.

Most of these are low-code tools that use data models as a blueprint to create interactive apps for users. The data model then becomes an indirect way for developers to document the data that will be required to build these apps.

So far, we have covered what a data model is and why you should consider using one: it lets you structure datasets that are split into related tables, connect them, and visualize those connections so that your analysis is both efficient and insightful.

In the following section, we will look at some practical use cases of a data model. We will look at the case of an accountant and a salesperson and see how data models can help reduce the efforts and processes required in analyzing data.

Practical use cases for a data model

This section explores practical use cases of data models in various workplace scenarios.

The accountant

Mr. Owusu Yeboah is a chartered accountant. He enters his accounting records in the Journal tab, a table he has created in Microsoft Excel to record the Date, Description, Amount, Debit Account, and Credit Account of all transactions.

Figure 1.3 – Journal showing accounting entries

In another worksheet named COA, he has a table containing his chart of accounts with account codes, sorted to classify the various accounts into assets, liabilities, equity, revenue, and expenses. The other columns in his chart of accounts describe how each account has to be treated to produce a monthly and an annual financial statement.

Figure 1.4 – Sample chart of accounts

For Mr. Owusu Yeboah to determine the ins and outs of each account or create a trial balance, he would need to use a lot of lookup formulas to connect the two tables. Aside from this, when new data is added to the tables, he must manually update all his workings to capture the new entries. Using Excel tables to store data is one way to avoid manually updating calculations when your data changes.

How does a data model help in this situation?

Using a data model, Mr. Owusu Yeboah can load the two tables and connect them using the columns they have in common. These common columns establish the relationship between the tables and make it possible to create a data model. He can then add a calendar table to help him create month-on-month or annual financial statements.
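To make the idea of a relationship on a common column concrete, here is a minimal Python sketch. It is not how Power Pivot works internally, and the account codes, names, and journal rows are hypothetical, invented purely for illustration. It shows how a shared account code lets each journal entry find its matching row in the chart of accounts, so that a simple trial balance can be produced without writing a lookup formula per row.

```python
from collections import defaultdict

# Hypothetical chart of accounts: account code -> (account name, account class).
chart_of_accounts = {
    "1000": ("Cash", "Assets"),
    "4000": ("Sales Revenue", "Revenue"),
    "5000": ("Rent Expense", "Expenses"),
}

# Hypothetical journal rows:
# (date, description, amount, debit account code, credit account code).
journal = [
    ("2023-01-05", "Cash sale", 500.0, "1000", "4000"),
    ("2023-01-10", "Office rent", 200.0, "5000", "1000"),
    ("2023-02-02", "Cash sale", 300.0, "1000", "4000"),
]


def trial_balance(journal_rows, accounts):
    """Return {account name: total debits minus total credits}.

    The account code is the common column that relates the two tables.
    """
    balances = defaultdict(float)
    for _date, _description, amount, debit_code, credit_code in journal_rows:
        balances[accounts[debit_code][0]] += amount
        balances[accounts[credit_code][0]] -= amount
    return dict(balances)


balances = trial_balance(journal, chart_of_accounts)
for name, net in sorted(balances.items()):
    print(f"{name:15} {net:10.2f}")
```

Because every entry debits one account and credits another by the same amount, the balances in this sketch always sum to zero, which is the basic check a trial balance provides.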

A calendar table in Excel is a special table with a series of sequential dates that helps you keep track of dates and times in your data. It’s great for looking at things such as sales or expenses by day, month, or year. If your data is missing information for certain dates, a calendar table makes it easy to spot those gaps so you can fill them in. This ensures you’re not missing out on important details when making decisions.

In addition to helping Mr. Owusu Yeboah sort and analyze his data over time, a calendar table makes sure that all the date information in his various tables lines up correctly. This helps him avoid mistakes and makes it easier to combine different sets of data. It also lets Excel perform more advanced calculations for him, such as figuring out his total sales for each month or calculating averages over specific time periods.
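As a rough illustration of what a calendar table contains, the following Python sketch builds one row per day between the earliest and latest transaction dates and then lists the dates that have no transactions. The column names (Date, Year, Month) and the sample dates are assumptions made for this example; the calendar feature in Power Pivot generates its own set of columns.

```python
from datetime import date, timedelta


def build_calendar(start, end):
    """Return one row per day from start to end, inclusive."""
    rows = []
    day = start
    while day <= end:
        rows.append({"Date": day, "Year": day.year, "Month": day.month})
        day += timedelta(days=1)
    return rows


# Hypothetical transaction dates with a gap on 1 February 2023.
transaction_dates = [date(2023, 1, 30), date(2023, 1, 31), date(2023, 2, 2)]

calendar = build_calendar(min(transaction_dates), max(transaction_dates))

# Dates that exist in the calendar but not in the data are the gaps.
seen = set(transaction_dates)
missing_dates = [row["Date"] for row in calendar if row["Date"] not in seen]

print(len(calendar), "calendar rows")
print("Missing dates:", missing_dates)
```

The key property is that the calendar has no gaps of its own, so any date absent from the data stands out when the two are compared.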

His data model will look something like the following screenshot:

Figure 1.5 – A screenshot of a data model with accounting data

This will help him easily capture new information in the journal and chart of accounts and create a dynamic financial statement for his users.

The salesperson

Ferdinand Attobra is a sales executive with Finex online electronics shop. Every day, he is required to create a report for his supervisors that captures the top-performing products, branches, and customers.

Figure 1.6 – Sales transactions

To create his report, he downloads four datasets from his sales software:

  • Transactions: This captures all the revenue as well as the cost of sales per transaction. The table also has fields that identify the customer, product, and store information related to each transaction. This is represented by Customer ID, Product ID, and Store ID.

    Apart from the Transactions table, there are three other tables he uses to look up the details of each customer, product, or store that appeared in the Transactions table.

Figure 1.7 – Sample lookup tables

  • Customers: This table has the unique details of all the shop’s customers’ IDs, their names, and their customer segments.
  • Products: This table contains the unique details of the product IDs, their categories, sub-categories, and their names.
  • Location: This table contains the details of each store ID, the city, region, and country.

The challenge Ferdinand faces in creating his report is how he can use the various IDs stored in the Transactions table to look up the customer, product, and store involved in each transaction.

How does a data model help in this situation?

Using a data model, Ferdinand can load the Customers, Products, and Location tables and connect them to the Transactions table using the Customer ID, Product ID, and Store ID columns respectively. A calendar table, although supplemental, is very useful here and would be connected to the Transactions table as well. He will then use this model to generate his daily reports and analyze sales by Product, Geography, Customer, and Date.
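The following Python sketch illustrates the same idea outside Excel. It assumes the lookup tables are keyed on the ID columns described for the Transactions table (Customer ID, Product ID, and Store ID), and all rows, names, and amounts are hypothetical. Once each transaction can reach its customer, product, and store through an ID, the same totals function answers questions by segment, category, or city without any per-row lookup formula.

```python
from collections import defaultdict

# Hypothetical lookup tables, each keyed by its ID column.
customers = {
    "C1": {"Name": "Customer A", "Segment": "Consumer"},
    "C2": {"Name": "Customer B", "Segment": "Corporate"},
}
products = {
    "P1": {"Name": "Laptop", "Category": "Computers"},
    "P2": {"Name": "Phone", "Category": "Mobile"},
}
locations = {
    "S1": {"City": "Accra", "Region": "Greater Accra"},
    "S2": {"City": "Kumasi", "Region": "Ashanti"},
}

# Hypothetical transactions: only IDs and amounts are stored here.
transactions = [
    {"Customer ID": "C1", "Product ID": "P1", "Store ID": "S1", "Revenue": 1200.0},
    {"Customer ID": "C2", "Product ID": "P2", "Store ID": "S2", "Revenue": 800.0},
    {"Customer ID": "C1", "Product ID": "P2", "Store ID": "S1", "Revenue": 700.0},
]


def revenue_by(rows, lookup, key_column, attribute):
    """Total revenue grouped by an attribute found in a related lookup table."""
    totals = defaultdict(float)
    for row in rows:
        related = lookup[row[key_column]]  # follow the relationship
        totals[related[attribute]] += row["Revenue"]
    return dict(totals)


by_segment = revenue_by(transactions, customers, "Customer ID", "Segment")
by_category = revenue_by(transactions, products, "Product ID", "Category")
by_city = revenue_by(transactions, locations, "Store ID", "City")

print(by_segment)
print(by_category)
print(by_city)
```

Note that the descriptive details live only in the lookup tables; the Transactions table repeats nothing but the IDs, which is what keeps it compact as it grows.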

The model will look like the following screenshot:

Figure 1.8 – A screenshot of a data model showing sales data

From the two case studies, we can appreciate that using Excel’s data model can help us overcome some of the typical challenges in our routine office work.

Excel’s data model allows you to integrate data from multiple sources in an efficient manner. The diagram that shows the tables in the model and the relationships between them is called an Entity Relationship Diagram (ERD).

Figure 1.9 – Sample ERD for a sales report in Excel

Apart from this key advantage, the data model can also do the following:

  • Store and analyze data beyond the 1,048,576-row limit of an Excel worksheet. This brings a whole new capability to regular Excel.
  • Create more powerful formulas to help you analyze your data more efficiently.
  • Work together with tools such as Power Query to transform and shape your data while maintaining a dynamic connection to your data sources.

In the next topic, we will dive into the main tool for data modeling and explore some best practices to help you get more insights from your datasets.

 

Introduction to Power Pivot, Excel versions, and installation

Power Pivot is the main authoring tool for data models in Microsoft Excel.

Power Pivot allows you to load large volumes of data from various sources, perform more powerful calculations, and create insights easily from your datasets.

Power Pivot was first released as a downloadable add-in for Excel 2010. In Excel 2013, it is included with certain editions, and Excel 2016 and more recent versions have the add-in already available in-app, so it only needs to be enabled.

Power Pivot was inspired by Microsoft SQL Server Analysis Services (SSAS) to ultimately make self-service business intelligence possible for regular Excel users. This means a novice Excel user can still crunch key insights from datasets directly in Excel.

The key features of Power Pivot include the following:

  • An in-memory engine that compresses large datasets, making it possible to load data beyond Excel’s typical capacity
  • A diagram view that makes it easy to manage relationships and create hierarchies in your data model
  • A dynamic date table feature that allows you to create automatic date dimensions for your dataset
  • A powerful calculation engine for calculations using Data Analysis Expressions (DAX), the native calculation language for Power Pivot

Now that we have a good idea about Power Pivot, the next section looks at where to find this tool and how to enable it in Microsoft Excel.

 

How do I install Power Pivot?

To install or enable Power Pivot in Excel, please go through the following steps:

  1. Open a new Excel workbook and go to the Data tab:
Figure 1.10 – Enabling the Data tab in Microsoft Excel

  2. In the Data Tools group, go to the Power Pivot window:
Figure 1.11 – Enabling the Power Pivot tab in Microsoft Excel

  3. If this is the first time you are using Power Pivot, you will see the following pop-up message:
Figure 1.12 – Pop-up message while enabling Power Pivot

  4. Click on Enable. After a few seconds, the Power Pivot window will open to confirm that the installation was successful.
Figure 1.13 – Enabling the Power Pivot Tab in Microsoft Excel

  5. You will find a new Power Pivot command tab on your ribbon when the process is completed.
Figure 1.14 – Process is complete

You should find the tab present anytime you open a new workbook.

There are situations where the Power Pivot tab is not available when you open a new workbook. This could be because of low disk space or memory issues with the computer. A quick way to resolve this is to restart your computer or free up some disk space, and then follow these steps:

  1. Go to File | Options | Add-ins, select COM Add-ins, and click on Go.

    This will display the following screen:

Figure 1.15 – Resetting the Power Pivot tab in Microsoft Excel

  2. Uncheck and then recheck the Power Pivot box to reset the tab. You should find it available in the Command tabs area again.

We have now installed Power Pivot. In the next section, we will take a tour to understand how we can take full advantage of some of the features of the tool for our data modeling.

Exploring the features of Power Pivot

In this section, we are going to explore some of the key features of Power Pivot. It’s important you begin learning about these features to help you use and apply them when we start working with data.

Figure 1.16 – Components of Excel’s Power Pivot

Some of the useful features of Power Pivot are described here:

  • Command tabs: Here, you will find the Home and Design tabs. The Home tab contains a group of icons for the following:
    • Formatting
    • Calculations
    • Sorting and filtering
    • Views (data and diagram view)
    • Connecting to data sources (get external data)
  • The Design tab contains icons for managing the following:
    • Columns
    • Calculations
    • Relationships
    • Creating calendars
  • Formula bar: This displays the formulas for your calculated columns and measures when you select them. You can also use the field to create formulas from scratch.
  • Views: The View group under the Home tab is useful for switching between a tabular view of your datasets and a diagram view. You can also use this menu to turn off some aspects of Power Pivot.
  • Calculated Column: This area helps you to calculate and add new columns to your original datasets.
  • Calculation Area: You can create your measures and store them in this section of Power Pivot. You can turn this section off using the option in the View group.
  • Data view: This view is similar to the worksheet view in Microsoft Excel. However, in Power Pivot, you can’t edit cells or create calculations by referencing cells. Calculations work on whole columns of the data and are written in a formula language called DAX.

What is DAX?

Think of DAX as a more powerful version of the regular Excel formulas you might already know, such as SUM or AVERAGE. DAX allows you to do more complex things with your data, such as summing up sales for a specific time period or calculating year-over-year growth, all while working within your data model.

So, if you’re using a data model in Excel to help make sense of your business data, DAX is the tool that helps you ask specific questions and get precise answers from that model. It’s like having a super-smart calculator that can quickly crunch the numbers in different ways, helping you make better business decisions. A DAX calculation can result in either a new calculated column or a new measure. We will go into this in detail in subsequent chapters.
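To show the kind of question a measure answers, without introducing DAX syntax yet, here is a small Python sketch of the two calculations mentioned above: total sales for a specific period and year-over-year growth. The sales figures and function names are hypothetical and only illustrate the arithmetic; the DAX equivalents are covered later in the book.

```python
# Hypothetical sales rows: (year, amount).
sales = [
    (2022, 100.0),
    (2022, 150.0),
    (2023, 200.0),
    (2023, 175.0),
]


def total_sales(rows, year):
    """Sum the amounts for one year (a simple period filter)."""
    return sum(amount for row_year, amount in rows if row_year == year)


def yoy_growth(rows, year):
    """Growth of a year over the previous year, as a fraction.

    Returns None when there are no prior-year sales to compare against.
    """
    previous = total_sales(rows, year - 1)
    if previous == 0:
        return None
    return (total_sales(rows, year) - previous) / previous


print(total_sales(sales, 2023))
print(yoy_growth(sales, 2023))
```

A measure behaves in a similar spirit: it is defined once and recalculated for whatever period or category is being filtered in the report.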

Beyond understanding the features of Power Pivot, it is important to adopt some best practices when working with this tool. In the next section, we will cover some of these best practices.

 

Best practices with Power Pivot

To get the best out of your Power Pivot and data model, there are some best practices you need to adopt to ensure optimum performance. We discuss some of these best practices here:

  • Ideally, all datasets that are added to the data model should be named tables. This makes it easy to identify the tables when creating your DAX formulas.
  • Update your source data to limit the number of columns and rows you import into Power Pivot. This will improve performance and give you a better response for your calculations. You can achieve this by normalizing your data. We will discuss this in the next chapter.
  • Avoid creating calculations that shape and transform your data in Power Pivot. Do all the data transformation and shaping in Power Query first, and then load the result into Power Pivot. We will discuss Power Query in detail later in the book.
  • Use the Diagram view in the View group to get an overview of your datasets and how they connect to each other, and use the Data view to audit or explore the content of each dataset.
  • Ensure that the data type in each column is consistently formatted. For example, a column that contains dates should not have text input.

Sticking to these rules will greatly improve the performance of Power Pivot.

 

Summary

The objective of this chapter was to help you understand the concept of data modeling. We have covered the key advantages of using a data model in analyzing large and complex datasets. The chapter introduced you to tables, PivotTables, and Power Pivot and how the data model you create in Power Pivot helps you analyze data from multiple table sources. To help you put this in context, we looked at two practical use cases of a data model for an accountant and a salesperson. This should help bring the concept home and help you apply it to any dataset you analyze at work.

After reading this chapter, you are now also able to identify the key components of Power Pivot, the main authoring tool for data modeling in Microsoft Excel. In this chapter, we also covered some best practices with a data model to help you improve the performance of Power Pivot.

In the next chapter, we will see best practices for laying out data. The chapter will help you further improve the performance of your Power Pivot calculations for large datasets.

 

Questions for discussion

  1. Name five features of Power Pivot and the role they play in data modeling.
  2. What is DAX?
  3. List the key advantages of a data model in analyzing your work.
About the Author
  • Bernard Obeng Boateng

    Bernard Obeng Boateng is a Microsoft Excel MVP and Microsoft Certified Trainer with over 10 years of work experience in Banking, Insurance and Business Development. He is the founder of Finex Skills Hub, an approved Training Provider of the Financial Modeling Institute, Canada. Finex Skills Hub runs the Finex Project in Ghana, a pro bono training outreach program for students in Data Analytics and Financial Modeling. Bernard also provides consultancy services for SMEs (start-ups and existing) in Financial Management, Business Planning and Research. He has an active audience online, with about 17,000 followers on his LinkedIn account, where he shares tips and tricks on Microsoft Excel and other Office Apps.
