Reader small image

You're reading from  Data Modeling with Microsoft Excel

Product typeBook
Published inNov 2023
PublisherPackt
ISBN-139781803240282
Edition1st Edition
Right arrow
Author (1)
Bernard Obeng Boateng
Bernard Obeng Boateng
author image
Bernard Obeng Boateng

Bernard Obeng Boateng is a Microsoft Excel MVP, Microsoft Certified Trainer, with over 10 years work experience in Banking, Insurance and Business Development. He is founder of Finex Skills Hub an approved Training Provider of the Financial Modeling Institute, Canada. Finex Skills Hub runs the Finex Project in Ghana, a pro bono student training outreach program for students in Data Analytics and Financial Modeling. Bernard also provides consultancy services for SMEs (start-ups and existing) in Financial Management, Business Planning and Research. He has an active audience online with about 17,000 followers on his LinkedIn Account. where he shares tips and tricks on Microsoft Excel and other Office Apps.
Read more about Bernard Obeng Boateng

Right arrow

Preparing Your Data for the Data Model – Cleaning and Transforming Your Data Using Power Query

Having understood the best practices for data structuring in our earlier chapters, we will take a look at how to prepare our data for data modeling using Power Query—Microsoft Excel’s data cleaning and transformation tool.

In this chapter, you will understand the architecture of the Power Query editor and data types and how to use the Transform and Add Column tabs to correct any issues with the sample sales data. You will also learn how to append and merge data from our sample sales data.

The following topics will be covered in this chapter:

  • Understanding queries and connections
  • An overview of the Power Query editor
  • Getting your data type right
  • Add Column or Transform?
  • Merging and appending data using Power Query

Understanding connections and queries

Before we begin, it is important for us to understand the typical data analysis process in Microsoft Excel and the tools you can use at each stage. Data modeling, which is the main topic of this book, is part of the process and can be done with the Power Pivot tool.

However, before data gets to the data modeling stage, we may have to shape and transform it to make it easier to analyze. Not all data comes in a format ready for the data model. There are situations where you need to remove unwanted characters, split or merge columns, transpose the data, or append or merge it to make it fit to be loaded into the data model.

Microsoft Excel has a dedicated tool for this called Power Query. Power Query is Excel’s data transformation and preparation engine. The tool allows you to connect to many data sources (including CSV files, the web, folders, and databases) and shape and transform the data and load it in different forms for further analysis...

An overview of the Power Query editor

We will use our sales data case study to understand the various components of the Power Query editor.

The files we are going to use are in a folder that can be accessed using this link: https://tinyurl.com/DMEFINEX

Use this link to download a zipped file that contains one folder and five Excel worksheets as shown here:

Figure 3.3 – The list of files in a zipped file

Figure 3.3 – The list of files in a zipped file

Extract the files to get access to the files.

When you open the folder, you should see a folder named Main Transaction that contains six files. These files are sales records from 2015 to 2020 for our sales organization, Finex Ventures.

There are five other files outside the folder, which contain information on the following:

  • Customers
  • Locations
  • Products
  • Stores
  • Product returns

As we covered in an earlier chapter, this is normalized data. This data structure separates the main activities of the business from...

Getting your data type right

When you create your query, one of the most important steps in the transformation process is to ensure that you have the correct data type for each column. At the top-left corner of each column, there is an icon that allows you to change the data type. The following data types are available:

Figure 3.8 – List of data types in Power Query

Figure 3.8 – List of data types in Power Query

Based on the settings in your Power Query, some data transformations are automatically done for some columns. When this happens, Power Query records this as a step in the Applied Steps list of the editor. An example of this is shown in the following figure:

Figure 3.9 – Automatic transformation using Changed Type

Figure 3.9 – Automatic transformation using Changed Type

In this example query, all our data types have been applied correctly apart from Birthdate. The correct format for the content of that column should be a date.

To do this, we do the following:

  1. Select the format icon on the column...

Add Column or Transform?

Before we bring in our sales data, let us go back to our customer data query for one more transformation. This example will help us understand the key differences between performing a task with the Transform and Add Column tabs. In the customers query, we have the names of our customers in two columns. We want to merge these names into one column. Let’s do this from the Add Column tab to see the results we will get.

To do this, follow these steps:

  1. Go to the Add Column tab.
  2. Select the two columns First Name and Last Name.
  3. Click on Merge Columns in the Add Column tab.
Figure 3.18 – Merging columns under the Add Column tab

Figure 3.18 – Merging columns under the Add Column tab

  1. This brings up a dialog box that requires a separator and a name for our new column.
Figure 3.19 – Selecting a separator in Merge Columns

Figure 3.19 – Selecting a separator in Merge Columns

You can select Space for the separator and Full Name for the new column name.

When you click...

Merging and appending data using Power Query

In Power Query, there are two main methods to combine data: merging and appending. When you combine to add more rows, you are appending. This is typically in situations where the tables you are combining have the same columns. When you combine to add extra columns to an existing query, you are merging. This is typically in situations where the two tables have at least one common column. The following figures summarize the difference between the two:

Figure 3.23 – Appending queries

Figure 3.23 – Appending queries

Figure 3.24 – Merging queries

Figure 3.24 – Merging queries

With this understanding, let’s proceed and append our sales records into one query:

  1. Because our data is stored in a folder, this time round, we will select Folder when we go to New Source.
Figure 3.25 – Getting data from a folder

Figure 3.25 – Getting data from a folder

  1. Select the folder from where it has been saved on your computer. Remember...

Summary

The chapter introduced you to Power Query, the data transformation tool for extracting, transforming, and loading data to the data model. We now understand the architecture and components of Power Query. In this chapter, we used a sample case study to perform a number of transformations in the Power Query editor. We concluded the chapter by downloading all our queries into the data model.

In the next chapter, we will learn all that we can do with the data model in Power Pivot.

lock icon
The rest of the chapter is locked
You have been reading a chapter from
Data Modeling with Microsoft Excel
Published in: Nov 2023Publisher: PacktISBN-13: 9781803240282
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Author (1)

author image
Bernard Obeng Boateng

Bernard Obeng Boateng is a Microsoft Excel MVP, Microsoft Certified Trainer, with over 10 years work experience in Banking, Insurance and Business Development. He is founder of Finex Skills Hub an approved Training Provider of the Financial Modeling Institute, Canada. Finex Skills Hub runs the Finex Project in Ghana, a pro bono student training outreach program for students in Data Analytics and Financial Modeling. Bernard also provides consultancy services for SMEs (start-ups and existing) in Financial Management, Business Planning and Research. He has an active audience online with about 17,000 followers on his LinkedIn Account. where he shares tips and tricks on Microsoft Excel and other Office Apps.
Read more about Bernard Obeng Boateng