This chapter is designed as an introduction to tabular modeling by using PowerPivot. It shows the process by which a user imports data into PowerPivot for Excel, creates relationships between the datasets, and then reports on it.
The data used in this chapter is based on the orders of the fictitious bicycle company (named Adventure Works). Our data includes six datasets and they are:
Product list: This shows some generic information about the products being sold (for example, the name, color, and size of the product). Each product is identified by a
product_id
value.Product subcategories list: This shows a subcategory that a product belongs to. The list shows the
product_id
value and the associated subcategory (by ID and Name).Product categories list: This shows the product category that a product belongs to. The list shows the
product_id
value and the associated category (by ID and Name).Orders list: This shows what orders have been placed by customers. The list includes an entry for each product that has been ordered. This data simulates a detailed extract from an operational source system.
Customer list: This gives us information about the customer (for example, their names, countries, and states) by customer number.
Dates list: This simply lists consecutive days defining information such as the month name, year, and half-year period of the date.
The tabular modeling lifecycle revolves around three primary steps. These are:
Getting the data into the model
Defining the relationships among tables
Defining calculations based on business logic
This chapter examines these steps and allows the reader to become familiar with the tabular (PowerPivot) design environment.
An Excel workbook can only contain one tabular model and that one model contains tables of data (which may or may not be related). The first step to create a model is to import data into it. There are many techniques to do this—some techniques have advantages over others but for now, let's only consider the fact that we want to load data that exists in an Excel worksheet into the model.
Tip
The installation instructions for PowerPivot in Excel 2010 are covered in the Appendix, Installing PowerPivot and Sample Databases, of this book.
Open the Excel workbook named SalesBook
which is available from the Packt Publishing website to examine the worksheets within the book. Each sheet contains a dataset for Products
, Subcategories
, Categories
, Customers
, Dates
, and Sales
.
This recipe looks at importing data into the PowerPivot model through linked tables. These are very convenient to use when the data is stored in Excel. Additionally, once the data has been imported into PowerPivot, it retains a connection to the Excel table. This means that, when the data is changed in Excel, it can also be changed in the PowerPivot model.
Let's start by importing the product list. Select the Product List sheet and select cell A1.
Then, on the PowerPivot tab, click on the Create Linked Table button.
A small window will open confirming the data range with a checkbox for table headers. Select the checkbox and press OK.
The PowerPivot window will open and the data from the Product List sheet will be imported. Note that the table appears as a tab which is similar to Excel and is called Table1. Also, note that the PowerPivot window is a separate window than the Excel workbook, so that we can return to Excel.
Repeat this process for all the remaining datasets except
Customers
.
When a linked table is created in PowerPivot, Excel creates a named range in the Excel workbook. This is then linked to the PowerPivot model (note that there is a small chain symbol before each of the tables). Also, note that the tables in Excel are formatted with alternate blue coloring. The named ranges can be viewed in Excel by clicking on the Name Manager button on the Formulas tab.

A table (table range) is actually an Excel feature that PowerPivot utilizes. A table can be defined in Excel, given a meaningful name and then imported into PowerPivot, so that the name of the table in PowerPivot is the same as the named range in Excel.
Ensure that the Customers
sheet is selected in Excel and also any cell in the Customers
data is selected. In the Home tab, click on the Format as Table button, and choose a table style; the style chosen in the following screenshot is a relevant one:

Note that the data is now formatted with alternating colors (based on the selected style). Return to the Name Manager window and double-click the table that relates to the Customers
worksheet. A new window will open allowing you to edit the name, replace the name Table6
with Customers
, and click on OK. The Table6
name is replaced by Customers
in the Name Manager window.

Now, create a linked table in the same manner as we did before and note that the name of the table imported into PowerPivot is Customers
.
Tip
If you want to select an entire table in Excel, simply choose the table name from the Name Box drop-down list in the formula bar in the upper-left corner. This is shown in the following screenshot:

A PowerPivot workbook contains two products that allow the user to analyze data. Firstly, there is the xVelocity in-memory analytics engine (the tabular model) which is a columnar database embedded in the workbook. Secondly, there is a client tool that allows the model to be queried, it also displays the results to the user in the form of a pivot table or pivot chart. In Excel 2010, the client tool was restricted to pivot table functionality (for example, a pivot table or pivot chart). In Excel 2013, the tools set has been extended to include Power View. The important distinction here is that the client tool is used to present the model to the user. This recipe shows how to control the way the model is presented to the user.
This recipe uses the model that has already been created in the prior recipe Creating the model. If this model has not been created, follow the recipe to ensure that the model has been loaded with data.
Start from an existing model within PowerPivot.
Ensure that you are in the PowerPivot window (not Excel), then click on the PivotTable button in the Home Tab.
PowerPivot will switch back to the Excel window and a dialog will prompt for the location of the new pivot table. Select New Worksheet and click on OK.
Excel will now show PowerPivot Field List and a pivot table work area. These are identified by the arrows in the following screenshot. Note that PowerPivot Field List shows tables from the model as nodes, with the fields from the model as children.
Return to the PowerPivot application window and double-click on the
Table1
tab. The nameTable1
will be in a blue background and rename it toProducts
. Repeat this exercise for the other tables (Subcategory
,Category
,Dates
, andSales
). The table names in your PowerPivot window will now look like the following screenshot:Return to Excel and notice that PowerPivot Field List has detected a change in the model and prompts the user to refresh. Click on the Refresh button and note that the changes in the names of the tables are now reflected in the PowerPivot Field List panel.
Before
After
Return to the PowerPivot application window and select the
Products
table. Double-click on theproduct_id
field and enterProduct ID
as the new name. The field will have a blue background when its name can be changed.Return to the PowerPivot window and update the remaining columns to the following names:
Table
Column
New name
Products
product_id
Product ID
Products
product_name
Product Name
Products
colour
Colour
Products
size_range
Size Range
Products
size
Size
Subcategory
subcategory_name
Subcategory
Category
category_name
Category
Dates
date
Day
Dates
year
Year
Dates
month_name
Month
Dates
half_name
Half
Sales
order_number
SO Number
Customers
customer_id
Customer ID
Customers
customer_name
Customer Name
Customers
country_code
Country Code
Customers
state_code
State Code
Return to Excel and refresh PowerPivot Field List. The column names will now display as those that were entered.
Return to the PowerPivot window and select the
Sales
table. Right-click anywhere on theorder_number_line
field and select Hide from Client Tools in the pop-up window. Select all the fieldsproduct_id
,order_date
, andcustomer_id
by clicking-and-dragging the mouse across the three fields and hide these fields too.Tip
PowerPivot mimics Excel in the way that you can select multiple fields by dragging your mouse across several columns (with the left button continually pressed). You can also select the first column, hold the Shift key, and select the final column.
Unlike Excel, multiple columns cannot be selected by using the Ctrl key and selecting multiple fields.
Return to the PowerPivot window, refresh PowerPivot Field List, and expand the
Sales
table. Note that these fields no longer appear in the field list.Add the
Day
field to the pivot by expanding theDates
table and selecting the checkbox next to theDay
field. The column will be automatically added to the rows area of the pivot and will be displayed, as shown in the following screenshot:Return to the PowerPivot window and select the
Day
column. From the format list, select More Dates Formats…, and then select the dd-MMM-yy format from the list of available formats. The value presented will show a formatted sample of the data. Choose the item that shows 14-Mar-01.Return to Excel and refresh the pivot table. Note that the PowerPivot Field List panel may not indicate the change to the model. However, when the pivot table is refreshed, the data displays the new format.
In the
Sales
table, format the columnsunit_price
,unit_cost
,tax
, andtotal_price
as a whole numeric number by selecting the columns and choosing Currency from the Format drop-down list.
The semantic model defines the metadata structure of the model and includes information such as table names, column names, and data presentation formats. The model designer interacts with the semantic model through its presentation layer in a real-time manner (note that the model did not have to be deployed to a server), so that the changes made in the model are immediately available to the user.
The modeling environment behaves in a What You See Is What You Get (WYSIWYG) manner which means that any changes made to the design environment are reflected in the model that is presented to the user.
There are two methods that the model designer can use to examine the structure of the model. So far, we have only examined the data view. The diagram view shows all tables and columns (including hierarchies) that are used within the model and presents them on a design surface. This is shown in the next recipe.
In addition to table names, column names, and data formats, a semantic model defines how tables within a model relate to each other. This relationship is important because it defines the output of calculations (which are defined in the model). This recipe shows how to create relationships and the effect that these relationships have on the model.
This recipe assumes that the model in the recipe Managing the appearance of tables and fields has been created.
The reader should recognize that the model is designed to show sales information by product, date, and customer. This type of modeling scenario is commonly referred to as a star schema and is shown in the following diagram. The Sales
table is referred to as a fact table (since it stores the data facts that we wish to analyze—sales amount, tax amount, and so on) and the other tables are referred to as dimension (subject) tables because they hold descriptive information.

Extending the model further, the Products
table is linked to the Subcategory
table, and the Subcategory
table is linked to the Category
table. This is shown in the following diagram and is sometimes called a snowflake schema, since the dimension tables are not directly connected to the fact table:

An important point to note, is that each dimension table has a unique identifying field, for example, a product can be uniquely identified in the Products
table through the product_id
field. This is commonly referred to as the primary key for the table.
In contrast, the referring column (product_id
in the Sales
table) can have many occurrences of the product_id
field and is commonly referred to as the foreign key.
Start with the workbook that was developed in the prior recipe.
Drag the
Product Name
field onto the rows of the pivot table (under theRow Labels
column) and the Sales columntotal_price
onto values. Your screen should look like the following screenshot:Return to the PowerPivot window and select the
product_id
field and then click on the Create Relationship button (in the Design tab). A new window will open asking you to define the related (lookup) table and column.Select the Products option from the Related Lookup Table drop-down list and
Product ID
from the Related Lookup Column drop-down list. The Create button is now enabled. Click on Create.Return to the pivot table and refresh the model. The values for the Sum of total_price field have updated to reflect the total for each product.
Now, create relationships between the following tables and columns:
Source table
Source column
Related table
Related column
Sales
customer_id
Customers
Customer ID
Sales
order_date
Dates
Day
Tip
Downloading the example code
You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
In the PowerPivot window, click on the Manage Relationships button. A new window will open showing all the relationships that have been built in the model.
Click on the Diagram View button in the Home menu to switch to the diagram modeling view. Your screen will now show tables and columns (rather than data) and look like the following screenshot. Note that previously defined relationships appear as lines that connect tables.
Tip
You can also switch between the data and diagram views by toggling the two buttons at the bottom-right side of the PowerPivot application status bar.
In the pivot table, replace the
Product Name
field with theCategory
field from thecategory
table. The total value ($1,662,869.42
) is repeated for all categories indicating that there is no relationship between theSales
table andCategory
table.From the
Products
table, select theProduct ID
field, and drag it to theproduct_id
field of theSubcateogry
table. A new relationship will be created betweenProducts
andSubcategory
. Use this method to create a relationship between theProducts
table and theCategory
table. Refresh the pivot table to ensure that the total value is not duplicated.In the pivot table, drag the
Product ID
field from theProducts
table into the Values pane. The pivot table will now show the price and number of products for each category and will look like the following screenshot:

The model has been extended to show two things. Firstly, by defining relationships between tables within the model, we have defined the filtering path for the data. This path is used to restrict rows between tables that have a relationship. Secondly, by adding a calculation (Sum of total_price and Count of Product ID), we have created measures that apply an aggregation function to the model fields. These are special types of measures within PowerPivot and are referred to as implicit measures (because the model implicitly defines a calculation for the field).
Relationships define how one table relates to another. In order to define a relationship, the join must occur on a field that has unique values in one of the tables (this is commonly called a primary key). The table that has the field with unique values is commonly called the related table. This can be seen in the diagram view, as shown in the following screenshot with the direction of the arrows on the relationships. Consider the Products
table (which has a unique field product_id
) that is related to the Sales
table (through the product_id
field in that table), but only the Products
table needs to have a unique product_id
. It is also said that the product_id
field relates to many records in the Sales
table. This can be seen by the direction of the arrow between Sales
and Products
, the related table has the arrow pointing towards it.
Relationships are important because they define how data is filtered and calculated when it is presented to the user.

Relationships are the primary mechanisms with the model that are used to filter data and perform calculations. That is, the relationship defines how data is filtered when values are shown to the user. Although this is a new concept, the concept of relationships is important because they have important implications with the way that the model determines what data to show to the user. Consider the pivot table shown in the following screenshot—Subcategory
on rows and Sum of total_price
, Count of Product ID
, and Count of category_id
as measures:

Now, consider the relationship defined in the model. This is summarized in the following screenshot:

The rows in the pivot show the subcategory which defines a filter for each row (that is a filter for each subcategory). This filter can then be applied to the Products
table, which in turn is applied to the Sales
table. It might be better to say that the rows of the Sales
table are filtered by the Products
table and then those rows are filtered by the Subcategory
table. This is why the calculations Sum of total_price
and Count of Product ID
show the correct values. The filter on rows of the Sales
table and rows of the Products
table can be applied in the direction of the arrows of the relationships.
However, this is not the case when Subcategory
is shown with data from the Category
table—a filter will only be applied in the direction that a relationship is created. This is why the calculation Count of category_id
shows the same number for each subcategory. With the subcategory on rows, a filter is created which can filter the Products
table but this filter cannot then applied in an upstream manner to the Category
table.
The application of filters may seem unintuitive at first, especially with a relationship design such as the one among Products
, Category
, and Subcategory
, but in reality the model should be designed so that the filters can be applied in a single direction. There is also, the question of unmatched values between fields used in the relationship and how they are treated by the model. For example, what would happen if we had a product_id
field in the Sales
table that did not have a match in the Products
table? Would this even be allowed in the model? The tabular model handles this situation very elegantly. The model allows this situation (without error), and unmatched values are assigned to a blank placeholder. For example, if there was a product in the Sales
table and no product in the Products
table, it would be shown as blank when Products
, Category
, or Subcategory
is used in the pivot.
We have also indicated that the model automatically created implicit measures. The term measure is common in business intelligence tools to specify that a calculated value is returned. Often, this can be a simple calculation, for example, the count of rows or the sum of a field. The important thing to remember is that measure is a single value that is returned from the model (when the model is filtered). Usually, measures are defined by a model designer, but they need not be. This is the case with an implicit measure. An implicit measure is defined automatically, depending on the data type of the column that is being used. Numeric columns are automatically summed, whereas text columns are automatically counted.
The aggregation function of an implicit measure is initially set by the underlying data type. However, the user can change this within the pivot table by editing the measure in the pivot table. This can be done in the following manner:
Alter the pivot table so that it shows subcategory on rows and
total_profit
as values. By default, the measure will showSum of total_profit
. Right-click on the measure and select Edit Measure… from the pop-up window.A new window will open, displaying the aggregation function with the measure. Select Average from the function list and change the measure name to
Average Profit
. The Measure Settings window should look like the following screenshot:Also, notice that a formula is used to define the measure, for
Average Profit
, the formula is=AVERAGE('Sales'[total_profit])
. Click on the OK button and note that the pivot table now contains the new measureAverage Profit
.
Implicit measures that have been created in the model can be seen by exposing the measures in the Advanced tab of the PowerPivot window (the Advanced tab must be activated). This is shown in the following screenshot:

The model designer is often required to add additional fields to tables, so that the information presented to the user is better suited for decision-making purposes. This can include creating new fields that are combinations of other fields within the same table or a calculation that is dependent on data in another table. This recipe looks at the first of these options to create new fields that use other fields within the same table.
The model used in this recipe starts with the model that was created in the previous recipe Using tabular relationships to filter data.
Switch to the data view in the PowerPivot window and select the
Products
table. Select theColour
column by right-clicking on the column header and selecting Insert Column from the pop-up menu (note that the entire column must be selected). The new column is inserted to the left of theColour
column. Change the name of the CalculatedColumn1 toProduct Name WC
(product name with code).Enter the following formula into any cell of the new column.
=[Product Name]&" (" & [Product ID] & ")"
All rows of the table will be automatically populated.
Switch to the
Sales
table. Double-click on the header row of the last column (the current header is Add Column) and change the name of the column tototal_profit
. Enter the following formula into any cell of the Profit column with the format of the column as currency.=[total_price]-[unit_cost]-[tax]
Tip
The designer has two built-in functions that enable the easy creation of formulas. If the formula is being typed, an intellisense window will open in the formula bar, and show a list of objects that match what is being typed. Simply navigate to the desired column (or cell in the measure grid) and start typing, then press return to use the provided intellisense option (you can use arrow keys to select a function, table and column). Alternatively, a column or table name can be included in the formula by clicking on the column or table while the formula is being typed.
This recipe introduces Data Analysis Expressions (DAX) as the language that is used in tabular modeling. From this recipe, we can see that the DAX language is very similar to an Excel calculation (there are some noticeable differences which are addressed in chapters). Also, note that in DAX, columns are referred to instead of cells. Furthermore, many Excel functions work exactly the same in DAX as they do in Excel.
In calculating the value for each row, a special filter is applied in the calculation. In these examples where the fields being used in the formula reside on a single row, the filter automatically restricts the value to that of the row. The application of filtering in this manner is commonly referred to as a row filter or a row filter context.
There may be a requirement to create fields in a table that contain data from a separate table. In Excel, this would usually be achieved with a VLOOKUP
function.
The sales model that has been developed in this chapter contains three tables which define Products
, Subcategory
, and Category
. When the user browses the model in a pivot table, each of these tables appear as tables in the PowerPivot Field List pane. However, in this model, the category and subcategory directly relate to the product and it is our intent to show these fields in the Products
table.
This recipe assumes that the sales model created in the Adding fields to tables recipe is available and that the appropriate relationships exist among the Product
, Subcategory
, and Category
tables.
Start by opening the PowerPivot window and then perform the following steps:
Switch to the data view and create two new columns in the
Products
table titledCategory
andSubcategory
. In theCategory
column enter the following formula:=RELATED(Category[Category])
In the
Subcategory
column enter the following formula:=LOOKUPVALUE (Subcategory[Subcategory] , Subcategory[product_id],Products[Product ID] )
Hide the Subcategory
and Category
tables in the model by right-clicking on the tables tab and selecting Hide from Client Tools from the pop-up menu. Note that the hidden tables are still visible in the data view and diagram view, although they are now more transparent.
These two formulas achieve the same result but in different ways.
The related
function returns the specified column, based on the relationship within the data model. This can span more than one table (for example, a related table to the Category
table could be referenced from the Products
table), however, a relationship must be defined between all the linking tables that are spanned by the formula. Furthermore, because the formula relies on these relationships (that is, those defined within the model), the formula will not result in an error since the model enforces the integrity defined by model relationships.
The LOOKUPVALUE
function is quite different from the related
function because it does not utilize or rely on a relationship within the model. That is, LOOKUPVALUE
would still return the same results had the relationship not be defined between the Products
and Subcategory
tables. Furthermore, the LOOKUPVALUE
function can use multiple columns as its reference (to lookup) which may be beneficial when a desired value in another table cannot be related to the source data through a single field. Note that relationships can only be defined on single columns. However, unlike the RELATED
function, the LOOKUPVALUE
function may return an error when more than one match can be found in the lookup table.
Both formulas return results by creating a row context filter for each row in the source table.
It is considered best to utilize the relationship wherever possible. Therefore, the use of the RELATED
function is preferred over the LOOKUPVALUE
function. Furthermore, the RELATED
function makes the model simpler for others to understand. However, the LOOKUPVALUE
function does have some benefits. It allows the value to be determined, based on multiple search conditions. The syntax for LOOKUPVALUE
is defined as:
LOOKUPVALUE( <result_columnName> , <search_columnName>, <search_value> [, <search_columnName>, <search_value>] …)
Here, a
result_columnName
column is returned from a target table where search conditions are satisfied. These conditions are defined by a search_columnName
parameter and a search_value
parameter. This means that we specify the column (in the lookup table) and the value that should be searched for—this is the field in the current table.
The sales model that has been developed in this chapter allows the user to interrogate data from the order list by products, customers, and dates. In doing so, the user can create an implicit measure so that the underlying data is aggregated according to the current filter context. The aggregation function of implicit measures is determined by the underlying data type of the column that is used for the measure. This method offers the user the ability to create and show simple calculations from the model data. However, it does not create a robust model because the calculations aren't readily selectable by the user and the calculation definition is not conformed within the model.
This recipe introduces calculations which are contained within the model and presented to the user as measures. DAX (the tabular model language introduced in Adding fields to tables recipe) is used to define measures, so that it can explicitly use these measures in the model. Furthermore, the ability to create measures through a complex DAX allows the model designer a larger degree of flexibility than is involved with implicit measures.
This recipe assumes that the sales model created in Linking fields between tables recipe is available and that the appropriate relationships exist among the Product
, Subcategory
, and Category
tables.
By default, a table in Data View will have a horizontal line that does not show any data. This is referred to as the calculation area. If this is not visible, ensure that the Calculation Area button is selected in the Home menu.

There are many ways to create simple measures. Let's start with the automatic creation of measures.
Select any cell in the
order_quantity
field. Then, from the ribbon, select the SUM function from the AutoSum drop-down. A new calculation will be created in the calculation area as:Sum of order_quantity:=SUM([order_quantity])
In the formula bar, select the name of the measure (
Sum of order_quantity
) and rename it toTotal Order Quantity
. The calculation should now look like this:Total Order Quantity:=SUM([order_quantity])
Right-click on the calculation, select Format Cells... from the pop-up menu, and specify the format as a number format—(decimal number) with zero decimal places and click on the User 1000 separator (,) checkbox.
Select this group of columns:
unit_cost
,tax
,total_price
, andtotal_profit
, by selecting theunit_cost
column and dragging the mouse across to thetotal_profit
column.Click on the AutoSum button to create your calculations for these columns.
Rename the measures to
Total Cost
,Total Tax
,Total Price
, andTotal Profit
respectively with the same format asTotal Order Quantity
. Note that the format of the number is inherited as currency.Select the cell in the calculation area under the
unit_price
column and type the formula:Average Price:=AVERAGE([unit_price])
Intellisence provides a list of formulas available (based on the expression that is entered) and a description of the function, as shown in the following screenshot:
Hide the following fields of the
Sales
table:order_quantity
,unit_price
,tax
,total_price
, andtotal_profit
.Return to the PowerPivot Field List pane (in Excel) and refresh the model. Existing implicit calculations that were based on existing fields are removed from the model and the
Sales
table now includes the additional measures which were created. These measures can now be placed in the pivot tables (value) field list.
Measures that are created in the calculation area operate in the same manner as implicit measures. That is, the aggregation function is applied to the filtered data specified by the row and filter context within the client tool.
Because the measures are explicitly defined in the model they are called explicit measures and interpreted by the client tool as measures (many client tools detect an explicit measure as a special type of field and treat it differently than a table's standard field or dimension field).
By default, the tabular model created in PowerPivot will display the model through a PowerPivot pivot table. This shows the model in its tabular form where measures and columns are shown with respect to the tables that they relate to. For example, the measures created in the Creating model calculations recipe appear under the Sales
table node. In order to compare this (tabular) view with that of a traditional OLAP client, simply do the following:
The client tool (PivotTable Field List) shows the model in a different format and represents how a client tool interpretation of the model would be shown. Here, the measures are shown in measure groups and are not included as table objects.