MDX with SSAS 2012 Cookbook

By Sherry Li, Tomislav Piasevoli

About this book

MDX is the BI industry standard for multidimensional calculations and queries. Proficiency with this language is essential for the realization of your Analysis Services’ full potential. MDX is an elegant and powerful language, and also has a steep learning curve.

SQL Server 2012 Analysis Services has introduced a new BISM tabular model and a new formula language, Data Analysis Expressions (DAX). However, for the multi-dimensional model, MDX is still the only query and expression language. For many product developers and report developers, MDX is the preferred language for both the tabular model and multi-dimensional model.

MDX with SSAS 2012 Cookbook is a must-have book for anyone who wants to be proficient in the MDX language and to enhance their business intelligence solutions.

MDX with SSAS 2012 Cookbook is packed with immediately usable, practical solutions. It starts with elementary techniques that lay the foundation for designing advanced MDX calculations and queries. The discussions after each solution will provide you with a solid foundation and best practices. It covers a broad range of real-world topics and solutions and provides you with learning materials to become proficient in the language.

This book will guide you through the hands-on and practical MDX solutions, best practices, and many intricacies that hide within the MDX calculations and queries.

We will start by working with sets, creating time-aware, context-aware calculations, and business analytics solutions, through to the techniques of enhancing the cube design when MDX is not enough. We will then move on to capturing MDX generated by SSAS front-ends and using SSAS stored procedures, and we will explore the whole range of MDX solutions for real-world BI projects.

Publication date: August 2013
Publisher: Packt
Pages: 420
ISBN: 9781849689601

 

Chapter 1. Elementary MDX Techniques

In this chapter, we will cover:

  • Putting data on x and y axes

  • Skipping axes

  • Using a WHERE clause to filter the data returned

  • Optimizing MDX queries using the NONEMPTY() function

  • Using the PROPERTIES() function to retrieve data from attribute relationships

  • Basic sorting and ranking

  • Handling division by zero errors

  • Setting a default member of a hierarchy in the MDX script

 

Introduction


MDX is an elegant and powerful language, and also has a steep learning curve.

The goal of this chapter is to use some simple examples to demonstrate the fundamental MDX concepts, features and techniques that are the foundations for further explorations of the MDX language.

The chapter begins with several basic techniques: putting multi-dimensional data onto query axes, cube space restriction, empty cell removal, and the important concept of unique names for members, tuples, and sets. From there, we shall turn our attention to a few more advanced features, such as using the MDX functions, creating calculations in the cube space, manipulating strings, writing parameterized queries, and conditionally formatting cell properties. This will form the basis for the rest of the chapters in this book.

SSAS 2012 provides a sample Analysis Services database, the Multidimensional Adventure Works DW. All the MDX queries and scripts in this book have been updated for Analysis Services 2012, and verified against the 2012 Enterprise Edition of the Adventure Works DW Analysis Services database. The majority of the MDX queries and scripts have also been tested in SSAS 2008 R2 and should run there as well.

The Query Editor in SQL Server Management Studio (SSMS) is our tool of choice for writing and testing MDX queries. SQL Server 2012 comes with a free tool for cube developers, SQL Server Data Tools (SSDT). Just as the Business Intelligence Development Studio (BIDS) was the tool that we used for cube design and MDX scripting in SSAS 2008, SSDT is the tool we will use in this cookbook for cube design and MDX scripting for SSAS 2012.

 

Putting data on x and y axes


Cube space in SSAS is multi-dimensional. MDX allows you to display results on up to 128 axes, numbered from 0 to 127. The first five axes have aliases: COLUMNS, ROWS, PAGES, SECTIONS, and CHAPTERS. However, frontend tools such as SQL Server Management Studio (SSMS), or other applications that you can use for writing and executing MDX queries, only have two axes: the x and y axes, or COLUMNS and ROWS.

As a result, we have two tasks to do when trying to fit the multi-dimensional data onto the limited axes in our frontend tool:

  • We must always explicitly specify a display axis for all elements in the SELECT list. We can use aliases for the first five axes: COLUMNS, ROWS, PAGES, SECTIONS, and CHAPTERS. We are also allowed to use integers, 0, 1, 2, 3, and so on. But we are not allowed to skip axes. For example, the first axis must be COLUMNS (or 0). ROWS (or 1) cannot be specified, unless COLUMNS (or 0) has been specified first.

  • Since we only have two display axes to show our data, we must be able to "combine" multiple hierarchies into one query axis. In MDX and other query language terms, we call it "cross join".

It's fair to say that your job of writing the MDX queries is mostly trying to figure out how to project multi-dimensional data onto only two axes, namely, x and y. We will start by putting only one hierarchy on COLUMNS, and one on ROWS. Then we will use the CROSSJOIN function to "combine" more than one hierarchy into COLUMNS and ROWS.

Getting ready

Making the two-by-eight table below in a spreadsheet is quite simple. Writing an MDX query to do that can also be very simple. Putting data on the x and y axes is a matter of finding the right expressions for each axis.

                    Internet Sales Amount
Australia           $9,061,000.58
Canada              $1,977,844.86
France              $2,644,017.71
Germany             $2,894,312.34
NA                  (null)
United Kingdom      $3,391,712.21
United States       $9,389,789.51

All we need are three things from our cube:

  • The name of the cube

  • The correct expression for the Internet Sales Amount, so we can put it on the columns

  • The correct expression of the sales territory, so we can put it on the rows

Once we have the preceding three things, we are ready to plug them into the following MDX query, and the cube will give us back the two-by-eight table:

SELECT
   [The Sales Expression] ON COLUMNS,
   [The Territory Expression] ON ROWS
FROM
   [The Cube Name]

The MDX engine will understand the query just as well if we replace COLUMNS with 0 and ROWS with 1. Throughout this book, we will use the number 0 for columns, that is, the x axis, and 1 for rows, that is, the y axis.

How to do it…

We are going to use the Adventure Works 2012 Multidimensional Analysis Service database enterprise edition in our cookbook. If you open the Adventure Works cube, and hover your cursor over the measure Internet Sales Amount, you will see the fully qualified expression, [Measures].[Internet Sales Amount]. This is a long expression. Drag-and-drop in SQL Server Management Studio works perfectly for us in this situation.

Tip

Long expressions are a fact of life in MDX. Although the case does not matter, correct spelling is required, and fully qualified, unique expressions are recommended for MDX queries to work properly.

Follow these two steps to open the Query Editor in SSMS:

  1. Start SQL Server Management Studio (SSMS) and connect to your SQL Server Analysis Services (SSAS) 2012 instance (localhost or servername\instancename).

  2. Click on the target database Adventure Works DW 2012, and then click on the New Query button.

Follow these steps to save the time spent for typing the long expressions:

  1. Put your cursor on measure Internet Sales Amount, and drag-and-drop it onto AXIS(0).

  2. To get the proper expression for the sales territory, put your cursor over the Sales Territory Country under the Sales Territory | Sales Territory Country. Again, this is a long expression. Drag-and-drop it onto AXIS(1).

  3. For the name of the cube, the drag-and-drop should work too. Just point your cursor to the cube name, and drag-and-drop it in your FROM clause.

This should be your final query:

SELECT
   [Measures].[Internet Sales Amount] ON 0,
   [Sales Territory].[Sales Territory Country].[Sales Territory Country] ON 1
FROM
   [Adventure Works]

When you execute the query, you should get a two-by-eight table, the same as the one shown in the Getting ready section of this recipe.

How it works…

We have chosen to put Internet Sales Amount on the Axis(0), and all members of Sales Territory Country on the Axis(1). We have fully qualified the measure with the special dimension [Measures], and the sales territory members with dimension [Sales Territory] and hierarchy [Sales Territory Country].

You might have expected an aggregate function such as SUM somewhere in the query. We do not need to have any aggregate function here because the cube understands that when we ask for the sales amount for Canada, we would expect the sales amount to come from all the provinces and territories in Canada.
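
As a sketch of the same idea, the query can also be written with the axis aliases instead of the integers, and with the set of country members spelled out through .MEMBERS. The braces around the measure are optional here; they only make it explicit that each axis receives a set. This variant is an illustration, not a query from the recipe, and it is expected to return the same table:

-- Same recipe query, using the COLUMNS and ROWS aliases
-- and an explicit .MEMBERS call on the country level
SELECT
   { [Measures].[Internet Sales Amount] } ON COLUMNS,
   { [Sales Territory].[Sales Territory Country].[Sales Territory Country].MEMBERS } ON ROWS
FROM
   [Adventure Works]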

There's more…

SSAS cubes are perfectly capable of storing data in more than two dimensions. In MDX, we can use the technique called "cross join" to "combine" multiple hierarchies into one query axis.

Putting more hierarchies on x and y axes with cross join

In an MDX query, we specify how the multiple dimensions of our SSAS cube are laid out onto only two axes, x and y. In both SQL and MDX, cross joining gives us every possible combination of two lists.

We wish to write an MDX query to produce the following table. On the columns axis, we want to see both Internet Sales Amount and Internet Gross Profit. On the rows axis, we want to see all the sales territory countries, and all the products sold in each country.

                            Internet Sales Amount   Internet Gross Profit
Australia    Accessories    $138,690.63             $86,820.10
Australia    Bikes          $8,852,050.00           $3,572,267.29
Australia    Clothing       $70,259.95              $26,767.68
Australia    Components     (null)                  (null)
Canada       Accessories    $103,377.85             $64,714.37
Canada       Bikes          $1,821,302.39           $741,451.22
Canada       Clothing       $53,164.62              $23,755.91
Canada       Components     (null)                  (null)

This query lays two measures on columns (from the same dimension and hierarchy [Measures]), and two different hierarchies [Sales Territory Country] and [Product Categories] on rows.

SELECT
   { [Measures].[Internet Sales Amount],
     [Measures].[Internet Gross Profit] 
   } ON 0,
   { [Sales Territory].[Sales Territory Country].[Sales Territory Country] *
     [Product].[Product Categories].[Category]
   } ON 1
FROM
   [Adventure Works]

To return the cross product of two sets, we can use either of the following two syntaxes:

Standard syntax: Crossjoin(Set_Expression1, Set_Expression2)
Alternate syntax: Set_Expression1 * Set_Expression2

We have chosen to use the alternate syntax for its convenience. The previous query returns the table shown at the beginning of this section.
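
For comparison, here is a sketch of the same query written with the standard Crossjoin() syntax. It is not part of the recipe; the explicit .MEMBERS calls are added only to make the two set arguments obvious, and the result is expected to match the alternate syntax:

-- Same rows axis, built with the Crossjoin() function instead of the * operator
SELECT
   { [Measures].[Internet Sales Amount],
     [Measures].[Internet Gross Profit]
   } ON 0,
   Crossjoin(
      [Sales Territory].[Sales Territory Country].[Sales Territory Country].MEMBERS,
      [Product].[Product Categories].[Category].MEMBERS
   ) ON 1
FROM
   [Adventure Works]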

 

Skipping axes


There are situations when we want to display just a list of members with no data associated with them. Naturally, we expect to get that list in rows, so that we can scroll through them vertically instead of horizontally. However, the rules of MDX say that we can't skip the axes. If we want something on rows (which is AXIS(1) by the way), we must use all previous axes as well (columns in this case, which is also known as AXIS(0)).

The reason why we want the list to appear on axis 1 and not axis 0 is because a horizontal list is not as easy to read as a vertical one.

Is there a way to display those members on rows and have nothing on columns? Sure! This recipe shows how.

Getting ready

The notation for an empty set is {}. So for axis 0, we would simply do this:

{ } ON 0

How to do it…

Follow these steps to open the Query Editor in SQL Server Management Studio (SSMS):

  1. Start SQL Server Management Studio (SSMS) and connect to your SQL Server Analysis Services (SSAS) 2012 instance.

  2. Click on the target database Adventure Works DW 2012, and then click on the New Query button.

Follow these steps to get a one-dimensional query result with members on rows:

  1. Put an empty set on columns (AXIS(0)). Notation for the empty set is this: {}.

  2. Put some hierarchy on rows (AXIS(1)). In this case, we used the largest hierarchy available in this cube, the Customer hierarchy of the Customer dimension.

  3. Run the following query:

    SELECT
       { } ON 0,
       { [Customer].[Customer].[Customer].MEMBERS } ON 1
    FROM
       [Adventure Works]

How it works…

Although we can't skip axes, we are allowed to provide an empty set on them. This trick allows us to get what we need – nothing on columns and a set of members on rows.

There's more…

Skipping Axis(0) is a common technique for creating a list for report parameters. If we want a list of customers whose names contain "John", we can modify the preceding base query to use two functions, Filter() and InStr():

SELECT
   { } ON 0,
   { Filter(
           [Customer].[Customer].[Customer].MEMBERS,
           InStr(
                 [Customer].[Customer].CurrentMember.Name,
                'John'
               ) > 0
         )
   } ON 1
FROM
   [Adventure Works]

In the final result, you will notice the "John" phrase in various positions in member names.

The idea behind it

Instead of skipping the Axis(0), if you put a cube measure or a calculated measure with a non-constant expression on axis 0, you'll slow down the query. The slower query time can be noticeable, if there are a large number of members from the specified hierarchy. For example, if you put the Sales Amount measure on axis 0, the Sales Amount will have to be evaluated for each member in the rows. Do we need the Sales Amount? No, we don't. The only thing we need is a list of members; hence we've used an empty set {} on axis 0. That way, the SSAS engine doesn't have to go into cube space to evaluate the sales amount for every customer. The SSAS engine will only reside in dimension space, which is much smaller, and the query is therefore more efficient.
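
To see the difference for yourself, you could run a variant like the following and compare its execution time with the empty-set query. This is a sketch for comparison only; it uses the Internet Sales Amount measure, since that measure appears elsewhere in this chapter, in place of the Sales Amount measure mentioned above:

-- For comparison only: a regular cube measure on axis 0
-- forces the engine to evaluate a cell for every customer
SELECT
   { [Measures].[Internet Sales Amount] } ON 0,
   { [Customer].[Customer].[Customer].MEMBERS } ON 1
FROM
   [Adventure Works]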

Possible workarounds – dummy column

Some client applications might have issues with the MDX statement skipping axes because they expect something on columns, and will not work with an empty set on axis 0. In this case, we can define a constant measure (a measure returning null, 0, 1 or any other constant) and place it on columns. In MDX's terms, this constant measure is a calculated measure. It will act as a dummy column. It might not be as efficient as an empty set, but it is a much better solution than the one with a regular (non-constant) cube measure like the Sales Amount measure.

This query creates a dummy value NULL on columns:

WITH
MEMBER [Measures].[Dummy] AS NULL

SELECT
   { [Measures].[Dummy] } ON 0,
   { [Customer].[Customer].[Customer].MEMBERS } ON 1
FROM
   [Adventure Works]
 

Using a WHERE clause to filter the data returned


A WHERE clause in MDX works in a similar way to the WHERE clause in other query languages. It acts as a filter and restricts the data returned in the result set.

However, the WHERE clause in MDX does more than just restrict the result set. It also establishes the "query context".

Getting ready

The MDX WHERE clause points to a specific intersection of cube space. We use tuple expressions to represent cells in cube space. Each tuple is made of one member, and only one member, from each hierarchy.

The following tuple points to one year, 2008 and one measure, the [Internet Sales Amount]:

( [Measures].[Internet Sales Amount],
  [Date].[Calendar Year].&[2008]
)

Using a tuple in an MDX WHERE clause is called "slicing" the cube. This feature gives the WHERE clause another name, slicer. If we put the previous tuple in the WHERE clause, in MDX terms, we are saying, "show me some data from the cube sliced by sales and the year 2008".

That is what we are going to do next.

How to do it…

Open the Query Editor in SSMS, and then follow these steps to write a query with a slicer and test it:

  1. Copy this initial query into the Query Editor and run it:

    SELECT
       { [Customer].[Customer Geography].[Country]
       } ON 0,
       { [Product].[Product Categories].[Category] } ON 1
    FROM
       [Adventure Works]
  2. At this point, we should ask the question, "What are the cell values?" The cell values are actually the [Measures].[Reseller Sales Amount], which is the default member on the Measures dimension.

  3. Add the previous tuple to the query as a slicer. Here is the final query:

    SELECT
       { [Customer].[Customer Geography].[Country]
       } ON 0,
       { [Product].[Product Categories].[Category] } ON 1
    FROM
       [Adventure Works]
    WHERE
       ( [Measures].[Internet Sales Amount],
         [Date].[Calendar Year].&[2008]
       )
  4. The result should be as shown in the following screenshot:

  5. Ask the question again, "What are the cell values?" The cell values are now the [Measures].[Internet Sales Amount], and no longer the default measure.

How it works…

We can slice the data by pointing to a specific intersection of cube space. We can achieve this by putting a tuple in the WHERE clause.

In the preceding example, the cube space is sliced by sales and year 2008. The cell values are the Internet Sales Amount for each country and each product category, sliced by year 2008.
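
One way to observe the query context set by the slicer is to add a calculated measure that returns the name of the current member of the sliced hierarchy. The following is a minimal sketch; the calculated measure [Year In Context] is a hypothetical name introduced only for this illustration:

WITH
-- Hypothetical helper measure: shows which year is current in each cell
MEMBER [Measures].[Year In Context] AS
   [Date].[Calendar Year].CurrentMember.Name

SELECT
   { [Measures].[Internet Sales Amount],
     [Measures].[Year In Context]
   } ON 0,
   { [Product].[Product Categories].[Category] } ON 1
FROM
   [Adventure Works]
WHERE
   ( [Date].[Calendar Year].&[2008] )

Every row should report the same year, because the slicer, not the axes, supplies the current member of the [Calendar Year] hierarchy.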

There's more…

Notice that the data returned on the query axes can be completely different from the tuple in the WHERE clause. The tuple in the slicer only affects the cell values at the intersection of rows and columns, not what is on the columns or rows axes.

If you need to display sales and year 2008 on the query axes, you need to move them to the query axes rather than leave them in the WHERE clause.

This query has moved the sales to the columns axis, and the year 2008 to the rows axis. They both are "crossjoined" to the original hierarchies on the two query axes:

SELECT
   { [Measures].[Internet Sales Amount] *
     [Customer].[Customer Geography].[Country]
   } ON 0,
   { [Date].[Calendar Year].&[2008] *
     [Product].[Product Categories].[Category]
   } ON 1
FROM
   [Adventure Works]

Run the query. The cell values are the same as before, but now we have the year 2008 on the rows axis, and the Internet Sales Amount on the columns axis.

 

Optimizing MDX queries using the NonEmpty() function


The NonEmpty() function is a very powerful MDX function. It is primarily used to improve query performance by reducing sets before the result is returned.

Both Customer and Date dimensions are relatively large in the Adventure Works DW 2012 database. Putting the cross product of these two dimensions on the query axis can take a long time. In this recipe, we'll show how the NonEmpty() function can be used on the Customer and Date dimensions to improve the query performance.

Getting ready

Start a new query in SSMS and make sure that you're working on the Adventure Works DW 2012 database. Then write the following query and execute it:

SELECT 
    { [Measures].[Internet Sales Amount] } ON 0,
    NON EMPTY
    Filter(
            { [Customer].[Customer].[Customer].MEMBERS } *
            { [Date].[Date].[Date].MEMBERS },
            [Measures].[Internet Sales Amount] > 1000
           ) ON 1
FROM
   [Adventure Works]

The query shows the sales per customer and dates of their purchases, and isolates only those combinations where the purchase was over 1000 USD.

On a typical server, it will take more than a minute before the query returns the results.

Now let's see how to improve the execution time by using the NonEmpty() function.

How to do it…

Follow these steps to improve the query performance by adding the NonEmpty() function:

  1. Wrap NonEmpty() around the cross join of customers and dates so that it becomes the first argument of that function.

  2. Use the measure on columns as the second argument of that function.

  3. This is what the MDX query should look like:

    SELECT 
        { [Measures].[Internet Sales Amount] } ON 0,
    NON EMPTY
        Filter(
          NonEmpty(
                    { [Customer].[Customer].[Customer].MEMBERS } *
                    { [Date].[Date].[Date].MEMBERS },
                    { [Measures].[Internet Sales Amount] }
                   ),
          [Measures].[Internet Sales Amount] > 1000
               ) ON 1
    FROM 
       [Adventure Works]
  4. Execute that query and observe the results as well as the time required for execution. The query returned the same results, only much faster, right?

How it works…

Both the Customer and Date dimensions are medium-sized dimensions. The cross product of these two dimensions contains several million combinations. We know that typically, the cube space is sparse; therefore, many of these combinations are indeed empty. The Filter() operation is not optimized to work in block mode, which means a lot of calculations will have to be performed by the engine to evaluate the set on rows, whether the combinations are empty or not.

Fortunately, the NonEmpty() function exists. This function can be used to reduce any set, especially multidimensional sets that are the result of a cross join operation. It removes the empty combinations of the two sets before the engine starts to evaluate the sets on rows. A reduced set has fewer cells to be calculated, and therefore the query runs much faster.

There's more…

Regardless of the benefits that were shown in this recipe, NonEmpty() should be used with caution. Here are some good practices regarding the NonEmpty() function:

  • Use it with sets, such as named sets and axes.

  • Use it in the functions which are not optimized to work in block mode, such as with the Filter() function.

  • Avoid using it in aggregate functions such as Sum().

  • Avoid using it in other MDX set functions that are optimized to work in block mode. The use of NonEmpty() inside optimized functions will prevent them from evaluating the set in block mode. This is because the set will not be compact once it passes the NonEmpty() function. The function will break it into many small non-empty chunks, and each of these chunks will have to be evaluated separately. This will inevitably increase the duration of the query. In such cases, it is better to leave the original set intact, no matter its size. The engine will know how to run over it in optimized mode.

NonEmpty() versus NON EMPTY

Both the NonEmpty() function and the NON EMPTY keyword can reduce sets, but they do it in a different way.

The NON EMPTY keyword removes empty rows, columns, or both, depending on the axis on which that keyword is used in the query. Therefore, the NON EMPTY operator tries to push the evaluation of cells to an early stage whenever possible. This way, the set on the axis is already reduced and the final result is returned faster.

Take a look at the initial query in this recipe, remove the Filter() function, run the query, and notice how quickly the results come, although the multidimensional set again counts millions of tuples. The trick is that the NON EMPTY operator uses the set on the opposite axis, the columns, to reduce the set on rows. Therefore, it can be said that NON EMPTY is highly dependent on members on axes and their values in columns and rows.
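
The experiment described above can be sketched as follows. This is simply the initial query of this recipe with the Filter() function removed, so that only the NON EMPTY keyword reduces the set on rows:

-- Initial query without Filter(): NON EMPTY uses the measure on columns
-- to remove the empty customer/date combinations from rows
SELECT
   { [Measures].[Internet Sales Amount] } ON 0,
   NON EMPTY
   { [Customer].[Customer].[Customer].MEMBERS } *
   { [Date].[Date].[Date].MEMBERS } ON 1
FROM
   [Adventure Works]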

Contrary to the NON EMPTY operator found only on axes, the NonEmpty() function can be used anywhere in the query.

The NonEmpty() function removes all the members from its first set, where the value of one or more measures in the second set is empty. If no measure is specified, the function is evaluated in the context of the current member.

In other words, the NonEmpty() function is highly dependent on members in the second set, the slicer, or the current coordinate, in general.

Common mistakes and useful tips

If a second set in the NonEmpty() function is not provided, the expression is evaluated in the context of the current measure in the moment of evaluation, and current members of attribute hierarchies, also in the time of evaluation. In other words, if you're defining a calculated measure and you forget to include a measure in the second set, the expression is evaluated for that same measure which leads to null, a default initial value of every measure. If you're simply evaluating the set on the axis, it will be evaluated in the context of the current measure, the default measure in the cube or the one provided in the slicer. Again, this is perhaps not something you expected. In order to prevent these problems, always include a measure in the second set.
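
As a minimal sketch of that advice, the following query defines a named set with NonEmpty() and passes the measure explicitly as the second set, so the result does not depend on whichever measure happens to be current. The set name [Customers With Sales] is a hypothetical name used only for this illustration:

WITH
-- The measure is given explicitly as the second argument
SET [Customers With Sales] AS
   NonEmpty(
      [Customer].[Customer].[Customer].MEMBERS,
      { [Measures].[Internet Sales Amount] }
   )

SELECT
   { [Measures].[Internet Sales Amount] } ON 0,
   [Customers With Sales] ON 1
FROM
   [Adventure Works]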

NonEmpty() reduces sets, just like a few other functions, namely Filter() and Existing() do. But what's special about NonEmpty() is that it reduces sets extremely efficiently and quickly. Because of that, there are some rules about where to position NonEmpty() in calculations made by the composition of MDX functions (one function wrapping the other). If we're trying to detect multi-select, that is, multiple members in the slicer, NonEmpty() should go inside with the EXISTING function/keyword outside. The reason is that although they both shrink sets efficiently, NonEmpty() works great if the set is intact. EXISTING is not affected by the order of members or compactness of the set. Therefore, NonEmpty() should be applied earlier.
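
The ordering described above, NonEmpty() inside and EXISTING outside, might look like the following sketch. The calculated measure name is hypothetical, and the two country member keys in the slicer are assumptions about the Adventure Works sample database; substitute any members that exist in your cube:

WITH
-- Hypothetical measure: counts customers with sales within the slicer selection
MEMBER [Measures].[Customers With Sales] AS
   Count(
      EXISTING
      NonEmpty(
         [Customer].[Customer].[Customer].MEMBERS,
         { [Measures].[Internet Sales Amount] }
      )
   )

SELECT
   { [Measures].[Customers With Sales] } ON 0
FROM
   [Adventure Works]
WHERE
   -- Assumed member keys; a multi-select slicer with two countries
   ( { [Customer].[Country].&[Canada],
       [Customer].[Country].&[France] } )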

You may get System.OutOfMemory errors if you use the CrossJoin() operation on many large hierarchies because the cross join generates a Cartesian product of those hierarchies. In that case, consider using NonEmpty() to reduce the space to a smaller subcube. Also, don't forget to group the hierarchies by their dimension inside the cross join.

 

Using the PROPERTIES() function to retrieve data from attribute relationships


Attribute relationships define hierarchical dependencies between attributes. A good example is the relationship between attribute City and attribute State. If we know the current city is Phoenix, we know the state must be Arizona. This knowledge of the relationship, City | State, can be used by the Analysis Services engine to optimize performance.

Analysis Services provides the Properties() function to allow us to retrieve data based on attribute relationships.

Getting ready

We will start from a classic Top 10 query that shows the Top 10 Customers. Then we will use the Properties() function to retrieve each top 10 customer's yearly income.

This table shows what our query result should be like:

 

Internet Sales Amount

Yearly Income

Nichole Nara

$13,295.38

100000 - 120000

Kaitlyn J. Henderson

$13,294.27

100000 - 120000

Margaret He

$13,269.27

100000 - 120000

Randall M. Dominguez

$13,265.99

80000 - 90000

Adriana L. Gonzalez

$13,242.70

80000 - 90000

Rosa K. Hu

$13,215.65

40000 - 70000

Brandi D. Gill

$13,195.64

100000 - 120000

Brad She

$13,173.19

80000 - 90000

Francisco A. Sara

$13,164.64

40000 - 70000

Maurice M. Shan

$12,909.67

80000 - 90000

Once we get only the top 10 customers, it's easy enough to place the customer on the rows, and the Internet sales amount on the columns. What about each customer's yearly income?

Customer Geography is a user-defined hierarchy in the Customer dimension. In SSMS, if you start a new query against the Adventure Works DW 2012 database, and navigate to Customer | Customer Geography | Customer | Member Properties, you will see that Yearly Income is one of the member properties for the attribute Customer. This is good news, because now we can get the Yearly Income for each top 10 customer using the PROPERTIES() function.

How to do it…

In SSMS, let us write the following query in a new Query Editor against the Adventure Works DW 2012 database:

  1. This query uses the TopCount() function which takes three parameters. The first parameter [Customer].[Customer Geography].[Customer].MEMBERS provides the members that will be evaluated for the "top count", the second integer 10 tells it to return only 10 members and the third parameter [Measures].[Internet Sales Amount] provides a numeric measure as the evaluation criteria.

    -- Properties(): Initial
    SELECT
       [Measures].[Internet Sales Amount] on 0,
       TopCount(
          [Customer].[Customer Geography].[Customer].MEMBERS,
          10,
          [Measures].[Internet Sales Amount]
          ) ON 1
    FROM
       [Adventure Works]
  2. Execute the preceding query, and we should get only 10 customers back with their Internet sales amount. Also notice that the result is sorted in the descending order of the numeric measure. Now let's add a calculated measure, like:

    [Customer].[Customer Geography].currentmember.Properties("Yearly Income")
  3. To make the calculated measure "dynamic", we must use a member function .CurrentMember, so we do not need to hardcode any specific member name on the customer dimension. The Properties() function is also a member function, and it takes another attribute name as a parameter. We've provided "Yearly Income" as the name for the attribute we are interested in.

  4. Now place the preceding expression in the WITH clause, and give it a name [Measures].[Yearly Income]. This new calculated measure is now ready to be placed on the columns axis, along with the Internet sales amount. Here is the final query:

    WITH
    MEMBER [Measures].[Yearly Income] AS
       [Customer].[Customer Geography].currentmember
         .Properties("Yearly Income")
    
    SELECT
       { [Measures].[Internet Sales Amount],
         [Measures].[Yearly Income]
       } on 0,
       TopCount(
          [Customer].[Customer Geography].[Customer].MEMBERS,
          10,
          [Measures].[Internet Sales Amount]
          ) ON 1
    FROM
       [Adventure Works]
  5. Executing the query, we should get the yearly income for each top 10 customer. The result should be exactly the same as the table shown at the beginning of our recipe.

How it works…

Attributes correspond to columns in the dimension tables in our data warehouse. Although we don't normally define the relationships between them in the relational database, we do so in the multidimensional space. This knowledge of attribute relationships can be used by the Analysis Services engine to optimize performance. MDX provides the Properties() function to allow us to get from members of one attribute to members of another attribute.

In this recipe, we only focus on one type of member properties, that is, the user-defined member property. Member properties can also be the member properties that are defined by Analysis Services itself, such as NAME, ID, KEY, or CAPTION; they are the intrinsic member properties.
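
For illustration, the intrinsic member properties can be requested with the same Properties() function. The following sketch reuses the top 10 query from this recipe; the calculated measure names [Customer Key] and [Customer Caption] are hypothetical and exist only in this example:

WITH
-- Intrinsic properties are requested by name, just like user-defined ones
MEMBER [Measures].[Customer Key] AS
   [Customer].[Customer Geography].CurrentMember.Properties("KEY")
MEMBER [Measures].[Customer Caption] AS
   [Customer].[Customer Geography].CurrentMember.Properties("CAPTION")

SELECT
   { [Measures].[Internet Sales Amount],
     [Measures].[Customer Key],
     [Measures].[Customer Caption]
   } ON 0,
   TopCount(
      [Customer].[Customer Geography].[Customer].MEMBERS,
      10,
      [Measures].[Internet Sales Amount]
      ) ON 1
FROM
   [Adventure Works]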

There's more…

The Properties() function can take another optional parameter, that is the TYPED flag. When the TYPED flag is used, the return value has the original type of the member.

The preceding example does not use the TYPED flag. Without the TYPED flag, the return value is always a string.

In many business analyses, we need to perform arithmetic on numeric values. In the next example, we will include the TYPED flag in the Properties() function to make sure that the [Total Children] values for the top 10 customers are numeric.

WITH
MEMBER [Measures].[Yearly Income] AS
    [Customer].[Customer Geography].currentmember.Properties("Yearly Income")
MEMBER [Measures].[Total Children] AS
    [Customer].[Customer Geography].currentmember.Properties("Total Children", TYPED)
MEMBER [Measures].[Is Numeric] AS
    IIF(
       IsNumeric([Measures].[Total Children]),
       1,
       NULL
       )

SELECT
    { [Measures].[Internet Sales Amount],
      [Measures].[Yearly Income],
      [Measures].[Total Children],
      [Measures].[Is Numeric]
    } ON 0,
    TopCount(
     [Customer].[Customer Geography].[Customer].MEMBERS,
     10,
     [Measures].[Internet Sales Amount]
     ) ON 1
FROM
    [Adventure Works]

An attribute can simply be referenced as an attribute hierarchy when it is enabled as an Attribute Hierarchy.

In SSAS, there is one situation where the attribute relationship can be explored only by using the PROPERTIES() function, that is when its property AttributeHierarchyEnabled is set to False.

In the Employee dimension in the Adventure Works cube, the employees' SSN attribute is not enabled as an Attribute Hierarchy; its AttributeHierarchyEnabled property is set to False. We can only reference the SSN through the PROPERTIES() function of another attribute that has been enabled as an Attribute Hierarchy, such as the Employee attribute.
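
A minimal sketch of that scenario follows. It assumes that the member property is exposed under the name "SSN" and that the Employee attribute hierarchy is addressed as [Employee].[Employee]; both names are assumptions and should be checked against your copy of the Adventure Works cube. The calculated measure name [Employee SSN] is hypothetical:

WITH
MEMBER [Measures].[Employee SSN] AS
    -- "SSN" is an assumed property name; adjust it to match the dimension design
    [Employee].[Employee].CurrentMember.Properties("SSN")

SELECT
    { [Measures].[Employee SSN] } ON 0,
    Head(
     [Employee].[Employee].[Employee].MEMBERS,
     10
     ) ON 1
FROM
    [Adventure Works]

The Head() function only limits the output to the first 10 employees to keep the result short.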

 

Basic sorting and ranking


Sorting and ranking are very common requirements in most business analyses, and MDX provides several functions for this purpose. They are:

  • TopCount and BottomCount

  • TopPercent and BottomPercent

  • TopSum and BottomSum

  • Order

  • Hierarchize

  • Rank

All of these functions operate on sets of tuples, not just on one-dimensional sets of members. They all, in some way, involve a numeric expression, which is used to evaluate the sorting and the ranking.
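
To illustrate the point about sets of tuples, here is a minimal sketch that applies TopCount() to a cross join of two hierarchies, so that each element being ranked is a tuple of a subcategory and a sales territory region rather than a single member. The choice of these two hierarchies is only an example:

SELECT
    [Measures].[Internet Sales Amount] ON 0,
    TopCount (
        -- each element of this set is a (subcategory, region) tuple
        [Product].[Subcategory].[Subcategory].MEMBERS *
        [Sales Territory].[Sales Territory Region].[Sales Territory Region].MEMBERS,
        5,
        [Measures].[Internet Sales Amount]
    ) ON 1
FROM
    [Adventure Works]

The query should return the five subcategory and region combinations with the highest Internet Sales Amount.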

Getting ready

We will start with the classic Top 5 (or Top-n) example using the TopCount() function. We will then examine how the result is already pre-sorted, followed by using the Order() function to sort the result explicitly. Finally, we will see how we can add a ranking number by using the Rank() function.

Here is the classic Top 5 example using the TopCount() function:

TopCount (
        [Product].[Subcategory].children,
        5,
        [Measures].[Internet Sales Amount] 
  )

It evaluates each member of the set [Product].[Subcategory].children against the numeric expression [Measures].[Internet Sales Amount].

The result is the five [Subcategory] members that have the highest [Internet Sales Amount].

The five subcategory members will be returned in order from the largest [Internet Sales Amount] to the smallest.

How to do it…

In SSMS, let us write the following query in a new Query Editor, against the Adventure Works DW 2012 database. Follow these steps to first get the top-n members:

  1. We simply place the earlier TopCount() expression on the rows axis.

  2. On the columns axis, we are showing the actual sales amount for each product subcategory.

  3. In the slicer, we use a tuple to slice the result for the first quarter of 2008 and the Southwest region only.

  4. The final query should look like the following query:

    SELECT
        [Measures].[Internet Sales Amount] on 0,
        TopCount (
            [Product].[Subcategory].children,
            5,
            [Measures].[Internet Sales Amount] 
        ) ON 1
    FROM
       [Adventure Works]
    WHERE
       ( [Date].[Calendar].[Calendar Quarter].&[2008]&[1],
         [Sales Territory].[Sales Territory Region].[Southwest]
       )
  5. Run the query. The following screenshot shows the Top-n result:

  6. Notice that the returned members are in order from the largest numeric measure to the smallest.

Next, in SSMS, follow these steps to explicitly sort the result:

  1. This time, we will put the TopCount() expression in the WITH clause, creating it as a Named Set. We will name it [Top 5 Subcategory].

  2. On the rows axis, we will use the Order() function, which takes two parameters: the set of members we want to return and the value we want to sort by. The named set [Top 5 Subcategory] is what we want to return, so we will pass it to the Order() function as the first parameter. The .MemberValue function gives us the product subcategory name, so we will pass it to the Order() function as the second parameter. Here is the Order() function expression we would use:

    ORDER (
             [Top 5 Subcategory],
             [Product].[Subcategory].MEMBERVALUE
        )
  3. Here is the final query for sorting the result:

    -- Order members with MemberValue
    WITH
    SET [Top 5 Subcategory] as
       TopCount (
           [Product].[Subcategory].CHILDREN,
           5,
           [Measures].[Internet Sales Amount]
       )
    
    SELECT
        [Measures].[Internet Sales Amount] on 0,
        ORDER (
            [Top 5 Subcategory],
            [Product].[Subcategory].MEMBERVALUE
        ) ON 1
    FROM
        [Adventure Works]
    WHERE
        ( [Date].[Calendar].[Calendar Quarter].&[2008]&[1],
          [Sales Territory].[Sales Territory Region].[Southwest] )
  4. Executing the preceding query, we get the sorted result as the screenshot shows:

Finally, in SSMS follow these steps to add ranking numbers to the Top-n result:

  1. We will create a new calculated measure [Subcategory Rank] using the Rank() function, which simply returns the one-based ordinal position of each tuple in the set [Top 5 Subcategory]. Since the set is already ordered, the ordinal position of the tuple will give us the correct ranking. Here is the expression for the Rank() function:

       RANK (
           [Product].[Subcategory].CurrentMember,
           [Top 5 Subcategory]
       )
  2. The following query is the final query. It is built on top of the first query in this recipe. We've added the earlier Rank() function and created a calculated measure [Measures].[Subcategory Rank], which is placed on the columns axis along with the Internet Sales Amount.

    WITH
    SET [Top 5 Subcategory] AS
       TopCount (
           [Product].[Subcategory].children,
           5,
           [Measures].[Internet Sales Amount] 
       )
    MEMBER [Measures].[Subcategory Rank] AS
        RANK ( 
           [Product].[Subcategory].CurrentMember, 
           [Top 5 Subcategory]
       )
    
    SELECT
        { [Measures].[Internet Sales Amount],
          [Measures].[Subcategory Rank]
        } ON 0,
        [Top 5 Subcategory] ON 1	
    FROM
        [Adventure Works]
    WHERE
        ( [Date].[Calendar].[Calendar Quarter].&[2008]&[1],
          [Sales Territory].[Sales Territory Region].[Southwest] )
  3. Run the preceding query. The ranking result is shown in the following screenshot:

How it works…

Sorting functions, such as TopCount(), TopPercent(), and TopSum() operate on sets of tuples. These tuples are evaluated on a numeric expression and returned pre-sorted in the order of a numeric expression.

Using the Order() function, we can sort members from dimensions explicitly using the .MemberValue function.

When a numeric expression is not specified, the Rank() function can simply be used to display the one-based ordinal position of tuples in a set.

There's more…

Like the other MDX sorting functions, the Rank() function can also operate on a numeric expression. If a numeric expression is specified, the Rank() function assigns the same rank to tuples with duplicate values in the set.
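
As a minimal sketch of this three-argument form, the following variation on the ranking query passes [Measures].[Internet Sales Amount] to Rank() as the numeric expression. The slicer is left out here only to keep the example short:

WITH
SET [Top 5 Subcategory] AS
   TopCount (
       [Product].[Subcategory].children,
       5,
       [Measures].[Internet Sales Amount]
   )
MEMBER [Measures].[Subcategory Rank] AS
    RANK (
       [Product].[Subcategory].CurrentMember,
       [Top 5 Subcategory],
       -- third argument: the value the rank is based on
       [Measures].[Internet Sales Amount]
   )

SELECT
    { [Measures].[Internet Sales Amount],
      [Measures].[Subcategory Rank]
    } ON 0,
    [Top 5 Subcategory] ON 1
FROM
    [Adventure Works]

With this form, two subcategories with exactly the same sales amount would share a rank instead of receiving consecutive ordinal positions.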

It is also important to understand that the Rank() function does not order the set. Because of this, it is tempting to do the ordering and the ranking at the same time. However, in the last query of this recipe, we first defined the named set [Top 5 Subcategory] with TopCount(), which returns its members already sorted in descending order, and only then ranked against that set. This way, the sorting is done only once and is followed by a linear scan, before the result is presented in sorted order.

As a good practice, we recommend using the Order() function to first order the set and then ranking the tuples that are already sorted.
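
The recommended pattern can be sketched as follows. The set is ordered once in a named set using Order() with the BDESC flag, and Rank() then only has to look up each member's position in that set. The named set [Ordered Subcategory] is a hypothetical name used for this illustration:

WITH
SET [Ordered Subcategory] AS
   ORDER (
       [Product].[Subcategory].[Subcategory].MEMBERS,
       [Measures].[Internet Sales Amount],
       BDESC  -- descending, ignoring the hierarchy
   )
MEMBER [Measures].[Subcategory Rank] AS
    RANK (
       [Product].[Subcategory].CurrentMember,
       [Ordered Subcategory]
   )

SELECT
    { [Measures].[Internet Sales Amount],
      [Measures].[Subcategory Rank]
    } ON 0,
    [Ordered Subcategory] ON 1
FROM
    [Adventure Works]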

 

Handling division by zero errors


Handling errors is a common task, especially the handling of division by zero type errors. This recipe offers a common practice to handle them.

Getting ready

Start a new query in SQL Server Management Studio and check that you're working on the Adventure Works DW 2012 database. Then write and execute this query:

WITH
MEMBER [Date].[Calendar Year].[CY 2006 vs 2005 Bad] AS
   [Date].[Calendar Year].[Calendar Year].&[2006] /
   [Date].[Calendar Year].[Calendar Year].&[2005],
   FORMAT_STRING = 'Percent'
SELECT
   { [Date].[Calendar Year].[Calendar Year].&[2005],
     [Date].[Calendar Year].[Calendar Year].&[2006],
     [Date].[Calendar Year].[CY 2006 vs 2005 Bad] } *
     [Measures].[Reseller Sales Amount] ON 0,
   { [Sales Territory].[Sales Territory].[Country].MEMBERS }
   ON 1
FROM
   [Adventure Works]

This query returns six countries on the rows axis, and two years and a ratio on the columns axis.

The problem is that we get 1.#INF on some ratio cells. 1.#INF is the formatted value of infinity, and it appears whenever the denominator CY 2005 is null and the numerator CY 2006 is not null.

We will need help from the IIF() function, which takes three arguments: iif(<condition>, <then branch>, <else branch>). The IIF() function is a Visual Basic for Applications (VBA) function and has a native implementation in MDX. The IIF() function will allow us to evaluate the condition of CY 2005 and then decide what the ratio calculation formula should be.

How to do it…

Follow these steps to handle division by zero errors:

  1. Copy the calculated member and paste it as another calculated member. While doing so, replace the term Bad with Good in its name, just to differentiate between the two members.

  2. Copy the denominator.

  3. Wrap the expression in an outer IIF() statement.

  4. Paste the denominator in the condition part of the IIF() statement and compare it against 0.

  5. Provide null value for the true part.

  6. Your initial expression should be in the false part.

  7. Don't forget to include the new member on columns and execute the query:

    WITH
    MEMBER [Date].[Calendar Year].[CY 2006 vs 2005 Bad] AS
       [Date].[Calendar Year].[Calendar Year].&[2006] /
       [Date].[Calendar Year].[Calendar Year].&[2005],
       FORMAT_STRING = 'Percent'
    MEMBER [Date].[Calendar Year].[CY 2006 vs 2005 Good] AS
       IIF([Date].[Calendar Year].[Calendar Year].&[2005] = 0,
           null,
           [Date].[Calendar Year].[Calendar Year].&[2006] /
           [Date].[Calendar Year].[Calendar Year].&[2005]
          ),
       FORMAT_STRING = 'Percent'
    SELECT
       { [Date].[Calendar Year].[Calendar Year].&[2005],
         [Date].[Calendar Year].[Calendar Year].&[2006],
         [Date].[Calendar Year].[CY 2006 vs 2005 Bad],
         [Date].[Calendar Year].[CY 2006 vs 2005 Good] } *
         [Measures].[Reseller Sales Amount] ON 0,
       { [Sales Territory].[Sales Territory].[Country].MEMBERS }
       ON 1
    FROM
       [Adventure Works]

The result shows that the new calculated member has corrected the problem. The last column [CY 2006 vs 2005 Good] is now showing (null) correctly when the denominator CY 2005 is null and the numerator CY 2006 is not null.

How it works…

A division by zero error occurs when the denominator is null or zero and the numerator is not null. In order to prevent this error, we must test the denominator before the division and handle the two scenarios in the two branches using the IIF() statement.

In the condition part of the IIF() statement, we compare against the scalar number zero to determine whether [Measures].[Reseller Sales Amount] for the following year member is zero. If it is, the condition is true and the calculated member returns NULL:

[Date].[Calendar Year].[Calendar Year].&[2005] = 0

What about the NULL condition? It turns out that, for a numerical value, we do not need to test for NULL specifically. It is enough to test just for zero, because null = 0 returns true. However, we could test for the NULL condition explicitly if we wanted to, by using the IsEmpty() function.
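
To check the null = 0 behavior for yourself, a small throwaway query such as the following can be used. The calculated measure name [Null Equals Zero] is hypothetical, and the two string results are only labels for the two branches:

WITH
MEMBER [Measures].[Null Equals Zero] AS
   -- compares NULL with the scalar zero, as the recipe's condition does
   IIF( NULL = 0, "True", "False" )
SELECT
   [Measures].[Null Equals Zero] ON 0
FROM
   [Adventure Works]

If the statement above holds on your server, the single cell returned should show the label of the true branch.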

For the calculated member [CY 2006 vs 2005 Good] we could wrap the member with the IsEmpty() function. The result will be the same:

MEMBER [Date].[Calendar Year].[CY 2006 vs 2005 Good] AS
   IIF(IsEmpty([Date].[Calendar Year].[Calendar Year].&[2005]),
       null,
       [Date].[Calendar Year].[Calendar Year].&[2006] /
       [Date].[Calendar Year].[Calendar Year].&[2005]
      ),
   FORMAT_STRING = 'Percent'

There's more…

SQLCAT's SQL Server 2008 Analysis Services Performance Guide has a lot of interesting details regarding the IIF() function, found at http://tinyurl.com/PerfGuide2008R2.

Additionally, the blog MDX and DAX topics by Jeffrey Wang has an article explaining the details of the IIF() function, found at http://tinyurl.com/IIFJeffrey.

Earlier versions of SSAS

If you're using a version of SSAS prior to 2008 (that is, 2005), the performance of the IIF() function will not be as good. See Mosha Pasumansky's article for more information: http://tinyurl.com/IIFMosha.

 

Setting a default member of a hierarchy in the MDX script


Setting a default member is a tempting option which looks like it can be used on any dimension we would like to use it on. The truth is far from that. Default members should be used as exceptions and not as a general rule when designing dimensions.

The reason for that is not so obvious. The feature looks self-explanatory, and it's hard to anticipate what could go wrong. If we're not careful enough, our calculations can become unpredictable, especially on complex dimensions with many relationships among attributes.

Default members can be defined in three places. The easy-to-find option is the dimension itself, using the DefaultMember property found on every attribute. The second option is the role, on the Dimension Data tab. Finally, default members can be defined in the MDX script. The main benefit of this approach is easy maintenance of all default members in the cube, because everything is in one place and in the form of easy-to-read text. It is also the only way to define the default member of a role-playing dimension.

In this recipe we'll show the most common option, that is, the last one, or how to set a default member of a hierarchy in the MDX script. More information on setting the DefaultMember is available at http://tinyurl.com/DefaultMember2012.

Getting ready

Follow these steps to set up the environment for this recipe:

  1. Start SSMS and connect to your SSAS 2012 instance.

  2. Click on the New Query button and check that the target database is Adventure Works DW 2012. Then execute the following query:

    WITH
    MEMBER [Measures].[Default account] AS
         [Account].[Accounts].DefaultMember.Name
    SELECT
       { [Measures].[Amount],
         [Measures].[Default account] } ON 0
    FROM
       [Adventure Works]
  3. The results will show that the default member is the Net Income account and its value in this context is a bit more than 12.6 million USD.

  4. Next, open the Adventure Works DW 2012 solution in SSDT.

  5. Double-click on the Adventure Works cube and go to the Calculations tab. Choose Script View.

  6. Position the cursor at the beginning of the script, just beneath the CALCULATE command.

How to do it…

Follow these steps to set a new default member:

  1. Enter the following expression to set a new default account:

    ALTER CUBE CurrentCube 
        UPDATE DIMENSION [Account].[Accounts],
          Default_Member = [Account].[Accounts].&[48];
                           //Operating Profit
  2. Save and deploy (or just press the Deploy MDX Script icon if you're using BIDS Helper 2012).

  3. Run the previous query again.

  4. Notice that the result has changed. The new default account is Operating Profit, the one we specified in the MDX script using the ALTER CUBE command. The value has changed as well: it is now above 16.7 million USD.

How it works…

The ALTER CUBE statement changes the default member of a hierarchy specified in the UPDATE DIMENSION part of the statement. The third part is where we specify which member should be the default member of that hierarchy.

Don't mind that it says UPDATE DIMENSION. SSAS 2005 and later versions interpret that as a hierarchy.

There's more…

Setting the default member on a dimension with multiple hierarchies can lead to unexpected results. Due to attribute relations, related attributes are implicitly set to corresponding members, while the non-related attributes remain on their default members, that is, the All member (also known as the root member). Certain combinations of members from all available hierarchies can result in a nonexisting coordinate. In that case, the query will return no data. Other times, the intersection will only be partial. In that case, the query will return the data, but the values will not be correct, which might be even worse than no data at all.

Enter the following expression in the MDX script, deploy it, and then analyze the result in the Cube Browser tab:

ALTER CUBE CurrentCube 
    UPDATE DIMENSION [Date].[Calendar],
      Default_Member = [Date].[Calendar]
                      .[Calendar Year].&[2007];
                       -- "current" year on the user hierarchy

The expression sets the year 2007 as the default member of the [Date].[Calendar] user-defined hierarchy.

The analysis of the Sales Amount measure in the Cube Browser shows good results in all but a few cases. Fiscal hierarchies that have the fiscal year level in them return empty or incomplete results when used in a slicer. They are empty because the intersection between the fiscal year 2006 and the calendar year 2007 (the latter being the default member in the calendar hierarchy) is a nonexisting combination. Remember, the calendar year 2007 doesn't get overwritten by the fiscal year 2006. It gets combined with it (open the Date dimension in SSDT and observe the relationships in the corresponding tab). Moreover, when you put the fiscal year 2007 into the slicer, you only get a portion of the data, the portion which matches the intersection of the calendar and the fiscal year. That's only one half of the fiscal year, right? In short, you have a potential problem with this approach.

Can we fix the result? Yes, we can. The correct results will be there when we explicitly select the All member from the Date.Calendar hierarchy in the slicer. Only then will we get good results using fiscal hierarchies. The question is – will the end users remember that every time?
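
As a minimal sketch of that workaround in query form, the slicer below explicitly selects the All member of the Date.Calendar hierarchy while fiscal years are placed on rows. It assumes that the All member is named [All Periods] and that the fiscal user hierarchy is [Date].[Fiscal] with a [Fiscal Year] level; check both names against your cube:

SELECT
   [Measures].[Sales Amount] ON 0,
   [Date].[Fiscal].[Fiscal Year].MEMBERS ON 1
FROM
   [Adventure Works]
WHERE
   -- overrides the 2007 default member on the calendar hierarchy;
   -- [All Periods] is the assumed name of the All member
   ( [Date].[Calendar].[All Periods] )

Without the WHERE clause, the same query would be evaluated against the calendar year 2007 default member and would show the empty or partial fiscal years described earlier.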

The situation is similar when the default member is defined on an attribute hierarchy, for example, on the Date.Calendar Year hierarchy. By now, you should be able to modify the previous expression so that it sets the year 2007 as the default member on the [Date].[Calendar Year]. Test this to see it for yourself.
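
One possible form of that modified expression, given here only as a sketch for comparison with your own attempt, is the following:

ALTER CUBE CurrentCube 
    UPDATE DIMENSION [Date].[Calendar Year],
      Default_Member = [Date].[Calendar Year].&[2007];
                       -- "current" year on the attribute hierarchy

The member reference follows the key format used for calendar years elsewhere in this chapter.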

Another scenario could be that you want to put the current date as the default member on the Date.Date hierarchy. Try that too, and see that when you use the year 2006 from the Date.Calendar Year hierarchy in the slicer, you get an empty result. Again, the intersection formed a nonexisting coordinate.

To conclude, you should avoid defining default members on complex dimensions. Define them where it is appropriate: on dimensions with a single non-aggregatable attribute (that is, when you set the IsAggregatable property of an attribute to False) or on dimensions with one or more user hierarchies where that non-aggregatable attribute is the top level on each user hierarchy, and where all relationships are well defined.

The Account dimension used in this example is not such a dimension. In order to correct it, two visible attributes should be hidden because they can cause empty results when used in a slicer. Experimenting with a scope might help too, but that adds to the complexity of the solution and hence the initial advice of keeping things simple when using default members should prevail.

Take a look at other dimensions in the Adventure Works DW 2012 database. There you will find good examples of using default members.

Helpful tips

When you're defining the default members in an MDX script, do it at the beginning of the script. This way the calculations that follow can reference them.

In addition, provide a comment explaining which member was chosen to be the default member, and perhaps why. Look back at the code in this recipe to see how it was done.

About the Authors

  • Sherry Li

    Sherry Li is an Analytic Consultant who works for a major financial organization with responsibilities in implementing data warehousing, Business Intelligence, and business reporting solutions. She specializes in automation and optimization of data gathering, storing, and analyzing, and in providing data access so that the business can gain data-driven insights. She especially enjoys sharing her experience and knowledge in data ETL processes, database design, dimensional modeling, and reporting in T-SQL and MDX. She has co-authored two books, the MDX with SSAS 2012 Cookbook and MDX with Microsoft SQL Server 2016 Analysis Services Cookbook, which have helped many data professionals advance their MDX skills in a very short time.

  • Tomislav Piasevoli

    Tomislav Piasevoli is a Business Intelligence (BI) specialist with years of experience working with Microsoft SQL Server Analysis Services (SSAS). He has successfully implemented many still-in-use BI solutions, helped numerous people on the MSDN forum, achieved the highest certification for SQL Server Analysis Services (SSAS Maestro), and shared his expertise in the form of MDX cookbooks.

    Tomislav currently works as a consultant at Piasevoli Analytics company (www.piasevoli.com) together with his brother Hrvoje. They specialize in Microsoft SQL Server Business Intelligence platform, SSAS primarily, and offer their BI skills worldwide.

    In addition to his regular work, Tomislav manages to find the time to present at local conferences or to write an article or two for local magazines. His contribution to the community has been recognized by Microsoft honoring him with the Most Valuable Professional (MVP) award for six consecutive years (2009-2015).
