Chapter 5. Adding Flavor – Creating Calculated Fields
In this chapter, we will be covering the following topics:
Creating string calculations
Creating arithmetic calculations
Creating date calculations
Creating logic calculations
Using Sets in calculations
Understanding Table Calculations
Understanding Level of Detail (LOD) Calculations
Understanding INCLUDE LOD
Understanding EXCLUDE LOD
Understanding FIXED LOD
Understanding how to create and use Parameters
In the previous chapter, we saw how to slice and dice our data. In this chapter, we will focus on creating calculated fields in Tableau to give us more flexibility and power. We will also learn how to make use of some predefined calculations and how one can change the scope and direction of these calculations to in turn affect the final result. We will then learn how to empower the end user by giving them the flexibility to pass values, which can then be called in the outcome by creating parameters.
Understanding how to create and use Calculations
When working with data, we may come across situations where we may want to go beyond what is available from our data. There are times when the fields available in our data won't be enough to fulfill our requirements. For example, our data may have fields such as Selling price and Cost price. However, we may want to find out how much profit we made. Also, we may want to conditionally format our view to highlight losses. In these cases, it makes sense to create computed fields on the data fields and use them in our analysis.
On a broad level, we will classify our calculations into four categories as follows:
String Calculations
Arithmetic Calculations
Date Calculations
Logic Calculations
In order to create a calculated field, we can click on the dropdown or right-click on any Dimension or Measure and select the option of Create | Calculated Field…. Refer to the following screenshot:
Another option to create a calculated field is from the toolbar. Select...
Creating string calculations
There are times when we are required to do some manipulations on fields that are stored as strings. Typical calculations that we may do on strings are as follows:
Concatenating separate strings into a single string or separate a string into smaller substrings
Changing the letter case (upper case to lower case and vice versa)
Extracting parts of the string to find certain substrings or find the position of a particular character or trim the string from a particular point
In the following recipe, let's do some string manipulations to create new calculated fields.
For the following recipe, we will use the Customer Name field from the Orders sheet from Sample - Superstore.xlsx
data and we will continue working in the same workbook, My first Tableau Workbook
.
Let us get started by creating a fresh new sheet.
Let us create a new sheet by pressing Ctrl + M and rename it String Calculation.
We will then drag the Customer Name field from the Dimensions...
Creating arithmetic calculations
In the previous recipe, we looked at String manipulations. In this section, let us focus on manipulations on numbers. However, before we move to arithmetic calculations, we must first understand the difference between aggregated and disaggregated measures. To elaborate on this, let us see an example. The two formulae are SUM(Profit) / SUM(Sales) and SUM(Profit / Sales).
The question is, whether these two formulae will give the same result or will they give different results?
From what we have learnt in school, we know that the Bracket operations or Parentheses will be computed first. We know this methodology as
PEMDAS (Parentheses, Exponents, Multiply, Divide, Add, Subtract). Since many people tend to say brackets instead of parentheses and orders instead of exponents, it is also referred to as
BODMAS (Brackets, Orders, Divide, Multiply, Add, Subtract) or BEDMAS (Brackets, Exponents, Divide, Multiply, Add, Subtract).
Based on this knowledge, we can clearly...
Creating date calculations
Working with Dates is an essential part of any analysis. However, it can also be tricky and messy at times depending how dates are stored in the databases and what manipulation we need to do on them. Typical analysis on Dates is to find out Sales trends over a period of time or what the year over year growth has been. For a customer centric organization, Date calculations will be useful for understanding the customer's buying patterns. Taking this point ahead, let us do some calculations to find out the First purchase date and the Last purchase date of customers. We will then use the Last purchase date to find the out how long it has been since the customers last purchased from us.
Let us see how this can be done in the following recipe.
For this recipe, we will work on the Order Date field and the Customer Name field from the Orders sheet of Sample - Superstore.xlsx
data. We will continue working in our existing workbook. Let's get started with the...
Creating logic calculations
In the previous recipes, we have looked at different calculations such as string, arithmetic, and date calculations. In this section, we will focus on the Logic statements. These are typically IF….ELSE statements or CASE statements.
Imagine having to conditionally format your data to highlight products having actual sales less than the target value. In order to address this requirement, we need to make use of Logic statements. These logic statements will execute a certain expression once a condition is met or else it will execute another expression.
Let's get started with Logic statements in Tableau.
Let's take an example where we will try to conditionally color code/highlight the product Sub-Categories that are having Sales less than 100K. We will use the fields from the Orders sheet of Sample - Superstore.xlsx
data and as usual, continue working in our existing workbook. Let's see how we can achieve the previously stated objective.
Using sets in calculations
In the previous chapter, we learnt what Sets are and how to create and use them. We saw an example of Top 5 customers by Sales. When we use this set, we can either use it as IN/OUT, which is a Boolean output, and which when used in our view, will give us two In and Out headers, or we can use it as Show Members in Set, which will only give us the names of our top five customers and filter the rest of the customers.
Now, what if we don't want to filter any customer, but instead group the customers who don't belong in the top five set (that is, the Out customers) under one header called Others? So, what we essentially want to show is the names of the top five customers by sales, that is, all the In customers and the rest of the customers to be grouped as Others.
Since we already have a set that gives us the customers belonging to the top five list and which customers do not belong to that list, we will use this set and achieve our objective. This is one example of how...
Understanding Table Calculations
Table calculations are one of the advanced levels of calculations available in Tableau. They are also one of the most powerful features of Tableau. The reason for calling them Table Calculations is because we can define the scope and direction of the calculation based on the table/view.
To elaborate this point further, let's take a look at an example. See the following screenshot:
The preceding screenshot shows the value of three products sold across three zones. The information is arranged in a crosstab manner where the zones are placed in the columns and products are placed in the rows. Also shown are the Row Totals, that is, totals for each product across all the zones and Column Totals, that is, totals for each zone across all the products and finally we also have the Grand Total, which is the total for the entire view, that is, across all products and across all the zones.
Now, if we want to show percentages instead of just looking at the absolute numbers...
Understanding Level of Detail (LOD) Calculations
In previous chapters, we learnt about how our Measures are dependent on the Dimensions that are present in our view. These Dimensions act as independent variables, whereas the Measures are dependent on those Dimensions. For example, imagine having a horizontal bar chart that is showing Sales across Region. Since our Orders sheet of Sample - Superstore.xlsx
data has four regions, we will get a bar chart showing four bars and the length of the bar representing the sales across regions. Now, if we get another Dimension, say Category, and place it right after the Region field in the Rows shelf, we will add an additional granularity of Category into the view and now instead of 4 bars, we will get 12 bars as we have three categories in our data and at this point, the Sales will be computed for each Category within a Region. Also, in the same view, if we remove Region, then we will have a bar chart with three bars showing sales across categories...
Understanding INCLUDE LOD
As mentioned earlier, the INCLUDE LOD expression, computes values using the specified dimensions in addition to whatever dimensions are present in the view. These expressions are most useful when including a dimension that is not part of our view as the results are computed at a lower granularity than the visualization's level of detail.
The syntax of using an INCLUDE LOD expression is { INCLUDE [Dimension] : aggregate expression}.
The preceding syntax shows how one can use an INCLUDE expression for a single dimension, however, if there is more than one dimension that we want to include in our calculation, then the syntax can be modified as { INCLUDE [Dimension1], [Dimension2], [Dimension3] : aggregate expression}.
Let us see how we can use the INCLUDE LOD expression in the following recipe.
Let's use the Sub-Category and Sales fields from the Orders sheet of Sample - Superstore.xlsx
data.
Let us create a new sheet by pressing Ctrl + M and rename...
Understanding EXCLUDE LOD
The EXCLUDE LOD expression is just the opposite of the INCLUDE LOD expression. As mentioned earlier, EXCLUDE LOD computes the values by omitting the dimensions from the view level of detail and it is most useful when excluding a dimension as the results are computed at a higher granularity than the visualization's level of detail.
The syntax of using the EXCLUDE LOD expression is { EXCLUDE [Dimension] : aggregate expression}.
The preceding syntax shows how one can use an EXCLUDE expression for a single dimension; however, if there is more than one dimension that we want to include in our calculation, then the syntax can be modified as { EXCLUDE [Dimension1], [Dimension2], [Dimension3] : aggregate expression}.
Let us see how we can use the EXCLUDE LOD expression in the following recipe.
Let us use the Category, Sub-Category, and Sales fields from the Orders sheet of Sample - Superstore.xlsx
data.
Let us start by duplicating the previous sheet...
The FIXED LOD expression, computes values using the specified dimensions without reference to any other dimensions in the view. These expressions are most useful when fixing a dimension regardless of what is and what is not part of our view. In this case, the results are computed at a granularity that is independent of the fields used in the visualization's level of detail.
The syntax of using FIXED LOD expressions is FIXED [Dimension] : aggregate expression}.
The preceding syntax shows how one can use a FIXED expression for a single dimension; however, if there is more than one dimension that we want to include in our calculation, then the syntax can be modified as {FIXED [Dimension1], [Dimension2], [Dimension3] : aggregate expression}.
Let us see how we can use the FIXED LOD expression in the following recipe.
Let us use the Customer Name and Order Date fields from the Orders sheet of Sample - Superstore.xlsx
data to find out how we can compute the first...
Understanding how to create and use Parameters
Being able to create custom calculations is a very useful functionality of any tool. It gives the user an immense flexibility and power to do a lot of things. However, there are times when we would want to go beyond the calculations that are based on static hard coded conditions. So, for example, in the Logic statement that we saw earlier, we had hard coded the Sales Target value to 100K. Now the problem is that as we continue with our business, the 100K target may soon become redundant and the calculations will soon have to be revisited. Plus, hard coding the Target at the development stage may not be a good idea as the end user may have some different values in mind. In this case, it makes sense to have the option where, the end user can pass certain values at the viewing level.
To address the preceding requirement, we will use the parameter functionality provided by Tableau. Parameters are controls that can be given to the end user to dynamically...