You're reading from  Tableau Prep Cookbook

Product type: Book
Published in: Mar 2021
Publisher: Packt
ISBN-13: 9781800563766
Edition: 1st Edition
Author (1)
Hendrik Kleine

Hendrik Kleine is an advanced analytics leader with 15 years of experience in the analytics space, including in data architecture, engineering, and visualization. He specializes in translating vast amounts of data into easy-to-understand visual communications that provide actionable intelligence. He is an avid innovator and a listed author of multiple data-related inventions. Before COVID-19, he was a speaker at the most recent Tableau conference in San Francisco.

Chapter 4: Data Aggregation

Tableau Prep is designed with data preparation for analytics in mind. When it comes to reporting and analytics, more data is not always better, especially if you have a particular report in mind that you want to create. Pre-aggregating your data in a data preparation tool such as Tableau Prep instead of your business intelligence tool may result in significant performance gains when it comes to rendering your report.

In this chapter, you'll find recipes to help you prepare your data for analytics. Aggregation is a key part of data preparation. Aggregating your data appropriately in a Tableau Prep workflow can significantly reduce the output size, and a smaller dataset generally performs better when you connect it to an analytics application, including Tableau Desktop.

In this chapter, we'll cover the following recipes:

  • Determining granularity
  • Aggregating values
  • Using fixed LOD calculations for grouping data
  • Grouping data

Technical requirements

To follow along with the recipes in this chapter, you will require Tableau Prep Builder and Tableau Desktop. We'll use the sample data supplied in the book's GitHub repository.

Determining granularity

One key consideration that is often overlooked is determining the granularity of the data that's needed. For example, when working with geographic data, you may have values for continent, region, country, state, city, ZIP code, street, and so on. But if you're only going to report on country data, you may not need all those other dimensions. Or perhaps you are processing order data; you may want to consider whether you need the details for each individual line item in each individual order – maybe your analysis will be fine with just the total order amount per day. In this recipe, we'll look at a quick method to help reveal the data actually in use in a Tableau Desktop visualization.
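
To make the order example concrete, here is a minimal Python sketch of what reducing granularity means. It is not part of the recipe, and it does not use Tableau Prep; the rows, field order, and the total_per_day helper are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical line-item rows: (order_date, order_id, product, amount).
line_items = [
    ("2021-03-01", "A-1", "Chair", 120.0),
    ("2021-03-01", "A-1", "Desk", 340.0),
    ("2021-03-01", "A-2", "Lamp", 40.0),
    ("2021-03-02", "A-3", "Chair", 120.0),
]


def total_per_day(rows):
    """Collapse line items to a single total amount per order date."""
    totals = defaultdict(float)
    for order_date, _order_id, _product, amount in rows:
        totals[order_date] += amount
    return dict(totals)


daily_totals = total_per_day(line_items)

# Four line-item rows become two daily rows; order and product detail is gone.
print(len(line_items), "rows in,", len(daily_totals), "rows out")
print(daily_totals)
```

The trade-off is visible in the result: the output is smaller, but you can no longer answer questions about individual orders or products, which is why the required granularity should be decided before aggregating.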

Getting ready

To follow along with this recipe, download the Sample Files 4.1 folder from this book's GitHub repository.

How to do it…

Start by opening the Superstore.tflx flow from the Sample Files 4.1 folder in Tableau Prep, then...

Aggregating values

There are several methods to pre-aggregate your data in your Tableau Prep pipeline. Ideally, your data will already be aggregated by your data connection. For example, when connecting to a database, you may be able to write a query that includes a GROUP BY clause so that the data is aggregated before being ingested into Tableau Prep.
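
As a rough illustration of aggregating at the source, the following sketch runs a GROUP BY query against an in-memory SQLite database using Python's standard library. The orders table, its columns, and the sample values are assumptions made up for this example; your own database and query will differ.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("East", 100.0), ("East", 50.0), ("West", 75.0)],
)

# The database returns one row per region instead of one row per order,
# so less data has to be ingested downstream.
query = """
    SELECT region, SUM(sales) AS total_sales
    FROM orders
    GROUP BY region
    ORDER BY region
"""
aggregated = conn.execute(query).fetchall()
conn.close()

print(aggregated)
```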

Often, such an ideal scenario is not available for a variety of reasons, and sometimes it is simply not possible, for example, when connecting to files such as Excel or CSV files.

In this recipe, we'll look at the method most users will prefer for aggregating data in Tableau Prep: the aptly named Aggregate step.

Getting ready

To follow along with this recipe, download the Sample Files 4.2 folder from this book's GitHub repository. In this flow, you'll find a slimmed-down version of the sample Superstore flow provided by Tableau.

The last step in this flow contains more than 20 fields and outputs more...

Using fixed LOD calculations for grouping data

Level of Detail (LOD) calculations are calculation expressions that have been available in Tableau Desktop for some time. An LOD calculation allows you to aggregate your data at different levels of granularity within a single dataset.

For example, you might have a dataset with customer orders, where each row represents a single line item in an order. You might want to aggregate revenue by order, or by customer, without losing the granularity of your data. This is where LOD calculations come into play. In this recipe, you'll create an LOD calculation. In doing so, you'll group your data into distinct buckets and aggregate values in a single step.
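
The idea of aggregating without losing granularity can be sketched in plain Python. This is only an analogy for what a fixed LOD calculation produces, not how Tableau Prep implements it; the rows, field names, and the add_fixed_total helper are hypothetical. In Tableau's calculation language, a comparable expression would look something like {FIXED [Order ID] : SUM([Sales])}, assuming fields with those names.

```python
from collections import defaultdict

# Hypothetical line-item rows, one per product in an order.
rows = [
    {"order_id": "A-1", "product": "Chair", "sales": 120.0},
    {"order_id": "A-1", "product": "Desk", "sales": 340.0},
    {"order_id": "A-2", "product": "Lamp", "sales": 40.0},
]


def add_fixed_total(rows, key, measure, new_field):
    """Attach the per-key sum of a measure to every row, keeping all rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[measure]
    # Each output row is a copy of the input row plus the group-level total.
    return [{**row, new_field: totals[row[key]]} for row in rows]


with_order_total = add_fixed_total(rows, "order_id", "sales", "order_sales")

for row in with_order_total:
    print(row)
```

Unlike a regular aggregation, the row count is unchanged: every line item is still present, and each one now also carries the total for its order.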

Getting ready

To follow along with this recipe, download the Sample Files 4.3 folder from this book's GitHub repository. You must have Tableau Prep Builder version 2020.1.3 or later to use the LOD functionality.

How to do it…

Start by opening Tableau Prep and...

Grouping data

Grouping data in Tableau Prep can be done as part of the Aggregate step, as we've seen in the Aggregating values recipe earlier in this chapter. The function we'll review in this recipe is different, in that it can group values from a single field based on certain criteria.

As an example, values in a Name field might include John Smith and Smith, John. These might refer to the same person, and so we can group them together as John Smith. Performing this type of grouping is key to your data preparation efforts and ensures the downstream analysis does not run into issues with seemingly duplicate names.
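
To show what grouping values within a single field amounts to, here is a small Python sketch that maps the two spellings above to one canonical value. It is a simplified stand-in, not the matching logic Tableau Prep uses; the canonical_name helper and its "Last, First" rule are assumptions for illustration only.

```python
def canonical_name(name):
    """Normalize whitespace and turn 'Last, First' into 'First Last'."""
    name = " ".join(name.split())
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    return name


# Raw values as they might appear in a Name field.
names = ["John Smith", "Smith, John", "Jane Doe"]

# Map each canonical value to the raw values that were grouped into it.
groups = {}
for raw in names:
    groups.setdefault(canonical_name(raw), []).append(raw)

print(groups)
```

A rule this simple will merge distinct people who happen to share a name, so any such grouping should be reviewed before it is applied to real data.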

Getting ready

To follow along with this recipe, download the Sample Files 4.4 folder from this book's GitHub repository.

How to do it…

Start by opening Tableau Prep and connecting to the 2016 Sales.csv file from the Sample Files 4.4 folder, then follow these steps:

  1. Add a clean step to your flow and observe the values...