You're reading from Learning Tableau 2022 - Fifth Edition

Product type: Book
Published in: Aug 2022
Publisher: Packt
ISBN-13: 9781801072328
Edition: 5th Edition
Author: Joshua N. Milligan

Joshua N. Milligan is a Hall of Fame Tableau Zen Master and 2017 Iron Viz Global finalist. His passion is training, mentoring, and helping people gain insights and make decisions based on their data through data visualization using Tableau and data cleaning and structuring using Tableau Prep. He is a principal consultant at Teknion Data Solutions, where he has served clients in numerous industries since 2004.

Structuring Messy Data to Work Well in Tableau

So far, most of the examples we’ve looked at in this book assume that the data used is structured well and is fairly clean. Data in the real world isn’t always so pretty. Maybe it’s messy or it doesn’t have a good structure. It may be missing values or have duplicate values, or it might have the wrong level of detail.

How can you deal with this type of messy data? In the previous chapter, we considered how Tableau’s data model can be used to relate data in different tables. In the next chapter, we will consider Tableau Prep Builder as a robust way to clean and structure data. Much of the information in this chapter will be an essential foundation for working with Tableau Prep Builder.

For now, let’s focus on some of the basic data structures that work well in Tableau and some of the additional techniques you can use to get data into those structures. We’ll keep our discussion limited...

Structuring data for Tableau

We’ve already seen that Tableau can connect to nearly any data source. Whether it’s a built-in direct connection to a file or database, data obtained through a custom Web Data Connector (WDC), or through the Tableau data extract API to generate an extract, no data is off limits. However, there are certain structures that make data easier to work with in Tableau.

There are two keys to ensure a good data structure that works well with Tableau:

  • Every record of a source data connection should be at a meaningful level of detail
  • Every measure contained in the source should match the level of detail of the data source or possibly be at a higher level of detail, but it should never be at a lower level of detail

For example, let’s say you have a table of test scores with one record per classroom in a school. Within the record, you may have three measures: the average GPA for the classroom, the number of students...
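The two keys above can be sketched outside Tableau in a few lines of Python. This is only an illustration: the column names and numbers are made up, and `school_students` is a hypothetical school-level measure stored on classroom-level records. It shows why a measure at the record's own level of detail aggregates cleanly, while a measure at a higher level of detail needs more care.

```python
# Hypothetical data: one record per classroom (the level of detail of the source).
classrooms = [
    {"classroom": "A", "students": 20, "school_students": 70},
    {"classroom": "B", "students": 25, "school_students": 70},
    {"classroom": "C", "students": 25, "school_students": 70},
]

# "students" matches the record's level of detail, so a plain sum is meaningful.
total_students = sum(row["students"] for row in classrooms)

# "school_students" is at a higher level of detail (per school) and is repeated
# on every classroom row, so a plain sum counts it once per classroom.
naive_school_total = sum(row["school_students"] for row in classrooms)

# Taking a single value (here the maximum) avoids the overcount.
safe_school_total = max(row["school_students"] for row in classrooms)

print(total_students, naive_school_total, safe_school_total)
```

The higher-level measure is still usable, but only with an aggregation that does not add up the repeated values.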

The four basic data transformations

In this section, we’ll give you an overview of some basic transformations that can fundamentally change the structure of your data. We’ll start with an overview and then look at some practical examples.

Overview of transformations

In Tableau (and Tableau Prep), there are four basic data transformations. The following definitions broadly apply to most databases and data transformation tools, but there are some details and terminology that are Tableau-specific:

  • Pivots: This indicates the transformation of columns to rows or rows to columns. The latter is possible in Tableau Prep only. The resulting dataset will be narrower and taller with fewer columns and more rows (columns to rows) or wider and shorter with more columns and fewer rows (rows to columns).
  • Unions: This indicates the appending of rows from one table of data to another, with the matching columns aligned together. The resulting data structure is...
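To make the first two transformations concrete, here is a minimal sketch in plain Python rather than in Tableau itself. The helper names `pivot_columns_to_rows` and `union`, and all of the sample data, are hypothetical and chosen only for illustration; Tableau performs these steps through its own interface.

```python
def pivot_columns_to_rows(rows, key, columns, name_field, value_field):
    """Columns-to-rows pivot: emit one output row per (input row, pivoted column)."""
    tall = []
    for row in rows:
        for column in columns:
            tall.append({key: row[key], name_field: column, value_field: row[column]})
    return tall


def union(*tables):
    """Append the rows of each table, aligning columns by name.

    A column missing from a given table is filled with None for its rows.
    """
    columns = []
    for table in tables:
        for row in table:
            for column in row:
                if column not in columns:
                    columns.append(column)
    return [{c: row.get(c) for c in columns} for table in tables for row in table]


# Hypothetical wide table: one column per year.
wide = [
    {"Country": "Aland", "1960": 100, "1961": 110},
    {"Country": "Bland", "1960": 200, "1961": 205},
]
# Narrower and taller: 3 columns and 4 rows instead of 3 columns wide by year.
tall = pivot_columns_to_rows(wide, "Country", ["1960", "1961"], "Year", "Population")

# Hypothetical tables with mostly matching columns.
east = [{"Customer": "Ann", "Sales": 10}]
west = [{"Customer": "Bo", "Sales": 20, "Region": "West"}]
combined = union(east, west)

for row in tall + combined:
    print(row)
```

In the pivot, the former column headers become values in a new `Year` field. In the union, the row count is the sum of the inputs, and unmatched columns are kept with empty values.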

Overview of advanced fixes for data problems

In addition to the techniques that we mentioned earlier in this chapter, there are some additional possibilities to deal with data structure issues.

It is outside the scope of this book to develop these concepts fully. However, with some familiarity with these approaches, you can broaden your ability to deal with challenges as they arise:

  • Custom SQL: This can be used in the data connection to resolve some data problems. Beyond giving a field for a cross-database join, as we saw earlier, custom SQL can be used to radically reshape the data that’s retrieved from the source. Custom SQL is not an option for all data sources, but it is an option for many relational databases. Consider a custom SQL script that takes the wide table of country populations we mentioned earlier in this chapter and restructures it into a tall table:
    SELECT [Country Name],[1960] AS Population, 1960 AS Year 
    FROM Countries 
     
    UNION ALL ...
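The wide-to-tall pattern in that query can be tried without a Tableau connection. The following sketch uses Python's built-in `sqlite3` module with an in-memory database. It assumes a hypothetical `Countries` table with only two year columns and made-up population values, so it illustrates the technique and is not a reconstruction of the full script.

```python
import sqlite3

# In-memory database with a hypothetical wide table: one column per year.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE Countries ("Country Name" TEXT, "1960" INTEGER, "1961" INTEGER)'
)
conn.executemany(
    "INSERT INTO Countries VALUES (?, ?, ?)",
    [("Aland", 100, 110), ("Bland", 200, 205)],
)

# One SELECT per year column; UNION ALL stacks the results into a tall table.
# [1960] in brackets is the column; the bare 1960 is a literal used as the Year.
query = """
SELECT [Country Name], [1960] AS Population, 1960 AS Year FROM Countries
UNION ALL
SELECT [Country Name], [1961] AS Population, 1961 AS Year FROM Countries
ORDER BY 3, 1  -- sort by Year, then Country Name (result column positions)
"""
tall_rows = conn.execute(query).fetchall()

for row in tall_rows:
    print(row)

conn.close()
```

Each additional year column needs its own `SELECT` in the chain, which is why this approach becomes verbose for very wide tables.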

Summary

Up until this chapter, we’d looked at data that was, for the most part, well structured and easy to use. In this chapter, we considered what constitutes a good structure and ways to deal with poorly structured data. A good structure consists of data that has a meaningful level of detail and that has measures that match that level of detail. When measures are spread across multiple columns, we get data that is wide instead of tall.

We also spent some time understanding the basic types of transformation: pivots, unions, joins, and aggregations. Understanding these will be fundamental to solving data structure issues.

You also got some practical experience in applying various techniques to deal with data that has the wrong shape or has measures at the wrong level of detail. Tableau gives us the power and flexibility to deal with some of these structural issues, but it is far preferable to fix a data structure at the source.

In the next chapter, we’ll...
