You're reading from  Learning Tableau 2019 - Third Edition

Product type: Book
Published in: Mar 2019
Publisher: Packt
ISBN-13: 9781788839525
Edition: 3rd Edition
Author (1)
Joshua N. Milligan

Joshua N. Milligan is a Hall of Fame Tableau Zen Master and 2017 Iron Viz Global finalist. His passion is training, mentoring, and helping people gain insights and make decisions based on their data through data visualization using Tableau and data cleaning and structuring using Tableau Prep. He is a principal consultant at Teknion Data Solutions, where he has served clients in numerous industries since 2004.

Cleaning and Structuring Messy Data

So far, most of the examples we've looked at in this book assume that data is structured well and is fairly clean. Data in the real world isn't always so pretty. Maybe it's messy or it doesn't have a good structure. It may be missing values or have duplicate values, or it might be at the wrong level of detail.

How can you deal with this messy data? We'll consider Tableau Prep Builder as a robust way to clean and structure data in the next chapter. For now, let's focus on the capabilities that are native to Tableau Desktop, which itself gives a lot of options and flexibility to deal with data issues. We'll take a look at some of the features and techniques that will enable you to overcome data structure obstacles. We'll also lay a solid foundation of a good data structure. Knowing what data structures work...

Structuring data for Tableau

We've already seen that Tableau can connect to nearly any data source. Whether it's a built-in direct connection, ODBC, or using the Tableau Data Extract API to generate an extract, no data is off limits. However, there are certain structures that make data easier to work with in Tableau.

There are two keys to ensuring a good data structure that works well with Tableau:

  • Every record of a source data connection should be at a meaningful level of detail
  • Every measure contained in the source should match the level of detail or possibly be at a higher level of detail, but should never be at a lower level of detail

For example, let's say you have a table of test scores with one record per classroom in a school. Within the record, you may have three measures: the average GPA for the classroom, the number of students in the class, and the...
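To make the two keys concrete, here is a minimal Python sketch of a classroom-level table like the one just described. The classroom names and all of the numbers are invented for illustration, and only the two measures named above (average GPA and number of students) are included. It shows why the level of detail matters when aggregating: each record's average GPA describes a whole classroom, so a school-wide figure has to weight it by the number of students rather than simply averaging the averages.

```python
# Hypothetical data: one record per classroom (the level of detail).
classrooms = [
    {"classroom": "A", "avg_gpa": 3.0, "students": 10},
    {"classroom": "B", "avg_gpa": 4.0, "students": 30},
]


def naive_average_gpa(rows):
    """Average of the classroom averages; ignores class size."""
    return sum(r["avg_gpa"] for r in rows) / len(rows)


def weighted_average_gpa(rows):
    """School-wide average GPA, weighting each classroom by its student count."""
    total_students = sum(r["students"] for r in rows)
    total_points = sum(r["avg_gpa"] * r["students"] for r in rows)
    return total_points / total_students


print(naive_average_gpa(classrooms))     # 3.5
print(weighted_average_gpa(classrooms))  # 3.75
```

The two results differ because the larger classroom contributes more students to the school-wide average. Knowing that each record is a classroom, not a student, is what tells you which calculation is appropriate.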

Dealing with data structure issues

In some cases, restructuring data at the source is not an option. The source may be secured and read-only, or you might not even have access to the original data and instead receive periodic dumps of data in a specific format. In such cases, there are techniques for dealing with structural issues once you have connected to the data in Tableau.

We'll consider some examples of data structure issues to demonstrate various techniques for handling those issues in Tableau. None of the solutions are the only right way to resolve the given issue. Often, there are several approaches that might work. Additionally, these are only examples of issues you might encounter. Take some time to understand how the proposed solutions build on the foundational principles we've considered in previous chapters and how you can use similar techniques to solve...

Overview of advanced fixes for data problems

In addition to the techniques that we mentioned previously in this chapter, there are some additional possibilities for dealing with data structure issues. It is outside the scope of this book to develop these concepts fully. However, with some familiarity with these approaches, you can broaden your ability to deal with challenges as they arise:

  • Custom SQL can be used in the data connection to resolve some data problems. Beyond giving a field for a cross database join, as we saw previously, custom SQL can be used to radically reshape the data that's retrieved from the source. Custom SQL is not an option for all data sources, but is for many relational databases. Consider a custom SQL script that takes the wide table of country populations we mentioned earlier in this chapter and restructures it into a tall table:
SELECT [Country...
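The script above is cut off in this excerpt, so what follows is not a reconstruction of it. It is a separate, minimal sketch of one common way to reshape a wide table into a tall one with SQL: one SELECT per measure column, stacked with UNION ALL. The table name `population_wide`, the column names, the countries, and the figures are all hypothetical, and the SQL runs against an in-memory SQLite database from Python, so the syntax may differ from what your own data source expects in a Tableau custom SQL connection.

```python
import sqlite3

# Hypothetical wide table: one row per country, one population column per year.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE population_wide (country TEXT, pop_2017 INTEGER, pop_2018 INTEGER)"
)
conn.executemany(
    "INSERT INTO population_wide VALUES (?, ?, ?)",
    [("Atlantis", 100, 110), ("Elbonia", 200, 190)],
)

# One SELECT per year column, stacked with UNION ALL, yields a tall table:
# one row per country per year, with a single population measure.
UNPIVOT_SQL = """
SELECT country, 2017 AS year, pop_2017 AS population FROM population_wide
UNION ALL
SELECT country, 2018 AS year, pop_2018 AS population FROM population_wide
ORDER BY country, year
"""

tall_rows = conn.execute(UNPIVOT_SQL).fetchall()
for row in tall_rows:
    print(row)
```

Each source row becomes two rows in the result, one per year, so the year is now a dimension and population is a single measure at the country-and-year level of detail.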

Summary

Up until this chapter, we'd looked at data which was, for the most part, well-structured and easy to use. In this chapter, we considered what constitutes good structure and ways to deal with poor data structure. A good structure consists of data that has a meaningful level of detail and that has measures that match that level of detail. When measures are spread across multiple columns, we get data that is wide instead of tall.

Now, you've got some experience in applying various techniques to deal with data that has the wrong shape or has measures at the wrong level of detail. Tableau gives us the power and flexibility to deal with some of these structural issues, but it is far preferable to fix a data structure at the source.

In the next chapter, we'll take a brief pause from looking at Tableau Desktop to consider another alternative to tackling challenging...
