
Mastering Tableau 2023 - Fourth Edition

Product type: Book
Published in: Aug 2023
Publisher: Packt
ISBN-13: 9781803233765
Edition: 4th Edition
Author: Marleen Meier

Marleen Meier is an accomplished analyst and author with a passion for statistics and data. By using traditional methodologies and approaches such as Machine Learning and AI, Marleen is dedicated to driving meaningful insights. Currently working as the APAC Data CoE Lead for ABN AMRO Clearing, Marleen is at the forefront of innovation and implementing data-driven strategies in a global financial environment. She has lived and worked in multiple countries, including Germany, the Netherlands, the USA, and Singapore, allowing her to bring a diverse and global perspective to her work. Through her writing and speaking engagements, she aims to empower individuals and organizations to unlock the full potential of their data assets.

Learning about Joins, Blends, and Data Structures

Connecting Tableau to data often means more than connecting to a single table in a single data source. You may need to combine multiple tables from a single data source. For this purpose, we can use joins, which combine a row from one dataset with a row from another dataset when their key values match. You can also join tables from disparate data sources or union data with a similar metadata structure.
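To make the row-matching idea concrete, here is a minimal sketch of an inner join in plain Python. It is not how Tableau implements joins internally; the `orders` and `people` rows, the column names, and the `inner_join` helper are all hypothetical and exist only for illustration.

```python
# Hypothetical sample data; table and column names are illustrative only.
orders = [
    {"order_id": 1, "region": "East", "sales": 100},
    {"order_id": 2, "region": "West", "sales": 250},
    {"order_id": 3, "region": "North", "sales": 75},
]
people = [
    {"region": "East", "manager": "Ana"},
    {"region": "West", "manager": "Ben"},
]


def inner_join(left, right, key):
    """Combine each left row with every right row that shares the same key value."""
    # Index the right-hand rows by key so each lookup is a dictionary access.
    index = {}
    for right_row in right:
        index.setdefault(right_row[key], []).append(right_row)
    # Left rows without a matching key produce no output row (inner join).
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in index.get(left_row[key], [])
    ]


joined = inner_join(orders, people, "region")
for row in joined:
    print(row)
```

The order from the North region has no matching row in `people`, so an inner join drops it; a left-outer join would keep it and leave the manager empty.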

Sometimes, you may need to merge data that does not share a common row-level key. If you matched two such datasets row by row, as a join does, you would duplicate data, because one dataset is far more detailed (for example, it contains cities) than the other (which might contain countries). In such cases, you will need to blend the data. This functionality allows you to, for example, show the count of cities per country without changing the city dataset to a country level.
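The difference between the two approaches can be sketched in a few lines of Python. This only imitates the idea of aggregating the detailed source before matching; it is not Tableau's blending engine, and the city and country rows (including the population figures) are made-up illustrative values.

```python
from collections import Counter

# Hypothetical data: the city table is more detailed than the country table.
cities = [
    {"country": "Germany", "city": "Berlin"},
    {"country": "Germany", "city": "Hamburg"},
    {"country": "Netherlands", "city": "Amsterdam"},
]
countries = [
    {"country": "Germany", "population_m": 84},      # illustrative figure
    {"country": "Netherlands", "population_m": 18},  # illustrative figure
]

# Row-level join: every country row is repeated once per matching city,
# so summing a country-level measure afterwards double counts it.
joined = [
    {**city, **country}
    for city in cities
    for country in countries
    if city["country"] == country["country"]
]
joined_population = sum(row["population_m"] for row in joined)

# Blend-style merge: aggregate the detailed source to the shared dimension
# first, then attach the aggregate to the country rows.
city_counts = Counter(city["country"] for city in cities)
blended = [
    {**country, "city_count": city_counts.get(country["country"], 0)}
    for country in countries
]
blended_population = sum(row["population_m"] for row in blended)

print(joined_population, blended_population)
```

In the joined result, Germany's population is counted twice because Germany has two city rows; in the blended result, each country appears once and still carries its city count.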

Also, you may find...

Relationships

Although this chapter will primarily focus on joins, blends, and the manipulation of data structures, let’s begin with an introduction to relationships: a new functionality available since Tableau 2020.2, and one that the Tableau community has been waiting a long time for. It is the new default option in the data canvas; therefore, we will first investigate relationships, which belong to the logical layer of the data model, before diving deeper into the join and union functionalities that operate on the physical layer.

To read all about the physical and logical layers of Tableau’s data model, visit the Tableau help pages: https://help.tableau.com/current/online/en-us/datasource_datamodel.htm.

For now, you can think of the logical layer as being able to count smarter. Imagine a customer table with unique customer names and an order table with hundreds of recorded orders. In a logical layer, a customer name with multiple orders (thus multiple...

Joins

This book assumes basic knowledge of joins, specifically inner, left-outer, right-outer, and full-outer joins. If you are not familiar with the basics of joins, consider taking W3Schools’ SQL tutorial at https://www.w3schools.com/sql/default.asp. The basics are not difficult, so it won’t take you long to get up to speed.

The following screenshot shows three related datasets:

Figure 4.4: Related datasets

To better understand what Tableau does to your data when joining multiple tables, let us explore join queries.

Join queries

The following screenshot is a representation of the joined datasets in Figure 4.4:

Figure 4.5: Join culling

The preceding screenshot represents an inner join between the Orders and Returns tables on the common key, Order ID. Orders and People are joined on the common key, Region.
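As a rough illustration of what such a join looks like in SQL, the sketch below builds three tiny tables in an in-memory SQLite database and joins them on the keys named above. The sample rows, the extra columns (Sales, Returned, Person), and the choice of an inner join between Orders and People are assumptions made for this example; the query is not the one Tableau generates.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Orders  ("Order ID" TEXT, Region TEXT, Sales REAL);
    CREATE TABLE Returns ("Order ID" TEXT, Returned TEXT);
    CREATE TABLE People  (Region TEXT, Person TEXT);

    INSERT INTO Orders  VALUES ('A-1', 'East', 100.0),
                               ('A-2', 'West', 250.0),
                               ('A-3', 'East', 75.0);
    INSERT INTO Returns VALUES ('A-1', 'Yes'), ('A-3', 'Yes');
    INSERT INTO People  VALUES ('East', 'Ana'), ('West', 'Ben');
    """
)

# Orders joined to Returns on Order ID, and to People on Region.
query = """
SELECT o."Order ID", o.Region, p.Person, o.Sales
FROM Orders AS o
INNER JOIN Returns AS r ON o."Order ID" = r."Order ID"
INNER JOIN People  AS p ON o.Region = p.Region
ORDER BY o."Order ID"
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only orders that also appear in Returns survive the first inner join, so order A-2 is absent from the result.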

To better understand what has just been described, we will continue with a join culling exercise...

Unions

Sometimes you might want to analyze data that has the same metadata structure but is stored in different files, for example, sales data from multiple years, different months, or different countries. Instead of copying and pasting the data, you can union it. We already touched upon this topic in Chapter 3, Using Tableau Prep Builder; in short, a union appends new rows of data to existing columns with the same header. For the following exercise, we will use FIFA data (from the PlayStation game, not the World Cup). The data is from Kaggle (https://www.kaggle.com/datasets/stefanoleone992/fifa-22-complete-player-dataset?resource=download) and ships in multiple CSVs; each CSV contains data for one year, with male and female players split into separate files.
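The appending behavior of a union can be sketched outside Tableau as follows. The file names, column names, and player rows are hypothetical placeholders rather than the actual Kaggle data, and the `union` helper is an illustration, not a Tableau API.

```python
import csv
import io

# Hypothetical stand-ins for two yearly CSV files with identical headers.
files = {
    "players_2021.csv": "short_name,overall\nPlayer A,90\nPlayer B,88\n",
    "players_2022.csv": "short_name,overall\nPlayer A,91\nPlayer C,85\n",
}


def union(named_files):
    """Append the rows of all files that share the same header."""
    combined_rows = []
    expected_header = None
    for name, text in named_files.items():
        reader = csv.DictReader(io.StringIO(text))
        if expected_header is None:
            expected_header = reader.fieldnames
        elif reader.fieldnames != expected_header:
            # This sketch is strict: mismatched headers are rejected.
            raise ValueError(f"{name} has a different header")
        for row in reader:
            # Record the origin so each row stays traceable after the union.
            row["source_file"] = name
            combined_rows.append(row)
    return combined_rows


combined = union(files)
for row in combined:
    print(row)
```

Rejecting mismatched headers is a design choice of this sketch; it keeps the example short and makes structural differences between files visible immediately.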

For our analysis, we want to combine all the files into one. Hence, we need to union, by taking the following steps:

  1. Download the CSV files from GitHub (https://github.com/PacktPublishing/Mastering-Tableau-2023-Fourth...

Blends

Relationships have made data blending less necessary, and blending can now be seen as legacy functionality. But for the sake of completeness, and for older Tableau versions (below 2020.2), let’s consider a summary of data blending in the following sections. In a nutshell, data blending allows you to merge multiple disparate data sources into a single view. The following four points cover the essentials of data blending:

  • Data blending is typically used to merge data from multiple data sources. Although as of Tableau 10, joins are possible between multiple data sources, there are still cases when data blending is the only feasible option to merge data from two or more sources. In the following sections, we will see a practical example that demonstrates such a case.
  • Data blending requires a shared dimension. A date dimension is often a viable candidate for blending multiple data sources.
  • Data blending aggregates and then...

Understanding data structures

The right data structure is not easily definable. True, there are ground rules. For instance, tall data is generally better than wide data. A wide dataset with lots of columns can be difficult to work with, whereas the same data structured in a tall format with fewer columns but more rows is usually easier to work with.

But this is not always the case! Some business questions are more easily answered with wide data structures. And that is the crux of the matter. Business questions determine the right data structure. If one structure answers all questions, great! However, your questions may require multiple data structures. The pivot feature in Tableau helps you adjust data structures on the fly to answer different business questions.
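The difference between wide and tall data, and what a pivot does to move from one to the other, can be shown with a small Python sketch. The sales figures, the column names, and the `pivot_to_tall` helper are hypothetical; the code mirrors the idea of pivoting, not Tableau's own feature.

```python
# Hypothetical wide data: one column per year.
wide = [
    {"region": "East", "2021": 100, "2022": 120},
    {"region": "West", "2021": 90, "2022": 95},
]


def pivot_to_tall(rows, id_field, value_fields, name_col, value_col):
    """Turn each listed column into its own row (wide to tall)."""
    return [
        {id_field: row[id_field], name_col: field, value_col: row[field]}
        for row in rows
        for field in value_fields
    ]


tall = pivot_to_tall(wide, "region", ["2021", "2022"], "year", "sales")
for row in tall:
    print(row)
```

The wide table has two rows and three columns; the tall version has four rows with the columns region, year, and sales, which makes it easier to filter or aggregate by year.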

Before beginning this exercise, make sure you understand the following points:

  • Pivoting in Tableau is limited to Excel, text files, and Google Sheets; for other sources, you must use Custom SQL or Tableau Prep.
  • ...

Summary

We began this chapter with an introduction to relationships, followed by a discussion on joins, and discovered the queries Tableau uses to generate the respective data. Unions come in handy if identically formatted data, stored in multiple sheets or data sources, needs to be appended.

Then, we reviewed data blending to clearly understand how it differs from joining. We discovered that the primary limitation in data blending is that no dimensions are allowed from a secondary source; however, we also discovered that there are exceptions to this rule. We also discussed scaffolding, which can make data blending surprisingly fruitful.

Finally, we discussed data structures and learned how pivoting can make difficult or impossible visualizations easy. Having completed our second data-centric discussion, in the next chapter, we will discuss table calculations, partitioning, and addressing.

Learn more on Discord

To join the Discord community for this book – where...
