You're reading from Mastering Tableau 2023 - Fourth Edition

Product type Book

Published in Aug 2023

Publisher Packt

ISBN-13 9781803233765

Pages 684 pages

Edition 4th Edition

Concepts

Business Intelligence

Author (1):

Marleen Meier

Table of Contents (19) Chapters

Preface

1. Reviewing the Basics

2. Getting Your Data Ready

3. Using Tableau Prep Builder

4. Learning about Joins, Blends, and Data Structures

5. Introducing Table Calculations

6. Utilizing OData, Data Densification, Big Data, and Google BigQuery

7. Practicing Level of Detail Calculations

8. Going Beyond the Basics

9. Working with Maps

10. Presenting with Tableau

11. Designing Dashboards and Best Practices for Visualizations

12. Leveraging Advanced Analytics

13. Improving Performance

14. Exploring Tableau Server and Tableau Cloud

15. Integrating Programming Languages

16. Developing Data Governance Practices

17. Other Books You May Enjoy

18. Index

Understanding Hyper

In this section, we will explore Tableau’s data-handling engine and how it enables structured yet organic data mining processes in enterprises. Since the release of Tableau 10.5, we can make use of Hyper, a high-performance database engine that lets us query source data faster than ever before. Hyper is often not well understood, even by advanced developers, because it is not a visible part of day-to-day work; however, if you want to truly grasp how to prepare data for Tableau, understanding it is crucial.

Hyper started as a research project at the Technical University of Munich in 2008. In 2016, it was acquired by Tableau and became Tableau’s dedicated data engine group, keeping its base and employees in Munich. When it was introduced in Tableau 10.5, Hyper replaced the earlier data-handling engine only for extracts. It is still true that live connections are not touched by Hyper, but Tableau Prep Builder now runs on the Hyper engine too, with more use cases to follow. As stated on tableau.com, “Hyper can slice and dice massive volumes of data in seconds, you will see up to 5X faster query speed and up to 3X faster extract creation speed.” And if you still can’t get enough, there is always the option to use Hyper through API calls in your preferred programming language: https://help.tableau.com/current/api/hyper_api/en-us/docs/hyper_api_reference.html.

But what makes Hyper so fast? Let’s have a look under the hood!

The Tableau data-handling engine

The vision shared by the founders of Hyper was to create a high-performing, next-generation database—one system, one state, no trade-offs, and no delays. And it worked—today, Hyper can serve general database purposes, data ingestion, and analytics at the same time.

Memory prices have decreased exponentially. CPUs have followed a similar path: transistor counts increased according to Moore’s Law, while other features stagnated. In short, memory is cheap, but processing still needs to be improved.

Moore’s Law is the observation made by Intel co-founder Gordon Moore that the number of transistors on a chip doubles every two years while the costs are halved. Information on Moore’s Law can be found on Investopedia at https://www.investopedia.com/terms/m/mooreslaw.asp.
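As a small worked example of the doubling rule described above, the following Python sketch projects a transistor count forward in time. The function name `projected_transistors` and the starting figure are invented for illustration; the two-year doubling period is the one quoted in the text.

```python
def projected_transistors(initial_count, years, doubling_period_years=2.0):
    """Project a transistor count forward, assuming one doubling per period."""
    doublings = years / doubling_period_years
    return initial_count * 2 ** doublings


# Hypothetical starting point: a chip with 1,000 transistors.
after_ten_years = projected_transistors(1_000, years=10)
```

Ten years correspond to five doublings, so the count grows by a factor of 2^5 = 32.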

While experimenting with Hyper, the founders measured that handwritten C code ran faster than any existing database engine, so they came up with the idea of transforming Tableau queries into C code and optimizing that code along the way, all behind the scenes, so that the Tableau user won’t notice it. This translation and optimization come at a cost: traditional database engines can start executing a query immediately, whereas Tableau first needs to translate the query into code, optimize that code, and compile it into machine code before it can be executed. So the big question is whether this is still faster. As proven by many tests on Tableau Public and other workbooks, the answer is yes!
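To make the trade-off more tangible, here is a minimal Python sketch, not Hyper’s actual implementation. It contrasts interpreting a query description row by row with generating specialized source code for the query once and compiling it before running it. The query shape, the names `interpret` and `compile_query`, and the sample rows are all assumptions made for this illustration, and Python’s built-in `compile` merely stands in for the machine-code step.

```python
import operator

# Hypothetical sample rows; the column names are invented for this sketch.
ROWS = [
    {"region": "East", "sales": 120.0},
    {"region": "West", "sales": 80.0},
    {"region": "East", "sales": 45.5},
]

OPS = {"==": operator.eq, ">": operator.gt, "<": operator.lt}


def interpret(rows, column, op, value, measure):
    """Interpreted execution: look up the operator again for every row."""
    total = 0.0
    for row in rows:
        if OPS[op](row[column], value):
            total += row[measure]
    return total


def compile_query(column, op, value, measure):
    """Compiled execution: generate specialized source once, then compile it."""
    if op not in OPS:
        raise ValueError(f"unsupported operator: {op}")
    source = (
        "def run(rows):\n"
        "    total = 0.0\n"
        "    for row in rows:\n"
        f"        if row[{column!r}] {op} {value!r}:\n"
        f"            total += row[{measure!r}]\n"
        "    return total\n"
    )
    namespace = {}
    # The up-front cost: translate and compile before anything is executed.
    exec(compile(source, "<generated query>", "exec"), namespace)
    return namespace["run"]


run = compile_query("region", "==", "East", "sales")
compiled_total = run(ROWS)
interpreted_total = interpret(ROWS, "region", "==", "East", "sales")
```

Both paths return the same total; the compiled variant pays a one-time generation cost and then runs code in which the column names and the comparison are fixed, which is the general idea the paragraph above describes.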

Furthermore, if a query is estimated to run faster without compilation to machine code, Tableau has its own virtual machine (VM) on which the query is executed right away. In addition, Hyper can utilize 99% of available CPU computing power, whereas other parallel processes can only utilize 29%. This is due to the unique and innovative technique of morsel-driven parallelization.

For those of you who want to know more about morsel-driven parallelization, a paper that later served as a baseline for the Hyper engine can be found at https://15721.courses.cs.cmu.edu/spring2016/papers/p743-leis.pdf.
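The core idea of morsel-driven parallelization can be sketched in a few lines of Python. The input is cut into small chunks (morsels), and each worker thread pulls the next morsel as soon as it finishes the previous one, so that faster workers simply take on more work. This is a simplified illustration rather than Hyper’s implementation: the function name `parallel_sum`, the morsel size, and the worker count are assumptions, and because of Python’s global interpreter lock the threads show the scheduling pattern only, not a real speed-up.

```python
import queue
import threading

MORSEL_SIZE = 1000  # assumed morsel size for this sketch


def parallel_sum(values, workers=4, morsel_size=MORSEL_SIZE):
    """Sum `values` by letting workers pull small morsels from a shared queue."""
    morsels = queue.Queue()
    for start in range(0, len(values), morsel_size):
        morsels.put((start, min(start + morsel_size, len(values))))

    # One private accumulator per worker, so no locking is needed on results.
    partials = [0] * workers

    def work(worker_id):
        while True:
            try:
                start, end = morsels.get_nowait()
            except queue.Empty:
                return  # no morsels left, this worker is done
            partials[worker_id] += sum(values[start:end])

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(partials)


total = parallel_sum(list(range(10_000)))
```

Because work is handed out in small pieces at runtime instead of being split into one fixed share per thread up front, no worker sits idle while another is still busy with an oversized share.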

If you want to know more about the Hyper engine, I highly recommend the following video at https://youtu.be/h2av4CX0k6s.

Hyper parallelizes three steps of traditional data warehousing operations:

Transactions and Continuous Data Ingestion (Online Transaction Processing, or OLTP)
Analytics (Online Analytical Processing, or OLAP)
Beyond Relational (Online Beyond Relational Processing, or OBRP)

Executing those steps simultaneously makes Hyper more efficient and more performant, as opposed to traditional systems where those three steps are separated and executed one after the other.

To sum up, Hyper is a highly specialized database engine that allows us as users to get the best out of our queries. If you recall, in Chapter 1, Reviewing the Basics, we already saw that every change on a sheet or dashboard, including dragging and dropping pills, applying filters, and editing calculated fields, among others, is translated into a query. Those queries are pretty much SQL lookalikes; however, in Tableau we call the querying engine VizQL.

VizQL, another hidden gem in Tableau Desktop, is responsible for visualizing data in a chart format and is fully executed in memory. The advantage is that no additional space is required on the database side. VizQL is generated when a user places a field on a shelf. It is then translated into SQL, MDX, or Tableau Query Language (TQL) and passed to the backend data source through a driver.
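The translation step described above can be pictured with a toy example. The following Python sketch is not VizQL and does not reflect how Tableau generates queries internally; the function `build_sql`, the `orders` table, and its columns are hypothetical. It only shows the general pattern of turning a description of a view (one dimension, one aggregated measure) into SQL text that is then passed to a data source, here an in-memory SQLite database.

```python
import sqlite3


def build_sql(table, dimension, measure, aggregation="SUM"):
    """Turn a 'one dimension, one aggregated measure' view into a SQL string."""
    if aggregation not in {"SUM", "AVG", "MIN", "MAX", "COUNT"}:
        raise ValueError(f"unsupported aggregation: {aggregation}")
    return (
        f'SELECT "{dimension}", {aggregation}("{measure}") '
        f'FROM "{table}" GROUP BY "{dimension}" ORDER BY "{dimension}"'
    )


# Hypothetical data source standing in for the backend database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("East", 120.0), ("West", 80.0), ("East", 45.5)],
)

# Placing 'region' and SUM('sales') on the view becomes one generated query.
sql = build_sql("orders", dimension="region", measure="sales")
result = conn.execute(sql).fetchall()
```

Each change to the view would produce a new query string in the same way, which is why seemingly small interactions on a sheet can trigger fresh work in the data source.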

Hyper takeaways

This overview of the Tableau data-handling engine demonstrates a flexible approach to interfacing with data. Knowledge of the data-handling engine can reduce data preparation and data modeling efforts, thus helping us streamline the overall data mining life cycle. Don’t worry too much about data types and data that can be calculated based on the fields you have in your database. Tableau can do all the work for you in this respect. In the next section, we will discuss what you should consider from a data source perspective.
