You're reading from Mastering Tableau 2023 - Fourth Edition

Product typeBook

Published inAug 2023

PublisherPackt

ISBN-139781803233765

Edition4th Edition

Concepts

Business Intelligence

Author (1)

Marleen Meier

Understanding Hyper

In this section, we will explore Tableau’s data-handling engine, and how it enables structured yet organic data mining processes in enterprises. Since the release of Tableau 10.5, we can make use of Hyper, a high-performing database, allowing us to query source data faster than ever before. Hyper is usually not well understood, even by advanced developers, because it’s not an overt part of day-to-day activities; however, if you want to truly grasp how to prepare data for Tableau, this understanding is crucial.

Hyper originally started as a research project at the University of Munich in 2008. In 2016, it was acquired by Tableau and appointed as the dedicated data engine group of Tableau, maintaining its base and employees in Munich. Initially in Tableau 10.5, Hyper replaced the earlier data-handling engine only for extracts. It is still true that live connections are not touched by Hyper, but Tableau Prep Builder now runs on the Hyper engine too, with more use cases to follow. As stated on tableau.com, “Hyper can slice and dice massive volumes of data in seconds, you will see up to 5X faster query speed and up to 3X faster extract creation speed.” And if you still can’t get enough, there is always the option to use Hyper through API calls in your preferred programming language: https://help.tableau.com/current/api/hyper_api/en-us/docs/hyper_api_reference.html.

But what makes Hyper so fast? Let’s have a look under the hood!

The Tableau data-handling engine

The vision shared by the founders of Hyper was to create a high-performing, next-generation database—one system, one state, no trade-offs, and no delays. And it worked—today, Hyper can serve general database purposes, data ingestion, and analytics at the same time.

Memory prices have decreased exponentially. The same goes for CPUs; transistor counts increased according to Moore’s law, while other features stagnated. Memory is cheap but processing still needs to be improved.

Moore’s Law is the observation made by Intel co-founder Gordon Moore that the number of transistors on a chip doubles every two years while the costs are halved. Information on Moore’s Law can be found on Investopedia at https://www.investopedia.com/terms/m/mooreslaw.asp.

While experimenting with Hyper, the founders measured that handwritten C code is faster than any existing database engine, so they came up with the idea to transform Tableau queries into C code and optimize it simultaneously, all behind the scenes, so the Tableau user won’t notice it. This translation and optimization come at a cost; traditional database engines can start executing code immediately. Tableau needs to first translate queries into code, optimize that code, then compile it into machine code, after which it can be executed. The big question is, is it still faster? As proven by many tests on Tableau Public and other workbooks, the answer is yes!

Furthermore, if there is a query estimated to be faster if executed without the compilation to machine code, Tableau has its own virtual machine (VM) on which the query will be executed right away. And next to this, Hyper can utilize 99% of available CPU computing power, whereas other parallel processes can only utilize 29% of available CPU compute. This is due to the unique and innovative technique of morsel-driven parallelization.

For those of you that want to know more about morsel-driven parallelization, a paper, which later on served as a baseline for the Hyper engine, can be found at https://15721.courses.cs.cmu.edu/spring2016/papers/p743-leis.pdf.

If you want to know more about the Hyper engine, I highly recommend the following video at https://youtu.be/h2av4CX0k6s.

Hyper parallelizes three steps of traditional data warehousing operations:

Transactions and Continuous Data Ingestion (Online Transaction Processing, or OLTP)
Analytics (Online Analytical Processing, or OLAP)
Beyond Relational (Online Beyond Relational Processing, or OBRP)

Executing those steps simultaneously makes Hyper more efficient and more performant, as opposed to traditional systems where those three steps are separated and executed one after the other.

To sum up, Hyper is a highly specialized database engine that allows us as users to get the best out of our queries. If you recall, in Chapter 1, Reviewing the Basics, we already saw that every change on a sheet or dashboard, including drag and drop pills, filters, and calculated fields, among others, is translated into a query. Those queries are pretty much SQL lookalikes; however, in Tableau we call the querying engine VizQL.

VizQL, another hidden gem on your Tableau Desktop, is responsible for visualizing data in a chart format and is fully executed in memory. The advantage is that no additional space on the database side is required here. VizQL is generated when a user places a field on a shelf. VizQL is then translated into SQL, MDX, or Tableau Query Language (TQL) and passed to the backend data source with a driver.

Hyper takeaways

This overview of the Tableau data-handling engine demonstrates a flexible approach to interfacing with data. Knowledge of the data-handling engine can reduce data preparation and data modeling efforts, thus helping us streamline the overall data mining life cycle. Don’t worry too much about data types and data that can be calculated based on the fields you have in your database. Tableau can do all the work for you in this respect. In the next section, we will discuss what you should consider from a data source perspective.

You have been reading a chapter from

Mastering Tableau 2023 - Fourth Edition

Published in: Aug 2023Publisher: PacktISBN-13: 9781803233765

A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.

undefined

Unlock this book and the full library FREE for 7 days

Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of

Start free trial

Renews at $15.99/month. Cancel anytime

Author (1)

Marleen Meier

Marleen Meier is an accomplished analyst and author with a passion for statistics and data. By using traditional methodologies and approaches such as Machine Learning and AI, Marleen is dedicated to driving meaningful insights. Currently working as the APAC Data CoE Lead for ABN AMRO Clearing, Marleen is at the forefront of innovation and implementing data-driven strategies in a global financial environment. She has lived and worked in multiple countries, including Germany, the Netherlands, the USA, and Singapore, allowing her to bring a diverse and global perspective to her work. Through her writing and speaking engagements, she aims to empower individuals and organizations to unlock the full potential of their data assets.
Read more about Marleen Meier

Personalised recommendations for you

Based on your interests and search pattern

Et al.

Ever wonder why speech recognition systems don't understand the Scottish accent, or what would happen if an astronaut only ate mac 'n' cheese, or other spurious reflections you'd have at a bar? We did, then collated those deliberations into absurd research articles with fake figures and methodologies inspired by even more fictionally absurd studies.

BookAug 2023230 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages4

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages1

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Mastering Tableau 2023

This book is a comprehensive resource to mastering your Tableau skills and becoming a BI expert. As you progress, you will learn how to build advanced dashboards and improve your storytelling to derive key business insight, as well as make you well-versed with advanced functionalities of Tableau in the business intelligence domain.

BookAug 2023684 pages

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages5

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages2

Data Engineering with AWS

Embark on a journey to master data engineering pipelines on AWS! Our book offers a hands-on experience of AWS services for ingesting, transforming, and consuming data. Whether you're an absolute beginner or someone with basic data engineering experience, this guide is an indispensable resource.

BookOct 2023636 pages5

Modern Data Architecture on AWS

Every organization wants an agile, performant, and cost-effective data platform that meets all their current and future business needs. Purpose-built AWS analytics services and their features play a big part in building such a modern data platform. This book brings to you all the design and architectural patterns that’ll help you achieve this goal.

BookAug 2023420 pages5

Practical Guide to Applied Conformal Prediction in Python

Discover the power of Conformal Prediction with the "Practical Guide to Applied Conformal Prediction in Python." Master the latest techniques to quantify uncertainty in machine learning and computer vision models, and seamlessly apply them to your industry applications.

BookDec 2023240 pages

TinyML Cookbook

With over 70 project-based recipes, the TinyML Cookbook is a practical guide that will help you to get the most out of your microcontrollers. It provides a comprehensive understanding of the theoretical foundations while giving you hands-on experience training ML models for deployment on Arduino Nano 33 BLE Sense, Raspberry Pi Pico, and SparkFun RedBoard Artemis Nano microcontrollers.

BookNov 2023664 pages