Reader small image

You're reading from  Data Modeling with Snowflake

Product typeBook
Published inMay 2023
PublisherPackt
ISBN-139781837634453
Edition1st Edition
Right arrow
Author (1)
Serge Gershkovich
Serge Gershkovich
author image
Serge Gershkovich

Serge Gershkovich is a seasoned data architect with decades of experience designing and maintaining enterprise-scale data warehouse platforms and reporting solutions. He is a leading subject matter expert, speaker, content creator, and Snowflake Data Superhero. Serge earned a bachelor of science degree in information systems from the State University of New York (SUNY) Stony Brook. Throughout his career, Serge has worked in model-driven development from SAP BW/HANA to dashboard design to cost-effective cloud analytics with Snowflake. He currently serves as product success lead at SqlDBM, an online database modeling tool.
Read more about Serge Gershkovich

Right arrow

Modeling Hierarchies

In the previous chapter, we learned about the methods and techniques for exploring semi-structured data in Snowflake. While not everyone works with operational web or app data such as JSON, there is another type of (semi-)structured data that exists across all organizations: hierarchies. Every company operates with hierarchical entities such as org levels (tiers of management and the reporting relationships between managers and employees) or calendar dimensions (rollups of days, months, fiscal periods, and years).

Whether formally maintained or naturally occurring, hierarchies are used within an organization to organize entities into meaningful groups and subgroups to facilitate rollups or drill-downs in data analysis. Besides aiding in the analysis of facts, hierarchies themselves can be examined to help organizations understand how they are structured, how they function, and how they can improve their performance by eliminating operational bottlenecks.

...

Technical requirements

The scripts used to instantiate and load the examples in this chapter are available in the following GitHub repo: https://github.com/PacktPublishing/Data-Modeling-with-Snowflake/tree/main/ch16. While key sections of this script will be highlighted in this chapter, please refer to the ch_16_hierarchies.sql file for the complete code required to load, query, and transform semi-structured data, as it is too long to reprint here in full.

Understanding and distinguishing between hierarchies

A hierarchy is a system in which people, objects, or concepts are organized into a tree-like structure, with each level representing a different category or grouping of data. In modeling, hierarchies can be thought of as a series of descending one-to-many relationships.

At the top of the hierarchy sits the root node, which contains child nodes. Each child node represents a subcategory of the data contained in the parent node and may, in turn, have its own child nodes. This arrangement of nodes and their relationships is often called a tree structure or diagram. If we were to proverbially chop down such a tree and observe it laterally, we would see the hierarchy in its relational form.

Figure 16.1 – A hierarchy seen in a tree (left) and relational format (right)

Figure 16.1 – A hierarchy seen in a tree (left) and relational format (right)

Hierarchies fall into three general categories depending on the variability in their levels. Let’s look at each of these categories...

Maintaining hierarchies in Snowflake

As we have seen, a hierarchy is a grouping of classifications and relationships. It is an adjacent but separate dimension to the entity that it organizes. As a standalone dimension, hierarchies are subject to the same changes in business rules as any master data. However, before we attend to change-tracking, we must get comfortable traversing a ragged hierarchy’s recursive relationships.

Once again, Snowflake has us covered with native features that allow us to easily map out the paths and levels of a ragged hierarchy.

Recursively navigating a ragged hierarchy

You have been put in charge of the pirate frigate Queen Anne’s Revenge. Its logbook shows all crew members, ranks, and direct reports. The details are listed as follows:

Figure 16.7 – A ragged hierarchy contained in the pirate dimension

Figure 16.7 – A ragged hierarchy contained in the pirate dimension

As you review the records, you realize that this list makes it difficult to make sense of the complex...

Summary

From how employees are organized to how customers or products are segmented, hierarchies exist in every organization. By learning how to model hierarchies as business dimensions, organizations can analyze and maintain them as they would any other entity. However, not all hierarchies behave the same way, and learning to understand their differences is the key to modeling them accordingly.

The biggest difference between hierarchies is the degree of variance between their members. Fixed and slightly ragged hierarchies have a set number of levels corresponding to their attributes’ natural grouping. Ragged hierarchies, such as org levels, can vary in depth and the number of branches for each node. Because of their tree-like structure, ragged hierarchies present a challenge for relational databases that rely on a columnar structure.

Using native extensions to standard SQL, Snowflake allows users to easily perform complex operations, such as recursively traversing a tree...

lock icon
The rest of the chapter is locked
You have been reading a chapter from
Data Modeling with Snowflake
Published in: May 2023Publisher: PacktISBN-13: 9781837634453
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Author (1)

author image
Serge Gershkovich

Serge Gershkovich is a seasoned data architect with decades of experience designing and maintaining enterprise-scale data warehouse platforms and reporting solutions. He is a leading subject matter expert, speaker, content creator, and Snowflake Data Superhero. Serge earned a bachelor of science degree in information systems from the State University of New York (SUNY) Stony Brook. Throughout his career, Serge has worked in model-driven development from SAP BW/HANA to dashboard design to cost-effective cloud analytics with Snowflake. He currently serves as product success lead at SqlDBM, an online database modeling tool.
Read more about Serge Gershkovich