Reader small image

You're reading from  Data Modeling with Snowflake

Product typeBook
Published inMay 2023
PublisherPackt
ISBN-139781837634453
Edition1st Edition
Right arrow
Author (1)
Serge Gershkovich
Serge Gershkovich
author image
Serge Gershkovich

Serge Gershkovich is a seasoned data architect with decades of experience designing and maintaining enterprise-scale data warehouse platforms and reporting solutions. He is a leading subject matter expert, speaker, content creator, and Snowflake Data Superhero. Serge earned a bachelor of science degree in information systems from the State University of New York (SUNY) Stony Brook. Throughout his career, Serge has worked in model-driven development from SAP BW/HANA to dashboard design to cost-effective cloud analytics with Snowflake. He currently serves as product success lead at SqlDBM, an online database modeling tool.
Read more about Serge Gershkovich

Right arrow

Modeling Slowly Changing Dimensions

In Chapter 7, Putting Conceptual Modeling into Practice, we were introduced to database facts and dimensions. While facts capture the transactions of business operations, dimensions help give those transactions meaning by providing descriptive attributes, groupings, and other contextual details. Without careful curation and maintenance of dimension tables, databases would be like 1950s police dramas (just the facts, ma’am), lacking all color and making meaningful analysis impossible.

Dimensions shed light on the nature of entities in a data model, providing details such as a customer’s billing address or a product’s description. However, entity details are constantly in flux in the fast-paced business world—customers relocate, and products gain new features. A data warehouse must be able to keep up with the steady stream of changes and allow users to quickly pivot between the latest state of the world and a historical...

Technical requirements

The scripts used to instantiate and load the examples in this chapter are available in the following GitHub repository: https://github.com/PacktPublishing/Data-Modeling-with-Snowflake/tree/main/ch13. While the key section of each script will be explained in the latter half of this chapter, please refer to the repository for the complete code used to maintain and schedule the loading of the objects discussed, as it is too long to reprint here.

Dimensions overview

A dimension unifies (or conforms) similar attributes from one or various source systems into a single table under a common unique identifier known as a business key. A single surrogate key can also be used in place of multi-column business or primary keys. The unique key of a dimension table plays a critical role in identifying dimension records and allows the database team to track and maintain changes over time. A dimension table can be structured in predetermined ways to allow for different types of change tracking depending on the business requirement.

SCD types

Attributes within a dimension have differing requirements for durability and change tracking. Some attributes are updated directly, while others require historical snapshots, yet others cannot change at all. This section will cover the types of SCDs, or update profiles, that a given attribute in a dimension can have.

It’s important to note that the dimension type may not necessarily apply...

Recipes for maintaining SCDs in Snowflake

Understanding the structure of an SCD and being able to load it correctly are very different concepts. With a firm grasp of SCD types, we will now cook up the recipes for creating and maintaining them in Snowflake. Unlike generic SQL techniques you may have used in other databases, this book will take full advantage of the cost- and time-saving capabilities of Snowflake’s core features, such as streams and zero-copy cloning.

Setting the stage

To give readers complete autonomy to construct, experiment, and modify the upcoming exercises, we will first create a base table that will simulate the day one snapshot of the data warehouse raw/source schema. The base table will represent the initial first load of the source data into the data warehouse. Next, we construct a routine that simulates a daily load of new and changed records.

For consistency with the first half of this chapter, these examples will use the CUSTOMER table from...

Summary

Due to the constantly changing nature of master data in the source system, the data warehouse must serve two critical functions to allow business users to pivot between current and historical attribute values in their reporting. These functions consist of capturing source system changes in a landing area and creating SCDs that meet the organization’s reporting needs. Because master data plays such a key part in organizational analytics—often being tracked and scrutinized independently of fact records—learning to construct the required SCD structures and load them efficiently is a fundamental task for any data warehouse team.

In this chapter, we reviewed eight different SCD structures for meeting various analytical needs: from durable Type 0 attributes that never change to dynamic Type 7 configurations that can handle any requirement. Although many variations exist—even within SCD types—Types 1-3 are the most often used as they strike an...

lock icon
The rest of the chapter is locked
You have been reading a chapter from
Data Modeling with Snowflake
Published in: May 2023Publisher: PacktISBN-13: 9781837634453
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €14.99/month. Cancel anytime

Author (1)

author image
Serge Gershkovich

Serge Gershkovich is a seasoned data architect with decades of experience designing and maintaining enterprise-scale data warehouse platforms and reporting solutions. He is a leading subject matter expert, speaker, content creator, and Snowflake Data Superhero. Serge earned a bachelor of science degree in information systems from the State University of New York (SUNY) Stony Brook. Throughout his career, Serge has worked in model-driven development from SAP BW/HANA to dashboard design to cost-effective cloud analytics with Snowflake. He currently serves as product success lead at SqlDBM, an online database modeling tool.
Read more about Serge Gershkovich