Search icon
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Data Modeling with Snowflake

You're reading from  Data Modeling with Snowflake

Product type Book
Published in May 2023
Publisher Packt
ISBN-13 9781837634453
Pages 324 pages
Edition 1st Edition
Languages
Author (1):
Serge Gershkovich Serge Gershkovich
Profile icon Serge Gershkovich

Table of Contents (24) Chapters

Preface 1. Part 1: Core Concepts in Data Modeling and Snowflake Architecture
2. Chapter 1: Unlocking the Power of Modeling 3. Chapter 2: An Introduction to the Four Modeling Types 4. Chapter 3: Mastering Snowflake’s Architecture 5. Chapter 4: Mastering Snowflake Objects 6. Chapter 5: Speaking Modeling through Snowflake Objects 7. Chapter 6: Seeing Snowflake’s Architecture through Modeling Notation 8. Part 2: Applied Modeling from Idea to Deployment
9. Chapter 7: Putting Conceptual Modeling into Practice 10. Chapter 8: Putting Logical Modeling into Practice 11. Chapter 9: Database Normalization 12. Chapter 10: Database Naming and Structure 13. Chapter 11: Putting Physical Modeling into Practice 14. Part 3: Solving Real-World Problems with Transformational Modeling
15. Chapter 12: Putting Transformational Modeling into Practice 16. Chapter 13: Modeling Slowly Changing Dimensions 17. Chapter 14: Modeling Facts for Rapid Analysis 18. Chapter 15: Modeling Semi-Structured Data 19. Chapter 16: Modeling Hierarchies 20. Chapter 17: Scaling Data Models through Modern Techniques 21. Index 22. Other Books You May Enjoy Appendix

Modeling Facts for Rapid Analysis

Fact tables are used to store the quantitative measurements of business operations or events such as sales, employee headcounts, or website traffic. Because they contain the official record of business transactions, fact tables are a prime target for operational analytics. Fact tables aggregate metrics such as sales totals and active users, as well as historical trends (deltas), such as the margin impact of daily returns or same-day bookings before cancelations.

Because business needs vary by industry and sector, various fact table models exist to fit these different demands. Facts such as product sales and returns are erratic, while others, such as manufacturing or fulfillment, follow a predictable pattern. The fact tables supporting such processes must anticipate not only the nature of the data they aim to capture but also the organization’s analytical needs to allow for efficient reporting.

This chapter will cover the various fact table...

Technical requirements

The scripts used to instantiate and load the examples in this chapter are available in the following GitHub repo: https://github.com/PacktPublishing/Data-Modeling-with-Snowflake/tree/main/ch14. While the key section of each script will be explained in the latter half of this chapter, please refer to the repo for the complete code used to maintain and schedule the loading of the objects discussed, as it is too long to reprint here.

Fact table types

By capturing the daily operational transactions of an organization, fact tables tend to contain large amounts of records that are constantly growing. By analyzing the data in fact tables, analysts and business users glean insights into business performance and identify trends and patterns. Considering these demands, fact tables must be designed in such a way that balances data loading efficiency with analytical needs and query patterns.

After nearly 20 years and three editions, the definitive guide to designing fact tables remains The Data Warehouse Toolkit. In it, authors Ralph Kimball and Margy Ross expertly cover the fundamentals of dimensional modeling, fact table design, and related industry case studies. This chapter will not attempt to replicate the content to which Kimball and Ross have dedicated nearly a quarter of their book. Instead, we will focus on what The Data Warehouse Toolkit does not cover: database-specific transformations for managing and maintaining...

Fact table measures

Numerical measures associated with a business transaction are called facts, and they fall into three basic categories:

  • Additive facts – These are measures that can be summed across any dimension. Additive facts are the most common type of fact in a DWH, allowing for a wide variety of analytical calculations and insights. These values can be aggregated across any combination of dimensions, such as time, geography, or product. Examples of additive facts include sales revenue, profit, and quantity sold.
  • Semi-additive facts – These are measures that can be summed across some dimensions but not all. These measures are usually numeric values that can only be aggregated across certain dimensions, such as customers or products. Examples of semi-additive facts include account balance and inventory levels, respectively. Semi-additive facts require special handling in data analysis to ensure that the aggregation is done correctly and does not spill...

Getting the facts straight

The facts in a source system are recorded in real time and updated in case of adjustments. By definition, they are always current. A DWH has a much harder task because it needs to capture and report current facts and track historical changes. Suppose an order was adjusted from containing two items to one. The DWH must find a way to report that a change was made while avoiding the issue of double-counting (as the total quantity is now one, not three).

The task of historical tracking is made even more complicated when the facts are not point-in-time transactions but intervals such as advertising campaigns or employee hires and leavers. In such cases, tabulating the cost of a department can no longer be accomplished by simple aggregation because employees can come and go at various intervals.

Operating a business is messy and complex, and the data that it generates is no exception. Employees come and go, orders are returned, and in some cases, records...

Maintaining fact tables using Snowflake features

In this section, we will practice creating and maintaining the fact table techniques discussed previously using available Snowflake features. Like in previous chapters, we will load data from the snowflake_sample_data.tpch_sf10 schema, which will serve as the sample set. We will then simulate source system updates by randomly loading records from this sample.

The first two exercises will use data from the LINEITEM table. To continue, create a schema to house the exercises from this chapter and instantiate the source system table, as well as the DWH landing area.

Open the first file in this chapter’s repository, ch_14.1_reverse_balance_fact.sql, and run the first three steps:

CREATE OR REPLACE SCHEMA ch14_facts;
CREATE OR REPLACE TABLE source_system_lineitem ...;
CREATE OR REPLACE TABLE src_lineitem ...;

These examples require lineitem orders to be loaded and processed in their entirety (containing all constituent...

Summary

In a DWH, fact tables present an additional challenge on top of merely capturing the latest values – they must also be able to capture and reconcile historical changes in a way that allows users to flexibly and cost-effectively query them to resolve business questions, because when it comes to operational analytics, analyzing changes, variations, and what didn’t happen can be just as valuable as the current state of truth.

Various types of fact tables exist to help an organization meet these demanding analytical needs, such as transactional, snapshot, and accumulating snapshot fact tables, among others. These fact tables must differentiate between the various kinds of measures they store (e.g., additive, semi-additive, and non-additive) because each is treated differently when updating or recording changes.

To help data teams construct and maintain these tables in Snowflake, this chapter dissected some of the toughest challenges in maintaining fact tables...

lock icon The rest of the chapter is locked
You have been reading a chapter from
Data Modeling with Snowflake
Published in: May 2023 Publisher: Packt ISBN-13: 9781837634453
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at ₹800/month. Cancel anytime}