
You're reading from Data Modeling with Snowflake

Product type: Book
Published in: May 2023
Publisher: Packt
ISBN-13: 9781837634453
Edition: 1st Edition
Author
Serge Gershkovich

Serge Gershkovich is a seasoned data architect with decades of experience designing and maintaining enterprise-scale data warehouse platforms and reporting solutions. He is a leading subject matter expert, speaker, content creator, and Snowflake Data Superhero. Serge earned a bachelor of science degree in information systems from the State University of New York (SUNY) Stony Brook. Throughout his career, Serge has worked in model-driven development from SAP BW/HANA to dashboard design to cost-effective cloud analytics with Snowflake. He currently serves as product success lead at SqlDBM, an online database modeling tool.
Putting Physical Modeling into Practice

The modern data warehouse is a fast-paced environment. Multi-source and near-real-time data in Snowflake streams and transforms at the speed that scalable virtual hardware will allow. With potentially limitless computing resources available at trivially low prices, there emerges a tendency to undervalue planning in favor of post hoc adjustment. When this happens, platform and maintenance costs spiral, and suspicion is cast on the platform instead of the data model (or lack thereof).

So tempting is Snowflake’s promise of near-zero maintenance and effortless scalability that many take it as an excuse to skip adequate data modeling before diving in. The Snowflake data platform does indeed live up to expectations (and beyond) when the underlying data landscape is built on a pre-planned data model.

Compared to other data platforms, Snowflake handles much of the database administration on the user’s behalf. However, as this chapter...

Technical requirements

The data definition language (DDL) for the completed physical model created through the exercises in this chapter is available to download and use from the following Git repository: https://github.com/PacktPublishing/Data-Modeling-with-Snowflake/tree/main/ch11. You will arrive at the same result by following the steps in this chapter or by using the code provided. The physical model will be used as the foundation for transformational examples in later chapters.

Considerations before starting the implementation

When transitioning from a conceptual or logical design, where entities, attributes, relationships, and additional context have already been defined, there appears to be little to do at first glance when moving to a physical model. However, the specifics of Snowflake’s unique cloud architecture (discussed in Chapters 3 and 4), from its variable-spend pricing to time-travel data retention, leave several factors to consider before embarking on physical design. We’ll cover these factors in the following sections.

Performance

Query performance in Snowflake is heavily dependent on the clustering depth of the micro-partitions, which, in turn, are influenced by the natural sort order of the data inserted. Apart from Hybrid Unistore tables, which allow users to enable indexes, there are few performance tuning options left to the user besides sorting data before inserting and clustering. If the data volume in a given table...
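As a sketch of the options mentioned above (table and column names are hypothetical, not the chapter's model), clustering can be influenced at load time by sorting, or declaratively with a clustering key:

```sql
-- Hypothetical example: improve partition pruning on a large fact table.
-- Sorting on the common filter column at load time keeps micro-partitions
-- well clustered for queries that filter on ORDER_DATE.
INSERT INTO SALES_FACT
SELECT * FROM STG_SALES
ORDER BY ORDER_DATE;

-- Alternatively, declare a clustering key and let Snowflake's automatic
-- clustering service maintain the sort order (which incurs credit costs).
ALTER TABLE SALES_FACT CLUSTER BY (ORDER_DATE);

-- Inspect the clustering health of the table for a given key.
SELECT SYSTEM$CLUSTERING_INFORMATION('SALES_FACT', '(ORDER_DATE)');
```

Declarative clustering is generally worthwhile only for very large tables with a stable, frequently filtered key; for smaller tables, sorting on insert is usually sufficient.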

Expanding from logical to physical modeling

At this stage in the modeling journey—preparing to transform a logical model into a physical one—the use of a modeling tool will make a marked difference in the effort required to generate the final DDL. While this exercise can be done using anything from a sheet of paper to Excel, using a data modeling tool to accelerate the process is encouraged. (See the Technical requirements section of Chapter 1, Unlocking the Power of Modeling for a link to a free trial of SqlDBM—the only cloud-based tool that supports Snowflake and offers a free tier.)

Picking up from the finished logical model from Chapter 8, Putting Logical Modeling into Practice, let’s begin the physical transformation.

Physicalizing the logical objects

Logical models contain all the information needed to transform them into a physical design, but they are not one-to-one equivalent regarding the number of elements. Besides the direct translations...

Deploying a physical model

At this point, all the tables, relationships, and properties have been defined and are ready to be deployed to Snowflake. If you use a modeling tool, all the DDL is generated behind the scenes as adjustments are made to the diagram, through a process called forward engineering. While a modeling tool is not strictly necessary for forward engineering, using one makes it easier to apply adjustments and generate valid, neatly formatted SQL for your data model.
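To illustrate the shape of forward-engineered output (these table and constraint names are illustrative, not the chapter's actual model), a tool would emit DDL along these lines, with primary and foreign keys declared for the relationships drawn in the diagram:

```sql
-- Illustrative forward-engineered DDL: a parent table and a child table
-- with a declared foreign key. Note that Snowflake records but does not
-- enforce PK/FK constraints on standard tables; they serve as metadata
-- for modeling tools, IDEs, and query optimization hints.
CREATE TABLE CUSTOMER (
    CUSTOMER_ID   NUMBER(38,0) NOT NULL,
    CUSTOMER_NAME VARCHAR(100),
    CONSTRAINT PK_CUSTOMER PRIMARY KEY (CUSTOMER_ID)
);

CREATE TABLE SALES_ORDER (
    ORDER_ID    NUMBER(38,0) NOT NULL,
    CUSTOMER_ID NUMBER(38,0) NOT NULL,
    ORDER_DATE  DATE,
    CONSTRAINT PK_SALES_ORDER PRIMARY KEY (ORDER_ID),
    CONSTRAINT FK_ORDER_CUSTOMER FOREIGN KEY (CUSTOMER_ID)
        REFERENCES CUSTOMER (CUSTOMER_ID)
);
```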

For those following the exercise, the forward-engineered DDL from this exercise is available in a shared Git repository mentioned at the start of this chapter.

With the DDL in hand, pay attention to the database and schema context in the Snowflake UI. Creating a database or schema will automatically set the context for a given session. To switch to an existing database or schema, use the context menu in the UI or the USE <object> <object name> SQL expression. Here’...
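A minimal sketch of setting session context before running a DDL script (database and schema names are placeholders):

```sql
-- Creating a database or schema automatically sets the session context.
CREATE DATABASE IF NOT EXISTS MODELING_DB;
CREATE SCHEMA IF NOT EXISTS MODELING_DB.CH11;

-- To switch to existing objects instead, use USE statements:
USE DATABASE MODELING_DB;
USE SCHEMA MODELING_DB.CH11;
```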

Creating an ERD from a physical model

As we just demonstrated through the forward engineering deployment process, a physical database model is a one-to-one representation of its relational diagram. This implies that the process of generating a diagram can also be run in reverse, from Snowflake DDL to a modeling tool, through a process known as reverse engineering. Again, a dedicated modeling tool is not strictly necessary (many SQL IDEs, such as Visual Studio Code and DBeaver, can generate Entity-Relationship Diagrams (ERDs)), but one will offer greater flexibility in organizing, navigating, and making adjustments to your model.
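The starting point for reverse engineering is the DDL of the deployed objects, which Snowflake can produce directly with the built-in GET_DDL function (object names here are placeholders):

```sql
-- Extract the DDL of a single deployed table for reverse engineering.
SELECT GET_DDL('TABLE', 'MODELING_DB.CH11.CUSTOMER');

-- Or extract the entire schema, including all its tables and views,
-- in one call.
SELECT GET_DDL('SCHEMA', 'MODELING_DB.CH11');
```

The returned script can then be imported into a modeling tool or IDE to regenerate the diagram.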

A similar diagram to the one created in the previous exercise can be generated by connecting to our deployed model through a SQL IDE:

Figure 11.4 – Reverse engineering in DBeaver IDE

What is evident in this exercise is often overlooked in database designs—the fact that a neat, related,...

Summary

The exercises in this chapter demonstrate what is required to transform a logical model into a deployable, physical design. However, before such transformation occurs, each project’s use case should be carefully considered. As there is no one-size-fits-all guideline for Snowflake databases, decisions must be made considering performance, cost, data integrity, security, and usability. However, unlike traditional databases, long-standing issues such as backup, recovery, and scalability are handled by Snowflake features and architecture.

Once the physical properties have been decided, users create physical equivalents of all logical objects, including many-to-many and subtype/supertype relationships, yielding a final set of physical tables. Following this, naming standards, database objects, columns, and their relationships are declared before deploying the resulting model.

Deployable Snowflake DDL code is produced from an ERD through a process called forward engineering...
