
You're reading from Data Modeling with Snowflake

Product type: Book
Published in: May 2023
Publisher: Packt
ISBN-13: 9781837634453
Edition: 1st Edition
Author
Serge Gershkovich

Serge Gershkovich is a seasoned data architect with decades of experience designing and maintaining enterprise-scale data warehouse platforms and reporting solutions. He is a leading subject matter expert, speaker, content creator, and Snowflake Data Superhero. Serge earned a bachelor of science degree in information systems from the State University of New York (SUNY) Stony Brook. Throughout his career, Serge has worked in model-driven development from SAP BW/HANA to dashboard design to cost-effective cloud analytics with Snowflake. He currently serves as product success lead at SqlDBM, an online database modeling tool.
Putting Physical Modeling into Practice

The modern data warehouse is a fast-paced environment. Multi-source and near-real-time data in Snowflake streams and transforms at the speed that scalable virtual hardware will allow. With potentially limitless computing resources available at trivially low prices, there emerges a tendency to undervalue planning in favor of post hoc adjustment. When this happens, platform and maintenance costs spiral, and suspicion is cast on the platform instead of the data model (or lack thereof).

So tempting is Snowflake’s promise of near-zero maintenance and effortless scalability that many take it as an excuse to skip adequate data modeling before diving in. The Snowflake data platform does indeed live up to expectations (and beyond) when the underlying data landscape is built on a pre-planned data model.

Compared to other data platforms, Snowflake handles much of the database administration on the user’s behalf. However, as this chapter...

Technical requirements

The data definition language (DDL) for the completed physical model created through the exercises in this chapter is available to download and use from the following Git repository: https://github.com/PacktPublishing/Data-Modeling-with-Snowflake/tree/main/ch11. You will arrive at the same result by following the steps in this chapter or by using the code provided. The physical model will be used as the foundation for transformational examples in later chapters.

Considerations before starting the implementation

When transitioning from a conceptual or logical design, where entities, attributes, relationships, and additional context have already been defined, there appears to be little to do at first glance when moving to a physical model. However, the specifics of Snowflake’s unique cloud architecture (discussed in Chapters 3 and 4), from its variable-spend pricing to time-travel data retention, leave several factors to consider before embarking on physical design. We’ll cover these factors in the following sections.

Performance

Query performance in Snowflake is heavily dependent on the clustering depth of the micro-partitions, which, in turn, are influenced by the natural sort order of the data inserted. Apart from Hybrid Unistore tables, which allow users to enable indexes, there are few performance tuning options left to the user besides sorting data before inserting and clustering. If the data volume in a given table...
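As a sketch of the options mentioned above (table and column names are hypothetical, not the chapter's model), clustering can be influenced at load time by sorting, or declaratively with a clustering key:

```sql
-- Hypothetical example: improve partition pruning on a large fact table.
-- Sorting on the common filter column at load time keeps micro-partitions
-- well clustered for queries that filter on ORDER_DATE.
INSERT INTO SALES_FACT
SELECT * FROM STG_SALES
ORDER BY ORDER_DATE;

-- Alternatively, declare a clustering key and let Snowflake's automatic
-- clustering service maintain the sort order (which incurs credit costs).
ALTER TABLE SALES_FACT CLUSTER BY (ORDER_DATE);

-- Inspect the clustering health of the table for a given key.
SELECT SYSTEM$CLUSTERING_INFORMATION('SALES_FACT', '(ORDER_DATE)');
```

Declarative clustering is generally worthwhile only for very large tables with a stable, frequently filtered key; for smaller tables, sorting on insert is usually sufficient.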

Expanding from logical to physical modeling

At this stage in the modeling journey—preparing to transform a logical model into a physical one—the use of a modeling tool will make a marked difference in the effort required to generate the final DDL. While this exercise can be done using anything from a sheet of paper to Excel, using a data modeling tool to accelerate the process is encouraged. (See the Technical requirements section of Chapter 1, Unlocking the Power of Modeling for a link to a free trial of SqlDBM—the only cloud-based tool that supports Snowflake and offers a free tier.)

Picking up from the finished logical model from Chapter 8, Putting Logical Modeling into Practice, let’s begin the physical transformation.

Physicalizing the logical objects

Logical models contain all the information needed to transform them into a physical design, but they are not one-to-one equivalent regarding the number of elements. Besides the direct translations...

Deploying a physical model

At this point, all the tables, relationships, and properties have been defined and are ready to be deployed to Snowflake. If you use a modeling tool, all the DDL is generated behind the scenes as adjustments are made to the diagram, through a process called forward engineering. While a modeling tool is not strictly necessary for forward engineering, using one makes it easier to apply adjustments and generate valid, neatly formatted SQL for your data model.
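To illustrate the shape of forward-engineered output (these table and constraint names are illustrative, not the chapter's actual model), a tool would emit DDL along these lines, with primary and foreign keys declared for the relationships drawn in the diagram:

```sql
-- Illustrative forward-engineered DDL: a parent table and a child table
-- with a declared foreign key. Note that Snowflake records but does not
-- enforce PK/FK constraints on standard tables; they serve as metadata
-- for modeling tools, IDEs, and query optimization hints.
CREATE TABLE CUSTOMER (
    CUSTOMER_ID   NUMBER(38,0) NOT NULL,
    CUSTOMER_NAME VARCHAR(100),
    CONSTRAINT PK_CUSTOMER PRIMARY KEY (CUSTOMER_ID)
);

CREATE TABLE SALES_ORDER (
    ORDER_ID    NUMBER(38,0) NOT NULL,
    CUSTOMER_ID NUMBER(38,0) NOT NULL,
    ORDER_DATE  DATE,
    CONSTRAINT PK_SALES_ORDER PRIMARY KEY (ORDER_ID),
    CONSTRAINT FK_ORDER_CUSTOMER FOREIGN KEY (CUSTOMER_ID)
        REFERENCES CUSTOMER (CUSTOMER_ID)
);
```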

For those following the exercise, the forward-engineered DDL from this exercise is available in a shared Git repository mentioned at the start of this chapter.

With the DDL in hand, pay attention to the database and schema context in the Snowflake UI. Creating a database or schema will automatically set the context for a given session. To switch to an existing database or schema, use the context menu in the UI or the USE <object> <object name> SQL expression. Here’...
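A minimal sketch of setting session context before running a DDL script (database and schema names are placeholders):

```sql
-- Creating a database or schema automatically sets the session context.
CREATE DATABASE IF NOT EXISTS MODELING_DB;
CREATE SCHEMA IF NOT EXISTS MODELING_DB.CH11;

-- To switch to existing objects instead, use USE statements:
USE DATABASE MODELING_DB;
USE SCHEMA MODELING_DB.CH11;
```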

Creating an ERD from a physical model

As we just demonstrated through the forward engineering deployment process, a physical database model is a one-to-one representation of its relational diagram. This implies that the process of generating a diagram can also be run in reverse, from Snowflake DDL to a modeling tool, through a process known as reverse engineering. Again, a dedicated modeling tool is not strictly necessary (many SQL IDEs, such as Visual Studio Code and DBeaver, can generate Entity-Relationship Diagrams (ERDs)), but one will offer greater flexibility in organizing, navigating, and making adjustments to your model.
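The starting point for reverse engineering is the DDL of the deployed objects, which Snowflake can produce directly with the built-in GET_DDL function (object names here are placeholders):

```sql
-- Extract the DDL of a single deployed table for reverse engineering.
SELECT GET_DDL('TABLE', 'MODELING_DB.CH11.CUSTOMER');

-- Or extract the entire schema, including all its tables and views,
-- in one call.
SELECT GET_DDL('SCHEMA', 'MODELING_DB.CH11');
```

The returned script can then be imported into a modeling tool or IDE to regenerate the diagram.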

A similar diagram to the one created in the previous exercise can be generated by connecting to our deployed model through a SQL IDE:

Figure 11.4 – Reverse engineering in DBeaver IDE

What is evident in this exercise is often overlooked in database designs—the fact that a neat, related,...

Summary

The exercises in this chapter demonstrate what is required to transform a logical model into a deployable, physical design. However, before such transformation occurs, each project’s use case should be carefully considered. As there is no one-size-fits-all guideline for Snowflake databases, decisions must be made considering performance, cost, data integrity, security, and usability. However, unlike traditional databases, long-standing issues such as backup, recovery, and scalability are handled by Snowflake features and architecture.

Once the physical properties have been decided, users create physical equivalents of all logical objects, including many-to-many and subtype/supertype relationships, yielding a final set of physical tables. Following this, naming standards, database objects, columns, and their relationships are declared before deploying the resulting model.

Deployable Snowflake DDL code is produced from an ERD through a process called forward engineering...
