You're reading from Snowflake Cookbook

Product type: Book
Published in: Feb 2021
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800560611
Edition: 1st Edition
Authors (2):

Hamid Mahmood Qureshi

Hamid Qureshi is a senior cloud and data warehouse professional with almost two decades of total experience, having architected, designed, and led the implementation of several data warehouse and business intelligence solutions. He has extensive experience and certifications across various data analytics platforms, ranging from Teradata, Oracle, and Hadoop to modern, cloud-based tools such as Snowflake. Having worked extensively with traditional technologies, combined with his knowledge of modern platforms, he has accumulated substantial practical expertise in data warehousing and analytics in Snowflake, which he has subsequently captured in his publications.

Hammad Sharif

Hammad Sharif is an experienced data architect with more than a decade of experience in the information domain, covering governance, warehousing, data lakes, streaming data, and machine learning. He has worked with a leading data warehouse vendor for a decade as part of a professional services organization, advising customers in telco, retail, life sciences, and financial industries located in Asia, Europe, and Australia during presales and post-sales implementation cycles. Hammad holds an MSc in computer science and has published conference papers in the domains of machine learning, sensor networks, software engineering, and remote sensing.

Chapter 10: Extending Snowflake Capabilities

This chapter provides techniques for extending Snowflake capabilities and integrating a Snowflake-based data warehouse with other technologies. The chapter guides you through the creation of User-Defined Functions (UDFs), which can introduce functionality that may currently be missing in Snowflake. It covers the two languages in which UDFs can be written and the two modes in which UDFs can exist, and offers a guide on connecting Snowflake with Apache Spark for data exchange between the two systems. The chapter then explores scenarios that externalize data processing to a Spark engine, which can support existing Spark pipelines in the customer's ecosystem or serve as a cost optimization technique for ETL.

The following recipes will be covered in this chapter:

  • Creating a Scalar user-defined function using SQL
  • Creating a Table user-defined function using SQL
  • Creating a Scalar...

Technical requirements

This chapter requires access to a modern internet browser (Chrome, Microsoft Edge, Firefox, and so on) and an internet connection to reach your cloud-hosted Snowflake instance.

The code for this chapter can be found at the following GitHub URL:

https://github.com/PacktPublishing/Snowflake-Cookbook/tree/master/Chapter10

Creating a Scalar user-defined function using SQL

Snowflake provides capabilities for creating UDFs that can extend the system and perform operations not available out of the box in Snowflake. Snowflake allows two languages for the creation of UDFs: SQL and JavaScript. Either UDF type can return scalar or table results.

In this recipe, we will be walking you through the creation of SQL-based Scalar UDFs and demonstrating how to call the UDFs for various scenarios. A scalar UDF can return a single row of results consisting of a single column, which is essentially equivalent to returning a single value.

Getting ready

You will need to be connected to your Snowflake instance via the web UI or the SnowSQL client to execute this recipe.

How to do it...

We will be creating a series of UDFs to demonstrate the various SQL UDF capabilities provided by Snowflake. We will start by creating a relatively simple UDF and will slowly increase the complexity...
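The full steps are not shown in this preview, but the general pattern can be sketched as follows. This is a minimal, hypothetical scalar SQL UDF; the function name, argument, and logic are illustrative and not taken from the recipe:

```sql
-- Hypothetical example: a scalar SQL UDF computing the area of a circle.
-- The UDF body is a single SQL expression; the result is one value.
CREATE OR REPLACE FUNCTION area_of_circle(radius FLOAT)
RETURNS FLOAT
AS
$$
    pi() * radius * radius
$$;

-- A scalar UDF is called like any built-in function:
SELECT area_of_circle(2.0);
```

Because no LANGUAGE attribute is specified, Snowflake treats the body as SQL by default.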

Creating a Table user-defined function using SQL

Snowflake provides capabilities for creating UDFs that can be used to extend the system and perform operations that are not available out of the box in Snowflake. Snowflake allows two languages for the creation of UDFs: SQL and JavaScript. Either UDF type can return scalar or table results.

In this recipe, we will be walking you through the creation of SQL-based Table UDFs and demonstrating how to call the UDFs in various scenarios. A Table UDF can return multiple rows of data.

Getting ready

You will need to be connected to your Snowflake instance via the Web UI or the SnowSQL client to execute this recipe.

How to do it

We will be creating a series of UDFs to demonstrate the SQL Table UDF capabilities provided by Snowflake. We will start by exploring some of the existing Table UDFs provided by Snowflake and reviewing their output:

  1. Let's start by creating a database in which...
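Although the remaining steps are not shown in this preview, the shape of a SQL table UDF can be sketched as follows. The table and column names here are hypothetical, used purely for illustration:

```sql
-- Hypothetical example: a SQL table UDF that returns multiple rows.
-- RETURNS TABLE declares the output columns; the body is a query.
CREATE OR REPLACE FUNCTION orders_for_customer(cust_id VARCHAR)
RETURNS TABLE (order_id VARCHAR, order_amount NUMBER)
AS
$$
    SELECT order_id, order_amount
    FROM orders
    WHERE customer_id = cust_id
$$;

-- Table UDFs are invoked through the TABLE() keyword in the FROM clause:
SELECT * FROM TABLE(orders_for_customer('C-101'));
```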

Creating a Scalar user-defined function using JavaScript

In this recipe, we will be walking you through the creation of JavaScript-based Scalar UDFs and demonstrating the usage of JavaScript-based UDFs.

Getting ready

You will need to be connected to your Snowflake instance via the Web UI or the SnowSQL client to execute this recipe.

How to do it

We will be creating a series of UDFs to demonstrate the various JavaScript UDF capabilities provided by Snowflake. We will start by creating a simple UDF and will gradually increase the complexity of the UDFs that we create. The basic syntax for creating a JavaScript-based UDF is similar to that for a SQL UDF, with one difference: if you do not specify the LANGUAGE attribute, the UDF defaults to SQL. The syntax for a JavaScript-based UDF is as follows:

CREATE FUNCTION <name> ( [ <arg_name> <arg_data_type> ] )
RETURNS <result_data_type>
[ LANGUAGE...
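To make the pattern concrete, here is a minimal, hypothetical JavaScript scalar UDF; the name and logic are illustrative only:

```sql
-- Hypothetical example of a JavaScript scalar UDF; note the explicit
-- LANGUAGE JAVASCRIPT clause, without which the UDF would default to SQL.
CREATE OR REPLACE FUNCTION js_double(num FLOAT)
RETURNS FLOAT
LANGUAGE JAVASCRIPT
AS
$$
    // Inside the JavaScript body, argument names are referenced in uppercase
    return NUM * 2;
$$;

SELECT js_double(21.0);
```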

Creating a Table user-defined function using JavaScript

Snowflake provides capabilities for creating UDFs that can be used to extend the system and perform operations not available out of the box in Snowflake. Snowflake allows two languages for the creation of UDFs: SQL and JavaScript. Either UDF type can return scalar or table results.

In this recipe, we will create JavaScript-based Table UDFs and demonstrate how to call the UDFs for various scenarios.

Getting ready

You will need to be connected to your Snowflake instance via the Web UI or the SnowSQL client to execute this recipe.

How to do it

We will be creating a series of UDFs to demonstrate the JavaScript Table UDF capabilities provided by Snowflake. We will start by exploring some of the existing Table UDFs provided by Snowflake and reviewing their output. The basic syntax for creating a JavaScript table UDF is as follows:

CREATE FUNCTION <name> ( [ <arg_name> <arg_data_type...
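As an illustration of the complete pattern, here is a minimal, hypothetical JavaScript table UDF. A JavaScript table UDF returns an object whose processRow function emits output rows; the names used here are illustrative:

```sql
-- Hypothetical example of a JavaScript table UDF. RETURNS TABLE declares
-- the output columns; processRow is called once per input row and can
-- emit any number of output rows via rowWriter.writeRow.
CREATE OR REPLACE FUNCTION split_to_rows(input VARCHAR)
RETURNS TABLE (word VARCHAR)
LANGUAGE JAVASCRIPT
AS
$$
{
    processRow: function (row, rowWriter, context) {
        // Emit one output row per whitespace-separated token
        row.INPUT.split(" ").forEach(function (w) {
            rowWriter.writeRow({WORD: w});
        });
    }
}
$$;

SELECT * FROM TABLE(split_to_rows('extending snowflake capabilities'));
```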

Connecting Snowflake with Apache Spark

Spark is a general-purpose data processing engine that can connect with different technologies. This recipe walks you through downloading the drivers and performing the configuration required to connect Spark with Snowflake. For simplicity, we will be performing the configuration on a standalone version of Spark running on an on-premises machine, but these instructions are portable to Spark clusters running on cloud platforms as well.

Getting ready

You will need to be connected to your Snowflake instance via the Web UI or the SnowSQL client to execute this recipe.

Note

This recipe assumes that Spark is installed and configured on your machine or that you have access to a cloud-based Spark installation.

How to do it

We will locate the JDBC driver and the Snowflake connector for Spark and use them to configure Spark connectivity with Snowflake:

  1. Our first step will be to identify the correct version of the Snowflake Connector...
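While the remaining steps are not shown in this preview, the configuration typically comes together by supplying the Snowflake JDBC driver and the Snowflake Connector for Spark to spark-shell. The Maven coordinates below are a sketch; the exact versions are assumptions and must be matched to your Spark and Scala versions:

```shell
# Hypothetical example: launch spark-shell with the Snowflake JDBC driver
# and the Snowflake Connector for Spark pulled from Maven Central.
# Version numbers shown are illustrative; pick ones matching your setup.
spark-shell --packages net.snowflake:snowflake-jdbc:3.13.14,net.snowflake:spark-snowflake_2.12:2.10.0-spark_3.1
```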

Using Apache Spark to prepare data for storage on Snowflake

This recipe provides you with an example of how Apache Spark and Snowflake partner to utilize the two systems' strengths. The recipe shows a scenario involving reading data from Snowflake into a Spark DataFrame and writing data back to Snowflake from a Spark DataFrame.

Getting ready

You will need to be connected to your Snowflake instance via the Web UI or the SnowSQL client to execute this recipe.

It is assumed that you have already configured the Snowflake Connector for Spark and can connect to the Snowflake instance successfully through Spark.

How to do it

We will be reading data from Snowflake sample tables and transforming the data before writing it back to Snowflake in a new table. The code in the following steps should be added to a single Scala file called snowflake_transform.scala, since we will be calling that file from within spark-shell:

  1. Let's start by creating a new database...
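Since the remaining steps are not shown in this preview, the overall read-transform-write pattern can be sketched as follows. The connection options, sample table, and target table names are assumptions for illustration; substitute your own account details:

```scala
// Hypothetical sketch of snowflake_transform.scala, run inside spark-shell
// (where the `spark` session is predefined). Placeholders in <angle
// brackets> must be replaced with your own values.
import org.apache.spark.sql.SaveMode

val sfOptions = Map(
  "sfURL"       -> "<account>.snowflakecomputing.com",
  "sfUser"      -> "<user>",
  "sfPassword"  -> "<password>",
  "sfDatabase"  -> "SNOWFLAKE_SAMPLE_DATA",
  "sfSchema"    -> "TPCH_SF1",
  "sfWarehouse" -> "<warehouse>"
)

// Read a sample table from Snowflake into a Spark DataFrame
val df = spark.read
  .format("snowflake")
  .options(sfOptions)
  .option("dbtable", "CUSTOMER")
  .load()

// Apply a transformation in Spark, e.g. aggregate rows by market segment
val summary = df.groupBy("C_MKTSEGMENT").count()

// Write the result back to Snowflake as a new table in a writable database
summary.write
  .format("snowflake")
  .options(sfOptions ++ Map("sfDatabase" -> "<target_db>", "sfSchema" -> "PUBLIC"))
  .option("dbtable", "CUSTOMER_SEGMENT_SUMMARY")
  .mode(SaveMode.Overwrite)
  .save()
```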

