Reader small image

You're reading from  Snowflake Cookbook

Product typeBook
Published inFeb 2021
Reading LevelBeginner
PublisherPackt
ISBN-139781800560611
Edition1st Edition
Languages
Concepts
Right arrow
Authors (2):
Hamid Mahmood Qureshi
Hamid Mahmood Qureshi
author image
Hamid Mahmood Qureshi

Hamid Qureshi is a senior cloud and data warehouse professional with almost two decades of total experience, having architected, designed, and led the implementation of several data warehouse and business intelligence solutions. He has extensive experience and certifications across various data analytics platforms, ranging from Teradata, Oracle, and Hadoop to modern, cloud-based tools such as Snowflake. Having worked extensively with traditional technologies, combined with his knowledge of modern platforms, he has accumulated substantial practical expertise in data warehousing and analytics in Snowflake, which he has subsequently captured in his publications.
Read more about Hamid Mahmood Qureshi

Hammad Sharif
Hammad Sharif
author image
Hammad Sharif

Hammad Sharif is an experienced data architect with more than a decade of experience in the information domain, covering governance, warehousing, data lakes, streaming data, and machine learning. He has worked with a leading data warehouse vendor for a decade as part of a professional services organization, advising customers in telco, retail, life sciences, and financial industries located in Asia, Europe, and Australia during presales and post-sales implementation cycles. Hammad holds an MSc. in computer science and has published conference papers in the domains of machine learning, sensor networks, software engineering, and remote sensing.
Read more about Hammad Sharif

View More author details
Right arrow

Unique counts and Snowflake

Estimating unique counts for rows in distributed systems is a compute-intensive process. Snowflake uses a distributed algorithm that is state of the art. This technique is different from those used in other databases. It is faster but, at the same time, approximate. The recipe shall show you how to use the uniqueness functions that are available in Snowflake.

Getting ready

Note that this recipe's steps can be run either in the Snowflake WebUI or the SnowSQL command-line client. An extra small warehouse will be used in the recipe.

How to do it…

As part of this recipe, we shall explore the different count functions/capabilities of Snowflake. We shall start with the typical count and distinct count functions and how different combinations can yield different results. Then, we shall explore the HyperLogLog algorithm implementation in Snowflake, which can efficiently approximate count over groups. This is recommended for use cases where...

lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
Snowflake Cookbook
Published in: Feb 2021Publisher: PacktISBN-13: 9781800560611

Authors (2)

author image
Hamid Mahmood Qureshi

Hamid Qureshi is a senior cloud and data warehouse professional with almost two decades of total experience, having architected, designed, and led the implementation of several data warehouse and business intelligence solutions. He has extensive experience and certifications across various data analytics platforms, ranging from Teradata, Oracle, and Hadoop to modern, cloud-based tools such as Snowflake. Having worked extensively with traditional technologies, combined with his knowledge of modern platforms, he has accumulated substantial practical expertise in data warehousing and analytics in Snowflake, which he has subsequently captured in his publications.
Read more about Hamid Mahmood Qureshi

author image
Hammad Sharif

Hammad Sharif is an experienced data architect with more than a decade of experience in the information domain, covering governance, warehousing, data lakes, streaming data, and machine learning. He has worked with a leading data warehouse vendor for a decade as part of a professional services organization, advising customers in telco, retail, life sciences, and financial industries located in Asia, Europe, and Australia during presales and post-sales implementation cycles. Hammad holds an MSc. in computer science and has published conference papers in the domains of machine learning, sensor networks, software engineering, and remote sensing.
Read more about Hammad Sharif