![Snowflake - Build and Architect Data Pipelines Using AWS [Video]](https://content.packt.com/V19222/cover_image_small.jpg)
Snowflake - Build and Architect Data Pipelines Using AWS [Video]
- Subscription: Free
- Video + Subscription: $15.99
- Video: $69.99
- Introduction to the Course (Free Chapter)
- Introduction to Snowflake and AWS
- Snowflake – Tables
- Snowflake – Partitioning, Clustering, and Performance Optimization
  - Section Overview
  - Introduction to Partitions and Clustering Keys
  - Lab - Micro-Partitions and Clustering Keys
  - Benefits of Micro-Partitions and Clustering
  - Understanding Clustering Depth and Cluster Overlap
  - Lab - Selecting Your Clustering Keys
  - Lab - Check Query Profile and History
  - Lab - Query Processing and Caching
  - Search Optimization Feature
- Snowflake – Data Loading/Ingestion and Extraction
  - Section Overview
  - Data Ingestion – Real-World Use Cases
  - Lab - Create an Integration Object to Connect Snowflake with AWS S3
  - Lab - Ingest CSV from S3 to Snowflake
  - Lab - Ingest JSON from S3 to Snowflake
  - Introduction to Continuous Data Ingestion in Snowflake
  - Lab - Create and Implement Snowpipe
  - Snowpipe - Billing Estimation and Key Considerations for Data Ingestion
  - Lab - Extract/Unload Data from Snowflake to S3
- Snowflake – Tasks and Query Scheduling
- Snowflake – Streams and Change Data Capture
  - Section Overview
  - Introduction to Streams
  - Lab - Implement Standard Streams
  - Lab - Implement Append-Only Streams
  - Lab - Streams in a Transaction
  - Streams - Data Retention and Staleness
  - Lab - Change Tracking Using "Changes"
  - Project Overview
  - Lab - Create Streams - Project Solution
  - Lab - Create Streams - Continuation
  - Lab - End-to-End Pipeline in Action
- Snowflake – User-Defined Functions
- Snowflake – External Functions
- Snowflake with Python, Spark, and Airflow on AWS
  - Section Overview
  - Lab - Connect Python with Snowflake in Your Local Machine
  - Introduction to AWS Glue
  - Lab - Deploy and Execute Python Script to AWS Glue
  - Lab - Parameterize Your Python Script on AWS Glue
  - Lab - Python Pandas with Snowflake on AWS Glue
  - What Is Pushdown in Spark 3.1?
  - Lab - Deploy a PySpark Script Using AWS Glue
  - Lab - Set Up Managed Airflow Cluster on AWS
  - Lab - Configure Snowflake Connectivity in Airflow
  - Lab - Deploy a PySpark Transformation Job in AWS Glue
  - Lab - Set Up Airflow DAG
- Real-Time Streaming with Kafka and Snowflake
- Snowflake – Data Protection and Governance
- Snowpark – For Data Pipelines and Data Science
  - Introduction – What Is Snowpark?
  - Lab - Getting Started with Snowpark
  - Overview - UDFs and Stored Procedures
  - Lab - Deploy Python UDFs
  - Lab - Deploy Stored Procedures for ETL Batch Processing
  - Data Science – Use Case Overview and Data Preparation
  - Lab - Deploy Model - Training Code for Scikit-Learn Using Stored Procedures
  - Lab - Deploy Model Serving/Prediction Serving Pipeline Using UDFs
  - More Learning Reference and Coupon Code
- Wrap Up and More Learning
About this video
Snowflake is growing into a full data ecosystem. Given its scalability, its efficiency with massive data volumes, and the new concepts it introduces, now is a good time to learn Snowflake and add it to your toolkit. This course covers Snowflake's core features and also teaches you how to deploy Python and PySpark jobs in AWS Glue and Airflow that communicate with Snowflake, one of the most important aspects of building pipelines.
In this course, you will start with an introduction to Snowflake and then work through its most crucial features. You will write Python and Spark jobs in AWS Glue for data transformation and see real-time streaming with Kafka and Snowflake. You will work with external functions and their use cases and explore Snowflake's security features. Finally, you will look at Snowpark and how it can be used for data pipelines and data science.
By the end of this course, you will know Snowflake and Snowpark and be able to build and architect data pipelines using AWS.
You need an active AWS account to complete the sections on Python and PySpark. For the rest of the course, a free-trial Snowflake account should suffice.
- Publication date: September 2022
- Publisher: Packt
- Duration: 8 hours 39 minutes
- ISBN: 9781804615676