Reader small image

You're reading from  Limitless Analytics with Azure Synapse

Product typeBook
Published inJun 2021
Reading LevelBeginner
PublisherPackt
ISBN-139781800205659
Edition1st Edition
Languages
Concepts
Right arrow
Author (1)
Prashant Kumar Mishra
Prashant Kumar Mishra
author image
Prashant Kumar Mishra

Prashant Kumar Mishra is an engineering architect at Microsoft. He has more than 10 years of professional expertise in the Microsoft data and AI segment as a developer, consultant, and architect. He has been focused on Microsoft Azure Cloud technologies for several years now and has helped various customers in their data journey. He prefers to share his knowledge with others to make the data community stronger day by day through his blogs and meetup groups.
Read more about Prashant Kumar Mishra

Right arrow

Chapter 6: Working with T-SQL in Azure Synapse

Azure Synapse Structured Query Language (SQL) enables you to query your data using the Transact-SQL (T-SQL) language, which means you do not need to learn any new languages if you already have prior experience working with SQL. As we now know, Azure Synapse SQL supports two types of consumption models, dedicated and serverless, and you will notice some differences in the supported features of both models. In this chapter, we are going to cover T-SQL language elements that are supported in Synapse SQL pools. We will also learn how we can create stored procedures and views in Synapse SQL pools. As with SQL Server and Azure SQL, we will learn which system views are supported in a Synapse SQL pool.

This chapter will help you get familiar with the features supported in Azure Synapse SQL. We will learn how to use T-SQL queries on unstructured data as well.

We are going to cover the following topics in this chapter, which will help you...

Technical requirements

Before you start orchestrating your data, here are certain prerequisites that you should meet:

  • You should have an Azure subscription, or access to any other subscription with contributor-level access.
  • Create your Synapse workspace on this subscription. You can follow the instructions from Chapter 1, Introduction to Azure Synapse, to create your Synapse workspace.
  • Create a SQL pool on Azure Synapse. This has been covered in Chapter 2, Considerations For Your Compute Environment.
  • Download the script from the following link: http://bit.ly/T-SQL-samples.

Once you have met all the prerequisites, you can start transforming your business logic into code by using T-SQL. In the following section, we will learn about some of the supported features by using some sample queries.

Supporting T-SQL language elements in a Synapse SQL pool

The SELECT statement in T-SQL retrieves rows from a database and enables the selection of columns and rows from one or multiple tables in Azure Synapse SQL. You can use the SELECT statement with WHERE, GROUP BY, HAVING, and ORDER BY clauses in dedicated and serverless SQL pools. The syntax for the SELECT statement in Synapse SQL is similar to that found in Azure SQL Database or SQL Server.

The following code snippet provides an example of using a SELECT statement:

SELECT OrderDateKey, SUM(SalesAmount) AS TotalSales  
FROM FactInternetSales  
GROUP BY OrderDateKey  
HAVING OrderDateKey > 20010000  
ORDER BY OrderDateKey;

We can also create Common Table Expressions (CTEs) in Synapse SQL pools. We will learn about these in the following section.

CTEs

A CTE is a temporary result set that is used to simplify complex joins and subqueries. CTEs can also be used to query hierarchical...

Creating stored procedures and views in Synapse SQL

You can create stored procedures and views in a Synapse SQL pool using SSMS, Azure Data Studio, or Synapse Studio. In this section, we are going to learn the syntax for creating stored procedures and views. We will learn more about the usage and limitations of stored procedures and views further on in this section.

Stored procedures

A stored procedure is prepared SQL code that can be saved and reused. One important thing to keep in mind is that stored procedures are not precompiled in a Synapse SQL pool. When stored procedures are executed, SQL statements are parsed, translated, and optimized at runtime. As with SQL Server, you can pass parameters to stored procedures in a SQL pool as well.

The following code block provides a simple example of how to create stored procedures in a SQL pool:

CREATE PROCEDURE Usp_samplestoredprocedure (@MinPriceCondition MONEY, 
          ...

Optimizing transactions in Synapse SQL

In simple words, a group of data modification operations is called a transaction. If all operations are successful, we can call it a successful transaction. In a successful transaction, all the modifications are committed and become a permanent part of the database; otherwise, all the data modifications will be erased.

In this section, we are going to learn how to manage transactions in Synapse SQL pools. In the case of any failure, you will need to roll back all the changes made during the execution of stored procedures in order to maintain consistency in your data. You can handle any sort of exceptions in stored procedures using a TRY-CATCH block, as illustrated in the following code snippet:

SET NOCOUNT ON;
DECLARE @xact_state smallint = 0;
BEGIN TRAN
    BEGIN TRY
        DECLARE @i INT;        
       ...

Supporting system views in a Synapse SQL pool

System views are built-in views in SQL that are used to monitor the health of a SQL pool and diagnose problems associated with the performance of a SQL pool.

Synapse SQL supports various system views, and we will be covering a few of these in this section, as follows:

  • sys.pdw_column_distribution_properties: Holds distribution information for columns.
  • sys.pdw_distributions: Holds information about the distributions on the appliance. It lists one row per appliance distribution.
  • sys.pdw_index_mappings: Maps the logical indexes to the physical name used on compute nodes, as reflected by a unique combination of the object_id of the table holding the index and the index_id of a particular index within that table.
  • sys.pdw_loader_backup_run_details: Contains information about ongoing and completed backup and restore operations in Azure Synapse Analytics (SQL Data Warehouse).
  • sys.pdw_loader_backup_runs: This is similar...

Using T-SQL queries on semi-structured and unstructured data

Azure Synapse SQL on-demand allows you to query data in your data lake. The OPENROWSET function is used in SQL on-demand to query an external data source. We will learn how to use this function for reading different types of files.

Reading Parquet files

Parquet is an open source file format that is designed for efficient, as well as performant, flat columnar storage of data. Synapse provides a feature to read Parquet files directly, using the OPENROWSET function.

The easiest way to read a Parquet file's content is to provide the file Uniform Resource Locator (URL) to the OPENROWSET function and specify the Parquet format, as illustrated in the following code snippet:

select top 10 *
from openrowset(
    bulk 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.parquet',
    format ...

Summary

This chapter was primarily focused on Synapse SQL. We learned different T-SQL language elements that are supported in Synapse SQL, as well as their limitations. We learned how we can use T-SQL statements with structured, semi-structured, or unstructured data. In this chapter, we also covered how to manage transactions efficiently to avoid any transaction failures. We also learned that we could create stored procedures and views in Synapse SQL in a similar way to how we do this in SQL Server. Synapse SQL provides a few additional features to read data directly from a data lake.

We saw some of the system views supported in Synapse SQL. We also learned how to use sample scripts to build our logic as per the business need.

The next chapter will be more focused on Synapse Spark, where we will learn how to write code in different languages in Synapse Spark without worrying about infrastructure management. We will also learn how to use notebooks in Synapse Studio.

lock icon
The rest of the chapter is locked
You have been reading a chapter from
Limitless Analytics with Azure Synapse
Published in: Jun 2021Publisher: PacktISBN-13: 9781800205659
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Author (1)

author image
Prashant Kumar Mishra

Prashant Kumar Mishra is an engineering architect at Microsoft. He has more than 10 years of professional expertise in the Microsoft data and AI segment as a developer, consultant, and architect. He has been focused on Microsoft Azure Cloud technologies for several years now and has helped various customers in their data journey. He prefers to share his knowledge with others to make the data community stronger day by day through his blogs and meetup groups.
Read more about Prashant Kumar Mishra