You're reading from Azure Databricks Cookbook

Published in: Sep 2021
Publisher: Packt
ISBN-13: 9781789809718
Edition: 1st

Authors (2): Phani Raj, Vinod Jaiswal

Phani Raj
Phani Raj is an experienced data architect and product manager with 15 years of experience working with customers to build data platforms both on-premises and in the cloud. He has designed and implemented large-scale big data solutions for customers across different verticals. His passion for continuous learning and adapting to the dynamic nature of technology underscores his role as a trusted advisor in data architecture, data science, and product management.

Vinod Jaiswal

Vinod Jaiswal is an experienced data engineer who excels in transforming raw data into valuable insights. With over 8 years of experience with Databricks, he designs and implements data pipelines, optimizes workflows, and crafts scalable solutions for intricate data challenges. Collaborating seamlessly with diverse teams, Vinod empowers them with the tools and expertise to leverage data effectively. His dedication to staying up to date on the latest data engineering trends ensures cutting-edge, robust solutions. Beyond his technical prowess, Vinod is a proficient educator: through presentations and mentoring, he shares his expertise, enabling others to harness the power of data within the Databricks ecosystem.

Chapter 8: Databricks SQL

Databricks SQL provides a great experience for SQL developers, BI developers, analysts, and data scientists to run ad hoc queries on large volumes of data in a data lake and to create various visualizations and rich dashboards.

Databricks SQL provides the following features:

  • A fully managed SQL endpoint for running all SQL queries
  • A query editor for writing SQL queries
  • Visualizations and dashboards for providing various insights into the data
  • Integration with Azure Active Directory, providing enterprise-level security for data by controlling the access to tables using role-based access controls
  • Integration with Power BI for creating rich visualizations and sharing meaningful insights from the data in a data lake
  • The ability to create alerts that notify users when a field returned by a query meets a threshold value

By the end of this chapter, you will have learned how you can use Databricks SQL to write ad hoc queries...

Technical requirements

To follow along with the examples shown in the recipes, you will need to have the following:

  • An Azure subscription and the required permissions on the subscription that was mentioned in the Technical requirements section in Chapter 1, Creating an Azure Databricks Service.
  • An Azure Databricks premium workspace with a Spark 3.x cluster.
  • Access to Databricks SQL, which is in public preview at the time of writing.

    Important Note

    The UI and some features of Databricks SQL may change in the future when it goes to General Availability (GA) as it is still in public preview.

In the next section, you will learn how to create a user in Databricks SQL.

How to create a user in Databricks SQL

In this recipe, you will learn how to create a user for running queries, creating dashboards, or performing data analysis on top of the data available in Delta Lake.

Getting ready

Before starting with this recipe, you need to ensure that you have the resources mentioned in the Technical requirements section of this chapter.

How to do it…

Let's go through the steps for creating a Databricks SQL user:

  1. Open your Databricks workspace with Databricks SQL access:

    Figure 8.1 – Databricks SQL workspace

  2. Click on Admin Console and go to the Users tab to add a user:

    Figure 8.2 – Databricks workspace Admin Console

  3. Click on the Add User button and enter the email ID of the user whom you want to add. You can only add users who belong to the Azure Active Directory tenant of your Azure Databricks workspace. Click on OK to add the user to the Databricks workspace:

    Figure 8.3 – Add a Databricks SQL user...

Creating SQL endpoints

SQL endpoints are compute resources that let you run SQL queries on data objects in the Azure Databricks environment. They are fully managed, SQL-optimized compute clusters that autoscale based on user load to deliver better concurrency and performance. They are very similar to the clusters we have already used in the Azure Databricks environment.
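Once an endpoint is running, client tools connect to it using a server hostname and an HTTP path from the endpoint's Connection Details tab. The helper below is a minimal sketch of how those connection values fit together for a client such as the databricks-sql-connector Python package; the hostname, endpoint ID, and token are hypothetical placeholders, and the HTTP path format is an assumption, so always copy the real values from the UI:

```python
from typing import Dict

def endpoint_connection_args(workspace_host: str, endpoint_id: str, token: str) -> Dict[str, str]:
    # Assemble the three values that SQL clients typically ask for when
    # connecting to a SQL endpoint. The http_path format below is an
    # assumption; copy the actual value from the endpoint's
    # Connection Details tab in the Databricks SQL UI.
    return {
        "server_hostname": workspace_host,
        "http_path": f"/sql/1.0/endpoints/{endpoint_id}",
        "access_token": token,
    }

args = endpoint_connection_args(
    "adb-1234567890123456.7.azuredatabricks.net",  # hypothetical workspace host
    "1234567890abcdef",                            # hypothetical endpoint ID
    "dapi-placeholder-token",                      # personal access token
)
print(args["http_path"])  # → /sql/1.0/endpoints/1234567890abcdef
```

The same three values are what Power BI and other BI tools prompt for later in this chapter.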

Getting ready

Let's go through the permissions required for creating and managing SQL endpoints in the Azure Databricks workspace.

The user must have the Allow cluster creation permission in the Azure Databricks workspace. You can check this permission from Admin Console | the Users tab or the Groups tab depending on whether access needs to be granted to an individual user or group. Refer to the following screenshot from the workspace:

Figure 8.5 – Databricks SQL cluster creation

After permission is granted...

Granting access to objects to the user

In this recipe, you will learn how to grant access to objects to users so that they can write queries for analyzing data. We have already seen how users are created in the How to create a user in Databricks SQL recipe of this chapter.

Getting ready

Before you start on this recipe, make sure you have gone through the How to create a user in Databricks SQL recipe in this chapter and that all the resources mentioned in the Technical requirements section are ready for usage.

Also, run the following notebook to create Customer and Orders external Delta tables:

https://github.com/PacktPublishing/Azure-Databricks-Cookbook/blob/main/Chapter06/6_1.Reading%20Writing%20to%20Delta%20Tables.ipynb.

How to do it…

Let's run through the following commands to grant access to a user:

  1. Execute the following command to grant access to a user or principal to the default database:
    GRANT USAGE ON DATABASE default TO `user@xyz.com...
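As a sketch of how a fuller grant sequence might look, the statements below assume the Customer and Orders tables from the Chapter 6 notebook were created in the default database; the email address is a placeholder, and the exact privileges you grant depend on your own requirements:

```sql
-- Allow the user to see and use the default database
GRANT USAGE ON DATABASE default TO `user@xyz.com`;

-- Allow the user to list the objects in the database
GRANT READ_METADATA ON DATABASE default TO `user@xyz.com`;

-- Allow the user to query specific tables
GRANT SELECT ON TABLE default.customer TO `user@xyz.com`;
GRANT SELECT ON TABLE default.orders TO `user@xyz.com`;

-- Verify what the user can now access
SHOW GRANT `user@xyz.com` ON DATABASE default;
```

Granting USAGE alone does not let the user read any data; it only makes the database usable, so the SELECT grants on individual tables are what actually permit queries.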

Running SQL queries in Databricks SQL

SQL queries in Databricks SQL allow BI users and data analysts to create and run ad hoc SQL queries on data in a data lake and to schedule those queries to run at regular intervals. Analysts can create reports based on business requirements, and traditional BI users are easy to onboard to Databricks SQL, where writing SQL queries gives them an experience similar to the one they are used to with on-premises databases.

Getting ready

Before starting, execute the following notebooks:

You need to create the SQL endpoints...

Using query parameters and filters

Query parameters and filters are two ways to filter the data that is returned to the end user. A query parameter substitutes a value into the query at runtime, before it is executed, whereas a query filter limits the data after it has been loaded into the browser. Query filters should therefore be used only for small datasets, not for large volumes of data, since they do not filter the data at query time. In this recipe, you will learn how to use query filters and parameters in SQL queries.
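As a sketch of the parameter syntax, the Databricks SQL query editor uses double curly braces to mark a parameter; the query below assumes the Orders table created by the chapter's notebooks, with a hypothetical parameter named order_status, so substitute your own table, columns, and parameter name:

```sql
-- {{ order_status }} is replaced with the value the user picks
-- in the query editor before the query runs on the SQL endpoint
SELECT o_orderkey,
       o_custkey,
       o_totalprice
FROM   default.orders
WHERE  o_orderstatus = '{{ order_status }}'
LIMIT  100;
```

When you save the query, the editor detects the parameter and renders an input widget for it, so the same saved query can serve different parameter values without being edited.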

Getting ready

Before starting, you need to ensure you execute the following notebooks:

Introduction to visualizations in Databricks SQL

In this recipe, we will learn how to create different visualizations in Databricks SQL queries and how to change certain properties of visualizations.

Getting ready

Before starting, you need to ensure you execute the following notebook so that the tables used in the queries are created:

Running the preceding notebooks will create Customer, Orders, and VehicleSensor related Delta tables. Ensure you complete the Running SQL queries in Databricks SQL and Using query parameters and filters recipes as you will be using the queries created in those recipes.

How to do it…

In this section, you will learn how to create visualizations by running through the following...

Creating dashboards in Databricks SQL

Databricks SQL in Azure Databricks allows users to create various dashboards based on the queries and visualizations that have already been built. This helps businesses get a visual representation of data for various KPIs.

In this recipe, you will learn how to create visualizations in Databricks SQL and how to pin a visualization from various queries to the dashboard.

Getting ready

Before starting, we need to ensure we have executed the following notebook. The following notebook creates the required tables on which we can build our visualizations and dashboard:

https://github.com/PacktPublishing/Azure-Databricks-Cookbook/blob/main/Chapter07/7.1-End-to-End%20Data%20Pipeline.ipynb

Also create the visualization as mentioned in the Introduction to visualizations in Databricks SQL recipe.

How to do it…

Let's learn how to create a dashboard by running through the following steps:

  1. From the Databricks SQL homepage go...

Connecting Power BI to Databricks SQL

Databricks SQL provides built-in connectors for Power BI users to connect to objects in Databricks SQL. Power BI users can now connect to a SQL endpoint to get a list of all tables and can connect using either Import or DirectQuery mode.

In this recipe, you will learn how to connect to a SQL endpoint from Power BI.

Getting ready

Before starting the recipe, ensure you have executed both of the following notebooks, which create the required tables on which we can build our visualizations and dashboard. You also need to download the latest version of Power BI Desktop:

You can go through the following link which has the requirements for connecting Power...
