Reader small image

You're reading from  Essential PySpark for Scalable Data Analytics

Product typeBook
Published inOct 2021
Reading LevelBeginner
PublisherPackt
ISBN-139781800568877
Edition1st Edition
Languages
Tools
Concepts
Right arrow
Author (1)
Sreeram Nudurupati
Sreeram Nudurupati
author image
Sreeram Nudurupati

Sreeram Nudurupati is a data analytics professional with years of experience in designing and optimizing data analytics pipelines at scale. He has a history of helping enterprises, as well as digital natives, build optimized analytics pipelines by using the knowledge of the organization, infrastructure environment, and current technologies.
Read more about Sreeram Nudurupati

Right arrow

Spark SQL language reference

Being a part of the overarching Hadoop ecosystem, Spark has traditionally been Hive-compliant. While the Hive query language diverges greatly from ANSI SQL standards, Spark 3.0 Spark SQL can be made ANSI SQL-compliant using a spark.sql.ansi.enabled configuration. With this configuration enabled, Spark SQL uses an ANSI SQL-compliant dialect instead of a Hive dialect.

Even with ANSI SQL compliance enabled, Spark SQL may not entirely conform to ANSI SQL dialect, and in this section, we will explore some of the prominent DDL and DML syntax of Spark SQL.

Spark SQL DDL

The syntax for creating a database and a table using Spark SQL is presented as follows:

CREATE DATABASE IF NOT EXISTS feature_store;
CREATE TABLE IF NOT EXISTS feature_store.retail_features
USING DELTA
LOCATION '/FileStore/shared_uploads/delta/retail_features.delta';

In the previous code block, we do the following:

  • First, we create a database if it doesn't...
lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
Essential PySpark for Scalable Data Analytics
Published in: Oct 2021Publisher: PacktISBN-13: 9781800568877

Author (1)

author image
Sreeram Nudurupati

Sreeram Nudurupati is a data analytics professional with years of experience in designing and optimizing data analytics pipelines at scale. He has a history of helping enterprises, as well as digital natives, build optimized analytics pipelines by using the knowledge of the organization, infrastructure environment, and current technologies.
Read more about Sreeram Nudurupati