You're reading from Limitless Analytics with Azure Synapse

Product type Book

Published in Jun 2021

Publisher Packt

ISBN-13 9781800205659

Pages 392 pages

Edition 1st Edition

Languages

Python

Concepts

Data Science

Author (1):

Prashant Kumar Mishra

Understanding Spark pools

Apache Spark is a unified analytics engine for large-scale data processing and machine learning.

A Synapse Spark pool is one of Microsoft's implementations of Apache Spark in Azure. A Synapse Analytics workspace has a Spark engine built in, along with notebook support. Because Synapse Spark supports C#, you can write .NET for Apache Spark code directly within notebooks. You can also write your code in Python, Scala, and Spark SQL.

One Spark pool can be accessed by multiple users, but each user gets their own Spark instance. Whether a later job from the same user reuses that instance depends on capacity: if the existing instance has enough spare capacity, it processes the job; otherwise, a new instance is created to process the job, provided the pool itself still has capacity.
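The reuse-or-create rule described above can be made concrete with a small toy model. This is only a sketch under stated assumptions, not Synapse's actual scheduler: the `SparkPool` and `SparkInstance` classes, the `submit` method, and the idea of measuring capacity in abstract "nodes" are all hypothetical names introduced for illustration, and the "rejected" outcome is a simplification of whatever the service does when the pool is full.

```python
from dataclasses import dataclass, field


@dataclass
class SparkInstance:
    """Hypothetical stand-in for a Spark instance owned by one user."""
    owner: str
    nodes: int           # nodes reserved from the pool for this instance
    busy_nodes: int = 0  # nodes currently occupied by running jobs

    def free_nodes(self) -> int:
        return self.nodes - self.busy_nodes


@dataclass
class SparkPool:
    """Toy pool: a fixed node budget shared by all instances (an assumption)."""
    max_nodes: int
    instances: list = field(default_factory=list)

    def free_nodes(self) -> int:
        # Capacity not yet reserved by any instance.
        return self.max_nodes - sum(i.nodes for i in self.instances)

    def submit(self, user: str, job_nodes: int, instance_nodes: int) -> str:
        # 1. Reuse one of the user's own instances if it has spare capacity.
        for inst in self.instances:
            if inst.owner == user and inst.free_nodes() >= job_nodes:
                inst.busy_nodes += job_nodes
                return "reused existing instance"
        # 2. Otherwise create a new instance, if the pool can still fit one.
        if job_nodes <= instance_nodes <= self.free_nodes():
            self.instances.append(SparkInstance(user, instance_nodes, job_nodes))
            return "created new instance"
        # 3. No room left; real behavior at this point is outside this sketch.
        return "rejected: pool at capacity"


pool = SparkPool(max_nodes=10)
results = [
    pool.submit("alice", job_nodes=2, instance_nodes=4),  # first job: new instance
    pool.submit("alice", job_nodes=2, instance_nodes=4),  # fits in alice's instance
    pool.submit("bob", job_nodes=1, instance_nodes=4),    # other user: new instance
    pool.submit("alice", job_nodes=3, instance_nodes=4),  # instance full, pool too small
]
for outcome in results:
    print(outcome)
```

In this run the two users end up with one instance each, and the last job is turned away because neither the existing instance nor the remaining pool capacity can hold it. The point of the sketch is only the order of the checks: the user's existing instance first, then the pool.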

The following diagram displays different components of Apache Spark on Azure Synapse: