You're reading from Limitless Analytics with Azure Synapse

Product type: Book
Published in: Jun 2021
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800205659
Edition: 1st Edition
Author: Prashant Kumar Mishra

Prashant Kumar Mishra is an engineering architect at Microsoft. He has more than 10 years of professional expertise in the Microsoft data and AI segment as a developer, consultant, and architect. He has been focused on Microsoft Azure Cloud technologies for several years now and has helped various customers in their data journey. He prefers to share his knowledge with others to make the data community stronger day by day through his blogs and meetup groups.

Chapter 14: Coding Best Practices

Azure Synapse allows you to create a Structured Query Language (SQL) pool or an Apache Spark pool with just a couple of clicks, without worrying too much about maintenance and management. However, you need to follow certain best practices in order to utilize these pools effectively and efficiently.

This chapter matters most for production environments. When you create a SQL or Spark pool in production, you should follow coding and development best practices. The chapter focuses on best practices for coding, development, workload management, and cost management, for both SQL and Spark pools on Azure Synapse.

In this chapter, we will cover the following topics:

  • Implementing best practices for a Synapse dedicated SQL pool
  • Implementing best practices for a Synapse serverless SQL pool
  • Implementing best practices for a Synapse Spark pool

Technical requirements

To follow the instructions in the next sections, you need the following prerequisites:

  • You need your own Azure subscription, or contributor-level access to another subscription.
  • Create your Synapse workspace in this subscription. You can follow the instructions from Chapter 1, Introduction to Azure Synapse, to create your Synapse workspace.
  • Create a SQL pool and a Spark pool on Azure Synapse. This has been covered in Chapter 2, Consideration of Your Compute Environments.

Implementing best practices for a Synapse dedicated SQL pool

In the previous chapters, we learned many things about Synapse dedicated SQL pools. In this section, we focus on best practices for maintaining your dedicated SQL pool and keeping it healthy from both a compute and a storage point of view.

Optimized code is necessary for good performance, but it is not the only factor to consider. You may have seen a query that performed well until last week and then suddenly slowed down drastically. So, how do you avoid such hiccups in your production environment? In the following sections, we are going to learn about a couple of features and practices that keep your query performance consistently healthy.

Maintaining statistics

Statistics play a critical role in query performance. They provide a distribution of column values to the query optimizer, and that is used by the SQL...
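As a minimal sketch of what routine statistics maintenance can look like, the following Python helper builds T-SQL `CREATE STATISTICS` and `UPDATE STATISTICS` statements for a table. The schema, table, and column names (`dbo`, `FactSales`, `CustomerKey`, `OrderDate`) and the `stats_<table>_<column>` naming pattern are assumptions made for illustration; they do not come from the book. The helper only produces the statement text, which you would then run against your dedicated SQL pool with the client of your choice.

```python
def create_statistics_sql(schema, table, columns):
    """Return one single-column CREATE STATISTICS statement per column."""
    statements = []
    for column in columns:
        # Hypothetical naming convention: stats_<table>_<column>.
        stats_name = f"stats_{table}_{column}"
        statements.append(
            f"CREATE STATISTICS [{stats_name}] ON [{schema}].[{table}] ([{column}]);"
        )
    return statements


def update_statistics_sql(schema, table):
    """Return a statement that refreshes all statistics objects on the table."""
    return f"UPDATE STATISTICS [{schema}].[{table}];"


# Hypothetical table and columns, used only for illustration.
for statement in create_statistics_sql("dbo", "FactSales", ["CustomerKey", "OrderDate"]):
    print(statement)
print(update_statistics_sql("dbo", "FactSales"))
```

A common choice is to create statistics on columns used in joins, filters, and aggregations, and to refresh them after large data loads, but which columns to cover depends on your own workload.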

Implementing best practices for a Synapse serverless SQL pool

Some of the best practices discussed in the preceding section also apply to a serverless SQL pool; however, there are a few other considerations for serverless SQL pools. We are going to learn about some of these recommendations in the following sections.

Selecting the region to create a serverless SQL pool

If you create a storage account while creating a Synapse workspace, your serverless SQL pool and storage account are created in the same region as the workspace. If you plan to access other storage accounts, create your workspace in the same region as those accounts. Reading data from a different region adds network latency to every query, which you can avoid by keeping your serverless SQL pool and your storage account in the same region.

You need to keep in mind that once the workspace is created, you cannot change the...

Implementing best practices for a Synapse Spark pool

As with Synapse SQL pools, it is important to keep your Spark pool healthy. In this section, we are going to learn how to optimize the cluster configuration for a particular workload. We will also learn various techniques for enhancing Apache Spark performance.

Configuring the Auto-pause setting

Using Platform-as-a-Service (PaaS) instead of an on-premises environment has some major advantages, and the Auto-pause setting is one of the best features that PaaS has to offer. If you run a Spark cluster in your on-premises environment, you pay for the provisioned hardware even though you may only use the cluster for a couple of hours a day. Synapse, however, lets you configure the Auto-pause setting to pause a cluster automatically when it is not in use. Upon entering a value for the Number of minutes idle field within the Auto-pause setting, the Spark pool will go to a Pause state automatically...
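To get a rough feel for how much running time Auto-pause can save, the following Python sketch estimates how many minutes a pool stays up in a day for a given idle timeout. It rests on a simplified model that is an assumption, not Synapse's actual billing logic: the pool is up while a job runs and for the configured idle minutes afterwards, unless another job starts before it pauses. Startup time is ignored, and the job windows are hypothetical.

```python
def minutes_up(jobs, idle_minutes):
    """Estimate minutes a pool stays up, given (start, end) job windows in minutes.

    Simplified model (an assumption): the pool runs during each job and for
    `idle_minutes` after it ends, unless another job starts before it pauses.
    """
    total = 0
    period_start = None  # minute at which the current running period began
    up_until = None      # minute at which the pool would auto-pause
    for start, end in sorted(jobs):
        if up_until is None or start > up_until:
            # The pool was paused, so close the previous running period (if any)
            # and begin a new one with this job.
            if up_until is not None:
                total += up_until - period_start
            period_start = start
            up_until = end + idle_minutes
        else:
            # The pool is still up, so this job extends the running period.
            up_until = max(up_until, end + idle_minutes)
    if up_until is not None:
        total += up_until - period_start
    return total


# Hypothetical job windows, in minutes since midnight: 09:00-10:00, 10:10-11:00, 15:00-16:00.
jobs = [(540, 600), (610, 660), (900, 960)]
with_auto_pause = minutes_up(jobs, idle_minutes=15)
always_on = 24 * 60
print(with_auto_pause, always_on - with_auto_pause)  # 210 1230
```

In this example, the second job starts before the 15-minute idle window of the first one expires, so the two are counted as a single running period. A shorter idle timeout saves more running time but makes it more likely that the next job has to wait for the pool to start again.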

Summary

This chapter concludes the entire book. In this chapter, we learned about implementing the best practices for Synapse SQL pools and Spark pools. We learned how to keep indexes healthy in a SQL pool to gain better performance, and we also learned about using PolyBase and materialized views in Synapse dedicated SQL pools for enhanced performance. This chapter also covered the recommended file type and size to use with a Synapse serverless SQL pool. Configuring the Auto-pause setting to save on compute costs was also highlighted in this chapter. Last but not least, we learned about memory considerations and bucketing in a Spark pool.

I am thankful to you for traveling with me on this learning journey. Congratulations on reaching the finish line in this book, and I wish you all the best as you continue exploring Azure Synapse.

I hope to meet you again on my next learning journey!
