You're reading from  Scalable Data Analytics with Azure Data Explorer

Product type: Book
Published in: Mar 2022
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781801078542
Edition: 1st Edition
Author: Jason Myerscough

Jason Myerscough is a director of Site Reliability Engineering and cloud architect at Nuance Communications. He has been working with Azure daily since 2015. He has migrated his company's flagship product to Azure and designed the environment to be secure and scalable across 16 different Azure regions by applying cloud best practices and governance. He is currently certified as an Azure Administrator (AZ-103) and an Azure DevOps Expert (AZ-400). He holds a first-class bachelor's degree with honors in software engineering and a first-class master's degree in computing.

Chapter 2: Building Your Azure Data Explorer Environment

In the previous chapter, we introduced the data analytics pipeline, Azure Data Explorer (ADX), and executed our first Kusto Query Language (KQL) query on a publicly available demo cluster provided by Microsoft.

In this chapter, we will assume you have just created a new Azure account and begin by creating a new subscription. Once we have a subscription, we can start creating Azure resources, such as Azure Cloud Shell and ADX instances.

Next, we will introduce you to Cloud Shell, provision our first Cloud Shell instance, and discover one of its least-known but most useful features: the lightweight code editor.

Then, we will create our first ADX cluster and database via the Azure portal, introduce the concept of Infrastructure as Code (IaC), and discuss its benefits and why it should be your preferred method for managing infrastructure on Azure.

Next, we will use Cloud Shell to recreate our ADX clusters...

Technical requirements

If you do not already have an Azure account, head over to https://azure.microsoft.com/en-us/free/search/ and sign up. Microsoft provides 12 months of popular free services and $200 credit at the time of writing.

The code examples for this chapter can be found in the Chapter02 folder of our repo: https://github.com/PacktPublishing/Scalable-Data-Analytics-with-Azure-Data-Explorer.git.

For the PowerShell examples, we are going to use Azure Cloud Shell rather than installing and configuring the tools locally on our machines.

Table 2.1 shows the PowerShell cmdlet versions used for the PowerShell examples:

Table 2.1 – PowerShell cmdlet version info

All the code examples, both ARM templates and PowerShell, have been tested on Azure Cloud Shell, macOS, and Windows 10.
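
To compare your own session with the versions the examples were tested against, you can list the module versions installed in your shell. The following is a minimal sketch. It assumes the modules of interest are Az.Kusto and Az.Resources, which is an assumption made for illustration rather than the contents of Table 2.1.

```powershell
# Assumption: these are the modules that provide the cmdlets used in this chapter.
$moduleNames = @('Az.Kusto', 'Az.Resources')

# Collect every installed version of each module (newest version first per module).
$installedModules = @(Get-Module -ListAvailable -Name $moduleNames |
    Sort-Object -Property Name, Version -Descending |
    Select-Object -Property Name, Version)

if ($installedModules.Count -eq 0) {
    # Nothing was found, so the modules are not installed in this session.
    Write-Output 'None of the listed modules are installed in this session.'
} else {
    $installedModules | Format-Table -AutoSize
}
```

If a module does not appear in the output, it is not installed in that session.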

Creating an Azure subscription

Once you have created your Azure account and logged in, you will see the Welcome to Azure page. You will be informed that you do not have a subscription.

There are different types of subscriptions, each with different constraints and conditions, such as free credit and Service Level Agreements (SLAs) for Azure services. For our purpose, we will use the Free Trial. If your trial expires, then you can upgrade the subscription to Pay-As-You-Go. For a complete list of Azure subscriptions, please see https://azure.microsoft.com/en-us/support/legal/offer-details/.

Before you can create any resources in Azure, you need to create a subscription. Subscriptions in Azure are logical containers, primarily used for billing. Any costs you incur will be assigned to your subscription. It is possible to have multiple subscriptions, but we only need one for our purposes.

As shown in Figure 2.1, the default behavior is to collapse the Azure portal menu. To view...

Introducing Azure Cloud Shell

One of the most convenient services, which I use daily, is Azure Cloud Shell. Azure Cloud Shell is an interactive command shell that supports Bash and PowerShell. Rather than installing the PowerShell Az module locally, we will use Azure Cloud Shell for the examples in this book. Azure Cloud Shell also includes a lightweight code editor that we will see later when we examine the code examples for this chapter.

Azure Cloud Shell requires a storage account, which can be created the first time you try to open Cloud Shell. Azure Cloud Shell can be accessed via https://shell.azure.com or by clicking the Azure Cloud Shell icon on the Azure portal's header menu (see Figure 2.6).

Figure 2.6 – Cloud Shell icon

The first time you open Azure Cloud Shell, you will be prompted to select either Bash or PowerShell as your default shell (see Figure 2.7). Select PowerShell. You can switch shells at any time.

...

Creating and configuring ADX instances in the Azure portal

As mentioned in the previous chapter, one of the benefits of Platform as a Service (PaaS) solutions is that they abstract a lot of the operational work away from you, allowing you to focus more on using the product rather than operating it. At a high level, there are four steps to get up and running:

  1. Creating a cluster.
  2. Creating a database.
  3. Ingesting data.
  4. Querying your data.

To create a cluster, you need to complete the following steps:

  1. In the portal menu, click All services and search for data explorer, as shown in Figure 2.10.

Figure 2.10 – Azure Data Explorer Clusters

  2. Click Azure Data Explorer Clusters to display the ADX blade. Create a new cluster by clicking + Create, as shown in Figure 2.11.

Figure 2.11 – Create a new cluster

The Create an Azure Data Explorer Cluster blade displays a wizard with different...

Introducing Infrastructure as Code

The Azure portal is a good place to start learning about different Azure resources. The simple user interface and rich documentation enable us to get up and running quickly. One disadvantage of using the portal is that the deployment process is not consistently repeatable. Every time you want to deploy a resource, you must step through the wizard, which increases the chance of human error, such as specifying the wrong location or, if you manage multiple subscriptions, the wrong subscription.

The preferred method for deploying and managing infrastructure is called Infrastructure as Code (IaC). IaC allows us to declare our infrastructure as code, giving us the benefits of software development practices such as CI/CD, source control, code reviews, and versioning. Since our infrastructure is in code, we can deploy it safely and consistently by deploying that code. This ability to deploy your infrastructure...
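
To make the idea concrete, here is a minimal sketch of capturing the wizard's choices in a script. It is an imperative PowerShell example, not the script from the book's repository. The function name New-DemoAdxCluster, the resource group, the cluster name, the region, and the SKU values are hypothetical placeholders. The final call is left commented out because it needs a signed-in Az session (Cloud Shell provides one) and creates billable resources.

```powershell
# Hypothetical values: replace them with your own.
# The cluster name must be globally unique.
$clusterParams = @{
    ResourceGroupName = 'adx-rg'
    Name              = 'adxdemo0001'
    Location          = 'westeurope'
    SkuName           = 'Dev(No SLA)_Standard_D11_v2'   # assumed dev/test SKU
    SkuTier           = 'Basic'
}

function New-DemoAdxCluster {
    param(
        [Parameter(Mandatory)] [string]    $SubscriptionName,
        [Parameter(Mandatory)] [hashtable] $ClusterParams
    )

    # Select the subscription explicitly so the deployment cannot land in the wrong one.
    Set-AzContext -Subscription $SubscriptionName | Out-Null

    # Create the resource group; -Force suppresses the prompt if it already exists.
    New-AzResourceGroup -Name $ClusterParams.ResourceGroupName `
        -Location $ClusterParams.Location -Force | Out-Null

    # Splat the hashtable so every run uses exactly the same settings.
    New-AzKustoCluster @ClusterParams
}

# Example call (the subscription name is hypothetical):
# New-DemoAdxCluster -SubscriptionName 'Free Trial' -ClusterParams $clusterParams
```

Because the location and subscription live in the script rather than in a wizard, they can be reviewed and versioned like any other code.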

Creating and configuring ADX instances with PowerShell

As mentioned earlier, we will use Azure Cloud Shell for this exercise, so if you have been following along, you should have provisioned Azure Cloud Shell. If you have not, please be sure to read the earlier section, Introducing Azure Cloud Shell.

Let's begin by cloning the repository in Azure Cloud Shell:

  1. Open a new browser tab and go to https://shell.azure.com. The URL takes you directly to Cloud Shell.
  2. Ensure your current shell is PowerShell. You will see the name of your current shell in the top-left corner, as shown in Figure 2.24:

Figure 2.24 – Cloud Shell

  3. Create a new directory called development, by typing mkdir development, and navigate into the directory: cd development.
  4. Clone our Git repository: git clone https://github.com/PacktPublishing/Scalable-Data-Analytics-with-Azure-Data-Explorer.git.
  5. Navigate into the Scalable-Data-Analytics-with-Azure-Data...

Creating ADX clusters with ARM templates

ARM templates are declarative JavaScript Object Notation (JSON) files that we use to define our infrastructure and configuration requirements. There is a lot of debate about how readable JSON is, and at the time of writing, Microsoft has a new tool in preview called Bicep, which is similar to Terraform's HashiCorp Configuration Language (HCL). We are not going to compare ARM with Bicep or Terraform; each tool has a purpose, and what you choose ultimately depends on your requirements.

It is not possible to cover all aspects of ARM templates in this short chapter, so we will cover the basics to get started.

ARM template structure

As shown in the following code snippet, ARM templates consist of six sections. There is a seventh section called functions, which is rarely used, and I will not cover it here. I have only used the functions section once:

{
    "$schema": "https...
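
As a hedged illustration, an empty template skeleton can be written and checked locally as follows. It assumes the six sections are $schema, contentVersion, parameters, variables, resources, and outputs. The snippet only parses the JSON and lists its top-level sections; it does not deploy anything and is not the template from the book's repository.

```powershell
# Single-quoted here-string so PowerShell does not try to expand "$schema".
$templateJson = @'
{
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {},
    "variables": {},
    "resources": [],
    "outputs": {}
}
'@

# Parse the JSON to confirm it is well formed.
$template = $templateJson | ConvertFrom-Json

# List the top-level sections of the template.
$sections = @($template.PSObject.Properties.Name)
$sections
```

Saved to a file, a skeleton like this could be passed to New-AzResourceGroupDeployment through its -TemplateFile parameter. With an empty resources array, it would deploy nothing.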

Summary

Well done! You made it this far. We covered a lot of topics, and you should be proud of what we accomplished. We started by creating our first subscription and activated our free trial, which is valid for 30 days at the time of writing. Then we learned about Cloud Shell, which is a web-based console that provides PowerShell and Bash terminals directly in the Azure portal. We also learned about the lightweight code editor embedded in Cloud Shell, which is a very convenient feature. We then provisioned our own Cloud Shell instance, allowing us to create ADX clusters and databases via PowerShell and ARM templates.

Next, we created our first ADX cluster and database using the Azure portal and then deleted them. We also learned about the benefits of Infrastructure as Code and were introduced to the declarative and imperative programming paradigms.

Then, we learned how to create ADX clusters and databases using PowerShell cmdlets and had a quick introduction to ARM templates, where we learned...

Questions

Before moving on to the next chapter, test your knowledge by trying these exercises. The answers can be found at the back of the book:

  1. Modify adx-powershell.ps1 and try to deploy another cluster with doubleEncryption enabled. Hint: New-AzKustoCluster has an optional parameter called -EnableDoubleEncryption.
  2. Create a new parameter file and enable purging and doubleEncryption, then change the hot cache period to 10 days and the soft delete period to 100 days.
  3. In the Azure portal, create a second ADX database and set the hot cache to 10 days and soft delete to 50 days.
  4. Modify adx-powershell.ps1 and deploy your ADX cluster to an Azure region that is close to you.
  5. What is the difference between the hot cache and soft delete retention periods?
  6. What shells are supported by Cloud Shell?
  7. How do you open the code editor in Cloud Shell?