You're reading from The Self-Taught Cloud Computing Engineer

Product type Book

Published in Sep 2023

Publisher Packt

ISBN-13 9781805123705

Pages 472 pages

Edition 1st Edition

Concepts

Cloud Computing

Author (1):

Dr. Logan Song

Table of Contents (24) Chapters

Preface

1. Part 1: Learning about the Amazon Cloud

2. Chapter 1: Amazon EC2 and Compute Services

3. Chapter 2: Amazon Cloud Storage Services

4. Chapter 3: Amazon Networking Services

5. Chapter 4: Amazon Database Services

6. Chapter 5: Amazon Data Analytics Services

7. Chapter 6: Amazon Machine Learning Services

8. Chapter 7: Amazon Cloud Security Services

9. Part 2: Comprehending GCP Cloud Services

10. Chapter 8: Google Cloud Foundation Services

11. Chapter 9: Google Cloud’s Database and Big Data Services

12. Chapter 10: Google Cloud AI Services

13. Chapter 11: Google Cloud Security Services

14. Part 3: Mastering Azure Cloud Services

15. Chapter 12: Microsoft Azure Cloud Foundation Services

16. Chapter 13: Azure Cloud Database and Big Data Services

17. Chapter 14: Azure Cloud AI Services

18. Chapter 15: Azure Cloud Security Services

19. Part 4: Developing a Successful Cloud Career

20. Chapter 16: Achieving Cloud Certifications

21. Chapter 17: Building a Successful Cloud Computing Career

22. Index

23. Other Books You May Enjoy

Azure Cloud Database and Big Data Services

In the first part of the book, we discussed the AWS database and big data services. In the second part, we covered the Google database and big data services. Now, in the third part, having discussed Microsoft Azure’s foundational cloud services in the last chapter, we will focus on the Azure database and big data services. These resemble the AWS and Google data services but have their own distinguishing features.

Like Amazon and Google, Microsoft provides many solid data storage and analytics services in its Azure cloud platform. In this chapter, we will cover the following topics:

Azure Cloud Data Storage: explores some basic concepts about Azure storage accounts and Azure Data Lake Storage
Azure Database Services: examines Azure database services such as Azure SQL Database, Azure NoSQL database solutions including Azure Table Storage and Cosmos DB, and Azure data warehouses and Azure Synapse Analytics
Azure...

Azure cloud data storage

When we launched Azure Cloud Shell in the previous chapter, you may have noticed that we first created an Azure storage account. An Azure storage account provides a unique namespace for your Azure cloud data, accessible from anywhere in the world over HTTP or HTTPS. Creating a storage account makes four storage services available in an all-in-one fashion: blobs, files, queues, and tables:

Azure Blobs is an object storage service, like Amazon S3 or Google Cloud Storage (GCS)
Azure Files lets you manage file shares in the cloud, shareable with both Azure VMs and on-premises VMs
Azure Queue Storage is a cloud service, similar to Amazon Simple Queue Service (SQS), for storing large numbers of messages
Azure table storage stores structured NoSQL cloud data, with a key/attribute store and a schema-less design

An Azure storage account provides all these storage and data services...
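To make the "accessible over HTTP or HTTPS" point concrete, here is a minimal sketch of how each of the four services gets its own default endpoint under a storage account's name. The account name `mystorageacct` and the helper `storage_endpoints` are hypothetical, introduced only for illustration. The URL pattern assumes the default endpoints of the Azure public cloud; custom domains and other cloud environments use different hostnames.

```python
# Sketch: derive the default public-cloud endpoints of an Azure storage account.
# "mystorageacct" is a hypothetical account name used only for illustration.

SERVICES = ("blob", "file", "queue", "table")


def storage_endpoints(account_name: str, use_https: bool = True) -> dict[str, str]:
    """Return the default endpoint URL for each storage service in the account."""
    scheme = "https" if use_https else "http"
    return {
        service: f"{scheme}://{account_name}.{service}.core.windows.net"
        for service in SERVICES
    }


if __name__ == "__main__":
    # Print one endpoint per service for the hypothetical account.
    for service, url in storage_endpoints("mystorageacct").items():
        print(f"{service:>5}: {url}")
```

Because the account name is part of every hostname, it must be unique across Azure, which is why a storage account acts as a single namespace for all four services.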

Azure cloud databases

While an Azure data lake stores raw data, an Azure database usually stores formatted data. Azure offers cloud database services categorized into relational databases and NoSQL databases.

Azure cloud relational databases

Like AWS and GCP, Azure offers three options for cloud relational database deployment and usage:

Azure SQL virtual machines: SQL Server running on Azure virtual machines. This is an Infrastructure-as-a-Service (IaaS) cloud service, so you control the database edition, version, and size. You are also fully responsible for managing the virtual machine, including patching and other configuration management. More information is available at https://azure.microsoft.com/en-us/products/virtual-machines/sql-server.
Azure SQL managed instances: This is the best for migrating on-premises SQL databases to the cloud. It serves the purpose of migrating many apps from on-premises to a fully managed Platform-as-a-Service...

Azure cloud big data services

Like Amazon and Google, Microsoft provides a full stack of big data cloud services, including the big data ETL service, Azure Data Factory (ADF); the big data processing service, Azure HDInsight; and the big data analytics service, Azure Databricks.

Azure ADF

ADF is a cloud-based data integration service for creating data-driven workflows that automatically move and transform data. In ADF, work is organized into pipelines. A pipeline is a logical grouping of activities that together perform a data-driven task, such as the following:

Data moving – This takes an ingestion source, pulls it into the Azure cloud, and puts it into a data lake
Data transformation – This connects the data lake to Databricks, runs a stored procedure, and transforms data to produce a new dataset for further analytics

Essentially, ADF is a data ETL service integrating hybrid data at an enterprise level. More details about ADF are available at https://azure.microsoft.com/en-us/products/data-factory...
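The idea of a pipeline as a logical grouping of activities can be illustrated with a small, plain-Python model. This is only a conceptual sketch and does not use the ADF SDK or its JSON pipeline format. The classes `Activity` and `Pipeline`, the functions `copy_to_lake` and `add_total`, and the sample data are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable

Rows = list[dict]


@dataclass
class Activity:
    """One step in a pipeline: a name plus a function from rows to rows."""
    name: str
    run: Callable[[Rows], Rows]


@dataclass
class Pipeline:
    """A logical grouping of activities, executed in order."""
    name: str
    activities: list[Activity] = field(default_factory=list)

    def execute(self, data: Rows) -> Rows:
        for activity in self.activities:
            data = activity.run(data)
        return data


def copy_to_lake(rows: Rows) -> Rows:
    """Data moving: stand-in for copying source rows into a data lake."""
    return [dict(row) for row in rows]


def add_total(rows: Rows) -> Rows:
    """Data transformation: derive a new column for further analytics."""
    return [{**row, "total": row["price"] * row["quantity"]} for row in rows]


pipeline = Pipeline(
    name="sales_etl",
    activities=[
        Activity("copy_to_lake", copy_to_lake),
        Activity("add_total", add_total),
    ],
)

source = [{"price": 2.0, "quantity": 3}, {"price": 5.0, "quantity": 1}]
result = pipeline.execute(source)
print(result)
```

In this model the move step and the transform step are independent activities, and the pipeline only fixes their order. A real ADF pipeline is defined declaratively and run by the service rather than by in-process function calls.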

Summary

In this chapter, we learned about the Azure cloud database and big data services. We explored Azure Data Lake Storage, Azure cloud databases and Azure Synapse Analytics, Azure data ETL tools such as ADF, data processing tools such as HDInsight, and Azure data analytics tools such as Databricks. You should now have a working knowledge of data ingestion, storage, processing, and visualization in the Azure cloud.

In the next chapter, we will examine the machine learning services in Azure’s cloud.

Practice questions

Questions 1-3 are based on the following.

The data team for company ABC is building an Azure cloud data analytics platform, with the following objectives:

The team has two data scientists who are familiar with R, Scala, and Python, and two data engineers who are good at Python. Each team member needs a cluster.
The team needs to run notebooks that use Python, Scala, and SQL for their job workloads.
The team needs to optimize their work performance.

1. What platform fits a data scientist?

A. A High Concurrency Databricks cluster

B. A standard Databricks cluster

C. An ADF pipeline

D. The Azure Synapse platform

2. What platform fits a data engineer?

A. A High Concurrency Databricks cluster

B. A standard Databricks cluster

C. An ADF pipeline

D. The Azure Synapse platform

3. What platform fits the job workload?

A. A High Concurrency Databricks cluster