The Self-Taught Cloud Computing Engineer

Product type Book
Published in Sep 2023
Publisher Packt
ISBN-13 9781805123705
Pages 472 pages
Edition 1st Edition
Author: Dr. Logan Song

Table of Contents

Preface
Part 1: Learning about the Amazon Cloud
  Chapter 1: Amazon EC2 and Compute Services
  Chapter 2: Amazon Cloud Storage Services
  Chapter 3: Amazon Networking Services
  Chapter 4: Amazon Database Services
  Chapter 5: Amazon Data Analytics Services
  Chapter 6: Amazon Machine Learning Services
  Chapter 7: Amazon Cloud Security Services
Part 2: Comprehending GCP Cloud Services
  Chapter 8: Google Cloud Foundation Services
  Chapter 9: Google Cloud’s Database and Big Data Services
  Chapter 10: Google Cloud AI Services
  Chapter 11: Google Cloud Security Services
Part 3: Mastering Azure Cloud Services
  Chapter 12: Microsoft Azure Cloud Foundation Services
  Chapter 13: Azure Cloud Database and Big Data Services
  Chapter 14: Azure Cloud AI Services
  Chapter 15: Azure Cloud Security Services
Part 4: Developing a Successful Cloud Career
  Chapter 16: Achieving Cloud Certifications
  Chapter 17: Building a Successful Cloud Computing Career
Index
Other Books You May Enjoy

Azure Cloud Database and Big Data Services

In the first part of the book, we discussed the AWS database and big data services. In the second part, we covered their Google counterparts. Now, in the third part, having discussed Microsoft Azure’s foundational cloud services in the last chapter, we focus on the Azure database and big data services, which resemble the AWS and Google data services but have their own distinct features.

Like Amazon and Google, Microsoft provides many solid data storage and analytics services in its Azure cloud platform. In this chapter, we will cover the following topics:

  • Azure cloud data storage: explores the basic concepts of Azure storage accounts and Azure Data Lake Storage
  • Azure database services: examines Azure SQL Database, NoSQL solutions such as Azure Table Storage and Cosmos DB, and data warehousing with Azure Synapse Analytics
  • Azure...

Azure cloud data storage

When we launched Azure Cloud Shell in the previous chapter, you may have noticed that we first created an Azure storage account. An Azure storage account provides a unique namespace for your Azure cloud data, accessible from anywhere in the world over HTTP or HTTPS. A single storage account brings together four kinds of Azure storage data objects (blobs, files, queues, and tables) in an all-in-one fashion:

  • Azure Blob Storage is object storage, like Amazon S3 or Google Cloud Storage (GCS)
  • Azure Files provides managed file shares in the cloud that can be shared with both Azure VMs and on-premises VMs
  • Azure Queue Storage is a messaging service similar to AWS Simple Queue Service (SQS), for storing large numbers of messages
  • Azure table storage stores structured NoSQL cloud data, with a key/attribute store and a schema-less design
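
Each of the four services is reachable at its own default endpoint under the storage account name. The following is a minimal sketch of that naming pattern, assuming a hypothetical account called mystorageacct and the default public-cloud domain; custom domains and sovereign clouds use different hostnames, and the helper function is illustrative rather than part of any Azure SDK.

```python
# Illustrative helper, not part of any Azure SDK.
# "mystorageacct" is a made-up account name used only for this example.
SERVICES = ("blob", "file", "queue", "table")


def storage_endpoints(account_name: str) -> dict[str, str]:
    """Return the default public HTTPS endpoint for each storage service."""
    return {
        service: f"https://{account_name}.{service}.core.windows.net"
        for service in SERVICES
    }


if __name__ == "__main__":
    # Print one endpoint per service in the storage account.
    for service, url in storage_endpoints("mystorageacct").items():
        print(f"{service:<6} {url}")
```

Because the account name is part of every hostname, it has to be unique across Azure, which is why the account acts as the namespace for all four services.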

An Azure storage account provides all these storage and data services...

Azure cloud databases

While an Azure data lake stores raw data, an Azure database usually stores formatted data. Azure offers cloud database services categorized into relational databases and NoSQL databases.

Azure cloud relational databases

Like AWS and GCP, Azure offers three options for cloud relational database deployment and usage:

  • Azure SQL virtual machines: SQL Server built on Azure virtual machines. This is an Infrastructure-as-a-Service (IaaS) cloud service and thus you will have control of the database edition, version, and size. You will also be fully responsible for managing the virtual machine, including patching and other configuration management. More information is available at https://azure.microsoft.com/en-us/products/virtual-machines/sql-server.
  • Azure SQL managed instances: This is the best option for migrating on-premises SQL databases to the cloud. It serves the purpose of migrating many apps from on-premises to a fully managed Platform-as-a-Service...

Azure cloud big data services

Like Amazon and Google, Microsoft provides a full stack of big data cloud services, including the big data ETL service, Azure Data Factory (ADF); the big data processing service, Azure HDInsight; and the big data analytics service, Azure Databricks.

Azure Data Factory (ADF)

ADF is a cloud-based data integration service for creating data-driven workflows that automatically move and transform data. The core building block in ADF is the pipeline: a logical grouping of activities that together perform a data-driven task, such as the following:

  • Data moving – This takes an ingestion source, pulls it into the Azure cloud, and puts it into a data lake
  • Data transformation – This connects the data lake to Databricks, runs a stored procedure, and transforms data to produce a new dataset for further analytics
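
The idea of a pipeline as a logical grouping of dependent activities can be sketched in plain Python. This is a toy model only, not the ADF SDK or the exact ADF pipeline schema: the pipeline name, the activity names, and the flat dependsOn lists are hypothetical simplifications chosen for illustration.

```python
from graphlib import TopologicalSorter

# Simplified, hand-written pipeline description. All names are hypothetical,
# and a real ADF pipeline definition carries many more properties.
pipeline = {
    "name": "IngestAndTransform",
    "activities": [
        # Data moving: pull the source data into the data lake.
        {"name": "CopyToDataLake", "type": "Copy", "dependsOn": []},
        # Data transformation: runs only after the copy step has finished.
        {
            "name": "TransformWithDatabricks",
            "type": "DatabricksNotebook",
            "dependsOn": ["CopyToDataLake"],
        },
    ],
}


def execution_order(pipeline: dict) -> list[str]:
    """Order activities so that each one runs after those it depends on."""
    graph = {a["name"]: set(a["dependsOn"]) for a in pipeline["activities"]}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for step, name in enumerate(execution_order(pipeline), start=1):
        print(step, name)
```

Modeling the dependencies explicitly, rather than relying on list order, mirrors how a pipeline groups activities: the transformation step is tied to the completion of the data-moving step, not to its position in the definition.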

Essentially, ADF is a data ETL service integrating hybrid data at an enterprise level. More details about ADF are available at https://azure.microsoft.com/en-us/products/data-factory...

Summary

In this chapter, we learned about the Azure cloud database and big data services. We explored Azure Data Lake Storage, Azure cloud databases and Azure Synapse Analytics, Azure data ETL tools such as ADF, data processing tools such as HDInsight, and Azure data analytics tools such as Databricks. You should now have a working knowledge of data ingestion, storage, processing, and visualization in the Azure cloud.

In the next chapter, we will examine the machine learning services in Azure’s cloud.

Practice questions

Questions 1-3 are based on the following.

The data team for company ABC is building an Azure cloud data analytics platform, with the following objectives:

  • The team has two data scientists who are familiar with R, Scala, and Python, and two data engineers who are good at Python. Each team member needs a cluster.
  • The team needs to run notebooks that use Python, Scala, and SQL for their job workloads.
  • The team needs to optimize their work performance.

1. What platform fits a data scientist?

A. A High Concurrency Databricks cluster

B. A standard Databricks cluster

C. An ADF pipeline

D. The Azure Synapse platform

2. What platform fits a data engineer?

A. A High Concurrency Databricks cluster

B. A standard Databricks cluster

C. An ADF pipeline

D. The Azure Synapse platform

3. What platform fits the job workload?

A. A High Concurrency Databricks cluster

B. A standard Databricks cluster

C. An ADF pipeline

...

Answers to the practice questions

1. B

2. A

3. B

4. A

5. A

6. A

7. C

8. C
