You're reading from Microsoft Azure Fundamentals Certification and Beyond - Second Edition

Product type: Book

Published in: Jan 2024

Publisher: Packt

ISBN-13: 9781837630592

Edition: 2nd Edition

Concepts

Cloud Computing

Author (1)

Steve Miles

Cloud Computing Operations Model

Cloud computing is “elastic,” “scalable,” “agile,” “fault-tolerant,” and highly “available,” and it helps with “disaster recovery.” These characteristics add value to an organization’s operational model.

These inherent and defining characteristics allow a workload deployed into a cloud computing environment to become highly available and to scale up and down (vertically) and in and out (horizontally) so that capacity closely tracks demand. This elasticity provides the agility for a highly effective operations and economics model that flexes with the changing demands of a business.

Switching to a consumption-based model, in which you pay for resources as you use them, lets you optimize running hours and right-size resources in line with demand and changing requirements. Spending can be monitored without the over-commitment of a traditional computing cost model.

Figure 2.3 outlines the computing resources demand model and shows the implications of actual demand against implemented resources based on predicted demand:

Figure 2.3: A graph, with a Y-axis of Created resources and an X-axis of Time, plotting implemented resources against demand

Figure 2.3 – Cloud computing resource demand model

You can also see the traditional computing mindset from the last section in Figure 2.3. This mindset means over-provisioning resources to meet predicted demand, leaving many resources underutilized. When actual demand exceeds predicted demand, no resources are available, as there is no burst capacity or scale to meet it. To compound things, by the time extra resources have been implemented, the demand has dropped off and they are no longer needed.

With the cloud computing mindset, resource utilization can be tracked and resources right-sized to demand, which avoids over-provisioning and paying for more resources than are needed.
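The contrast between the two mindsets can be made concrete with a small calculation. The following is a minimal sketch, not a model of any real workload: the demand figures, the fixed capacity of 80 units, and the function name provisioning_gap are all hypothetical, and perfectly right-sized capacity is an idealization.

```python
def provisioning_gap(demand, capacity):
    """Return (idle, unmet) resource units summed over all time periods."""
    idle = sum(max(c - d, 0) for d, c in zip(demand, capacity))
    unmet = sum(max(d - c, 0) for d, c in zip(demand, capacity))
    return idle, unmet

# Hypothetical demand per period (resource units), with a spike in period 4.
demand = [40, 45, 50, 120, 60, 40]

# Traditional mindset: capacity fixed up front from predicted demand.
fixed = [80] * len(demand)

# Cloud mindset (idealized): capacity right-sized to demand in every period.
right_sized = list(demand)

print(provisioning_gap(demand, fixed))        # (165, 40)
print(provisioning_gap(demand, right_sized))  # (0, 0)
```

With fixed capacity, 165 units sit idle across the periods while 40 units of demand go unserved during the spike; capacity that follows demand shows neither.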

With this knowledge of the cloud computing operations model, you will look more closely at the characteristics of cloud computing that deliver benefits and value over the traditional computing model.

Operational Benefits of Cloud Computing

This section will look at the operational benefits cloud computing can add to an organization compared to those provided in a traditional computing model. Cloud computing platforms primarily provide the following operational benefits over traditional computing models:

Scalability
Elasticity
Agility
High availability (and geo-distribution)
Disaster recovery
Cost model

These operational benefits may be inherent, built-in platform functions provided as part of the service, as is typically the case with PaaS, Function as a Service (FaaS), and SaaS. In some cases, they just need to be enabled if they are not automatically included as part of the service.

These operational benefits could also be something that needs to be designed into the solution as an individual set of resources implemented to enable these characteristics.

For example, IaaS virtual machines will not provide scale, elasticity, high availability, or disaster recovery unless these are designed into the solution and resources are implemented to provide each of these characteristics.

The key takeaway is that cloud platform providers will generally provide these functions and characteristics. You may layer on additional functionality as your needs dictate.

Of course, not everything is perfect with the cloud computing model. Here are some challenges that can be overcome but must be considered and provided for:

Network dependency (reliability, stability, quality, and performance)
Confidentiality, Integrity, Availability (CIA) of users, apps, and data
Access control and operational governance
Cost control

With the operational benefits and challenges considered in this section, it is time to look at the benefits of cloud computing in more detail.

What Is Scalability?

Scalability refers to the ability to increase resources in reaction to demand, usually in an automated way triggered by a metric, such as a scheduled time or a resource threshold being reached. The following two concepts are related to the scalability of computing resources:

Scaling up (vertical scaling): This means capacity is increased within the resource, such as increasing the processor or memory by resizing a virtual machine; the opposite is “scaling down,” where resource capacity is decreased.
Scaling out (horizontal scaling): This means adding resource instances, such as other virtual machines or compute nodes/scale units; the opposite is “scaling in,” where resource instances are removed.

Scalability should not be confused with fault tolerance, which moves a workload automatically to another resource or system when it detects a failure or unhealthy state.
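To illustrate a threshold-triggered scaling rule, here is a minimal sketch of a horizontal scaling decision. The function name desired_instances, the CPU thresholds of 70% and 30%, and the instance limits are hypothetical values chosen for illustration; they are not defaults of any Azure service.

```python
def desired_instances(current, cpu_percent, scale_out_at=70, scale_in_at=30,
                      minimum=1, maximum=10):
    """Return the instance count after applying one horizontal scaling rule."""
    if cpu_percent > scale_out_at:
        target = current + 1   # scale out: add an instance
    elif cpu_percent < scale_in_at:
        target = current - 1   # scale in: remove an instance
    else:
        target = current       # within the band: no change
    # Keep the result inside the configured limits.
    return max(minimum, min(maximum, target))

print(desired_instances(current=2, cpu_percent=85))  # 3
print(desired_instances(current=2, cpu_percent=20))  # 1
print(desired_instances(current=1, cpu_percent=20))  # 1 (already at minimum)
```

Note that the trigger here is load, not health; a rule that reacts to a failed instance would be fault tolerance rather than scalability.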

What Is Elasticity?

Elasticity refers to the ability to shape the resources needed automatically, to burst and scale to meet any peak in demand, and to return to a normal operating baseline.
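As a rough illustration of bursting and then returning to a baseline, the sketch below sizes a pool of instances to a changing load. The capacity of 100 requests per second per instance, the baseline of 2 instances, and the load figures are assumptions made for the example.

```python
import math

def instances_for(demand, per_instance=100, baseline=2):
    """Instances needed to serve the demand, never dropping below the baseline."""
    return max(baseline, math.ceil(demand / per_instance))

# Hypothetical load in requests per second, with a peak in the middle.
load = [150, 180, 650, 900, 300, 120]

print([instances_for(d) for d in load])  # [2, 2, 7, 9, 3, 2]
```

The instance count rises to meet the peak and falls back to the baseline once the peak has passed.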

What Is Agility?

Agility means deploying and configuring resources effectively and efficiently in a short space of time to meet any change in requirements or operational needs.

What Is High Availability?

High availability and geo-distribution mean deploying resources to operate within the required or mandated Service-Level Agreement (SLA) for those resources. An SLA sets out an expected level of service that a customer can expect from their service provider. This agreement will set out terms such as availability metrics, service availability, responsibilities, claims, and credit processes, as well as the vocabulary and terminology that will be used to express these aspects of the agreement.

The SLA is a guaranteed measure of uptime, which is the amount of time services are online, available, and operational. The following are key concepts of availability in the context of computing and systems:

Availability is the percentage of time a resource is available to service requests.
Service availability is expressed as the uptime percentage over time, for example, 99.9%.
Availability depends on resilient systems, meaning that a system can continue to function after recovering from failures.
Increasing availability often results in an increase in costs due to the complexity of the solutions required to deliver the level of availability.
Failover is another critical factor in availability. This means one system takes over from another when a resource fails and becomes unavailable and is part of an availability and disaster recovery strategy.
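The definition of availability as a percentage of time can be expressed directly. This is a minimal sketch; the function name availability_percent and the figure of 60 minutes of downtime are hypothetical.

```python
def availability_percent(total_minutes, downtime_minutes):
    """Percentage of the period during which the resource could service requests."""
    uptime_minutes = total_minutes - downtime_minutes
    return 100 * uptime_minutes / total_minutes

# Hypothetical example: 60 minutes of downtime in a 30-day month.
minutes_in_month = 30 * 24 * 60  # 43,200 minutes
print(f"{availability_percent(minutes_in_month, 60):.3f}%")  # prints 99.861%
```

An hour of downtime in a 30-day month therefore falls just short of three nines (99.9%).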

Microsoft defines an SLA as follows:

Microsoft’s commitments to uptime and connectivity, meaning the amount of time the services are online, available, and operational.

Microsoft provides each service with an individual SLA that details what is covered by the agreement and any exceptions. For any service that does not meet the guarantees, a percentage of the monthly fees is eligible to be credited.

While you see lots of references to availability and uptime when looking at the SLA for a service, the customer and consumer of the service will want to know what that means in the real world and what impact any breach may have on them. Therefore, the metric that often matters most is downtime: for a given SLA, how long is the service permitted to be down (that is, not available from the service provider)? You should scrutinize any SLA to determine whether that level of downtime is acceptable.

The service availability depends on the number of nines in the SLA (three nines is 99.9% and five nines is 99.999%). Microsoft SLAs are expressed on a monthly basis, so 99.9% allows roughly 43 minutes of service downtime per month (43.2 minutes in a 30-day month; the exact figure depends on the month length assumed).

Table 2.1 illustrates examples of SLA commitments and downtime permitted per month as part of an SLA:

SLA of a Service	Permitted Downtime Per Month
99.9%	43m 28s
99.95%	21m 44s
99.99%	4m 21s
99.999%	26s

Table 2.1 – The SLA for a service indicating the acceptable level of downtime per month
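The permitted downtime follows directly from the SLA percentage and the length of the month. The sketch below assumes a 30-day month, so its results differ slightly from the figures in Table 2.1, which appear to assume a different month length; the function name permitted_downtime_minutes is hypothetical.

```python
def permitted_downtime_minutes(sla_percent, days_in_month=30):
    """Minutes of downtime an SLA permits in a month of the given length."""
    minutes_in_month = days_in_month * 24 * 60
    return minutes_in_month * (100 - sla_percent) / 100

# Print the allowance for the SLA levels listed in Table 2.1.
for sla in (99.9, 99.95, 99.99, 99.999):
    minutes = permitted_downtime_minutes(sla)
    print(f"{sla}% -> {minutes:.2f} minutes per 30-day month")
```

Each additional nine cuts the permitted downtime by a factor of ten, which is why higher availability targets become costly so quickly.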

Observe that 99.9% is typically the minimum SLA that Microsoft provides, and 99.999% is the maximum. Microsoft does not provide a 100% SLA.

You should also be aware of the concept of a composite SLA. When a solution depends on several services together (such as virtual machines and the underlying storage, networking components, and so on), the overall SLA is lower than the SLA of any individual service, because the individual availabilities multiply. Each service that you add increases the probability of failure and increases complexity.
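A composite SLA for services that all need to be available at the same time can be estimated by multiplying their individual availabilities. The following is a minimal sketch under that assumption; the two SLA values are illustrative and do not refer to specific Azure services.

```python
from math import prod

def composite_sla(*sla_percents):
    """Composite availability of services that must all be up together."""
    return 100 * prod(p / 100 for p in sla_percents)

# Hypothetical solution that depends on two services.
combined = composite_sla(99.95, 99.99)
print(f"{combined:.4f}%")  # prints 99.9400%
```

The combined figure of about 99.94% is lower than either individual SLA, which is the effect described above.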

The following actions will “positively” impact and “increase” your SLA:

Using services that provide an SLA (or improve the service SLA), such as Entra ID Premium editions and Premium SSD managed disks
Adding redundant resources, such as deploying resources to additional regions
Adding availability solutions, such as using availability sets and availability zones
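To see why redundancy raises availability, consider a simplified model in which the copies fail independently and failover between them is immediate. Both are assumptions made for this sketch rather than guarantees of any service, and the function name redundant_availability is hypothetical.

```python
def redundant_availability(sla_percent, copies):
    """Availability when at least one of several independent copies must be up."""
    failure_probability = 1 - sla_percent / 100
    # The solution is down only if every copy is down at the same time.
    return 100 * (1 - failure_probability ** copies)

print(f"{redundant_availability(99.9, 1):.4f}%")  # prints 99.9000%
print(f"{redundant_availability(99.9, 2):.4f}%")  # prints 99.9999%
```

In practice, failures are rarely fully independent, so the improvement is usually smaller than this model suggests.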

The following actions will “negatively” impact and “decrease” your SLA:

Adding multiple services due to the nature of composite SLAs
Choosing non-SLA-backed services or free services

The following actions will have no impact on your SLA:

Adding multiple tenancies
Adding multiple subscriptions
Adding multiple admin accounts

The Azure status page (https://packt.link/DdVgV) provides a global overview of the service health across all regions; this should be the first place you visit, should you suspect there is a wider issue affecting the availability of services globally. From the status page, you can click through to Azure Service Health in the Azure portal, which provides a personalized view of the availability of the services that are being used within your Azure subscriptions.

Service credits are paid through a claims process by a service provider when they do not meet the guarantees of the agreed service level; each service has its own defined SLA. You should evaluate all your services to ensure that, where required, you always have an SLA-backed service; there is often an operational impact from relying on “free services.”

If you suspect that your services have been affected and that Microsoft has not been able to meet its SLA, then it is your responsibility to take action and pursue credit; you must submit a claim to receive service credit. For most services, you must submit the claim in the month after the month in which the service was impacted. If your services are provided through the Microsoft Cloud Solution Provider (CSP) channel, your CSP partner will pursue this claim on your behalf and provide the service refunds accordingly.

Note

You can find more details about Microsoft SLAs for Azure services and the composite SLA at https://packt.link/X5G0B and https://packt.link/qzOnf.

What Is Disaster Recovery?

Disaster recovery is based upon a set of practices or measures to ensure that, when a system fails, it can be restored to operation by failing over to a replicated instance in another region.

A “disaster recovery strategy” will be determined by the required Recovery Time Objective (RTO) and Recovery Point Objective (RPO). Replication technologies allow for much shorter RTOs and RPOs than can be achieved with backups. The following are the crucial elements in creating comprehensive disaster recovery plans:

RTO: This refers to the maximum duration of acceptable downtime for the system.
RPO: This refers to the maximum amount of data loss, measured as a period of time, that is acceptable for the system.

This is represented in Figure 2.4:

Figure 2.4: The image depicts a Timeline graphic that depicts the concepts and contrasts between RPO and RTO

Figure 2.4 – RTO and RPO
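RTO and RPO can be checked against the timeline of an incident. The sketch below is illustrative only: the timestamps, the objectives of 30 minutes and 1 hour, and the function name meets_objectives are hypothetical.

```python
from datetime import datetime, timedelta

def meets_objectives(last_recovery_point, failure_time, restored_time, rpo, rto):
    """Return (rpo_met, rto_met) for a single incident."""
    data_loss_window = failure_time - last_recovery_point  # data written in this window is lost
    downtime = restored_time - failure_time                # time the system was unavailable
    return data_loss_window <= rpo, downtime <= rto

failure = datetime(2024, 1, 15, 12, 0)
result = meets_objectives(
    last_recovery_point=failure - timedelta(minutes=15),
    failure_time=failure,
    restored_time=failure + timedelta(minutes=90),
    rpo=timedelta(minutes=30),
    rto=timedelta(hours=1),
)
print(result)  # (True, False): RPO met, RTO missed
```

Here the last recovery point was recent enough to satisfy the RPO, but the 90-minute restoration exceeded the 1-hour RTO.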

Having grasped the operational benefits, read on to compare disaster recovery to high availability and backup concepts.

Comparing Disaster Recovery, High Availability, and Backup

High availability and disaster recovery can be classified as system protection, whereas backup can be classified as data protection. The following concepts help in building robust and resilient systems in cloud computing environments like Microsoft Azure:

High availability: When systems fail and are not available, you can run a second instance in the same Azure region.
Disaster recovery: When systems fail and are not available, you can run a second instance in another Azure region.
Backup: When data is corrupted, deleted, lost, or irretrievable (perhaps due to ransomware), you can restore it from another copy.

Figure 2.5 outlines the three preceding points of high availability, disaster recovery, and backup:

Figure 2.5: The image shows a Multi-part shape that compares and contrasts the concepts and solutions for protecting systems and protecting data

Figure 2.5 – Comparing backup, high availability, and disaster recovery

High availability, disaster recovery, and backup should not be an “either-or” decision in a strategy for business continuity; any strategy should include “all three” as they serve different purposes.

Fault tolerance is a means of providing high availability in systems. It is similar to autoscaling in that workloads can be moved from one system to another. The trigger for fault tolerance, however, is a health check detecting a failed system, as opposed to a system under load from demand.
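The health-check trigger can be sketched as a simple routing decision. This is a conceptual illustration, not how any particular Azure service implements failover; the function name route_request and the instance names are hypothetical.

```python
def route_request(instances):
    """Return the name of the first healthy instance, or None if all have failed.

    `instances` is an ordered list of (name, is_healthy) pairs, with the
    preferred instance first.
    """
    for name, is_healthy in instances:
        if is_healthy:
            return name
    return None

# The primary has failed its health check, so the workload moves to the secondary.
print(route_request([("primary", False), ("secondary", True)]))  # secondary
print(route_request([("primary", True), ("secondary", True)]))   # primary
```

The decision depends only on health, not on how busy each instance is, which is what distinguishes it from a scaling rule.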

Challenges of Implementing Business Continuity

Cost, complexity, and compliance are the biggest challenges for business continuity. These challenges result in systems that are often not covered by disaster recovery or protected by backups, which challenges your ability to comply with any regulatory or internal mandatory policy.

While you may be familiar with the traditional causes of a disaster or business disruption, a threat to business operations can also come from a “global pandemic.” Mitigation and planning for a pandemic have not often been included in a disaster recovery or business continuity strategy.

While not a disaster or outage, a “pandemic” certainly causes a significant business disruption that almost nobody can foresee. It is reasonable to say that those who had already adopted some form of cloud services and a remote working strategy before the COVID-19 pandemic were probably better prepared than others.

Figure 2.6 shows that when you adopt a cloud computing model, your cost model changes; you may have reduced complexity, and your compliance levels may increase:

Figure 2.6: Three-part shape with three icons that depict the three challenges to implementing business continuity as the cloud is adopted. They are Cost, Complexity, and Compliance

Figure 2.6 – Challenges to implementing business continuity

Adopting a cloud strategy utilizing Microsoft Azure can address many of these challenges. The challenges are often centered around cost, so the changing cost model that the cloud provides can be both the main benefit and the main driver.

An additional benefit is that there is no need to purchase and maintain the resources required for a secondary site. With Microsoft Azure as the secondary site, only what is used is paid for in a consumption-based model.

In this section, you have learned about the cloud computing operations model, including the demand model and operational benefits, and you have compared disaster recovery, high availability, and backup. The following section will cover the economics of cloud computing, the consumption model, and the cost-expenditure model.


Author (1)

Steve Miles

Steve Miles works in a technology leadership role for the cloud practice of a multi-billion turnover IT distributor based in the UK and Ireland. He is a Microsoft Azure MVP (Most Valuable Professional), MCT (Microsoft Certified Trainer) and Microsoft technologies author. Steve has more than 25 years of experience in hosted datacenter services, hybrid, and multi-cloud platforms. In his free time, Steve also can be found tinkering on cars.