You're reading from Multi-Cloud Strategy for Cloud Architects - Second Edition

Product type: Book
Published in: Apr 2023
Publisher: Packt
ISBN-13: 9781804616734
Edition: 2nd Edition
Tools: Azure, AWS
Concepts: Cloud Computing
Author: Jeroen Mulder

Conclusion: The Future of Multi-Cloud

This book has dealt with designing, implementing, and controlling a multi-cloud platform. We talked about five major clouds (Azure, AWS, GCP, Oracle Cloud, and Alibaba Cloud) and discussed strategies to get the best out of these clouds for our businesses. We discovered that building and managing in the cloud can be complex. Yet, the cloud will definitely grow. We will look at the future of the cloud in this final chapter.

The cloud will grow and multi-cloud will grow. The biggest challenge is how organizations can stay in control of their applications in a multi-cloud setting since the cloud can become very complex. Maybe Google has the answer: Site Reliability Engineering (SRE). SRE incorporates aspects of software engineering and applies them to infrastructure and operations problems. We will also use this chapter to introduce the concept of SRE and its main principles.

In this chapter, we’re going to cover the...

The growth and adoption of multi-cloud

In recent years, multi-cloud has emerged as a popular approach for businesses to manage their cloud infrastructure. Let’s recap the definition of multi-cloud one more time: we speak about multi-cloud when we use two or more cloud service providers to host and run applications and services. As we look toward the near future, we can expect to see continued developments in multi-cloud as businesses seek to take advantage of its benefits while managing its risks. We’ll talk about managing risks later in this chapter when we explore the concept of SRE.

One of the primary reasons that businesses are looking more into multi-cloud is the need for flexibility and agility. Multi-cloud allows businesses to avoid vendor lock-in and take advantage of the unique features and capabilities offered by different cloud providers. This allows them to optimize their applications and services for specific use cases, such as high-performance computing...

Understanding the concept of SRE

Originally, SRE was meant for mission-critical systems, but overall, it can be used to drive the DevOps process in a more efficient way. The goal is to enable developers to deploy infrastructure quickly and without errors. To achieve this, the deployment is fully automated. In this way of working, operators will not be swamped with requests to constantly onboard and manage more systems.

The original description of SRE as invented by Google is well over 400 pages long. In the Further reading section, a good book is listed to give you a real deep dive into SRE. This chapter is merely an introduction.

Key terms in SRE are the service-level indicator (SLI), the service-level objective (SLO), and the error budget, or the number of failures that lead to the unavailability of a system. The terms are explained in more detail in the next paragraphs.
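
To make the relationship between these terms concrete, here is a minimal sketch in Python. It assumes a time-based view of the error budget, which is a common formulation but not necessarily the one used in this book: the budget is the share of a period that the SLO allows a system to be unavailable. All function names are hypothetical and chosen for illustration only.

```python
def measured_sli(good_events: int, total_events: int) -> float:
    """SLI as the fraction of events (for example, requests) that were good."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def error_budget_minutes(slo: float, period_days: int = 30) -> float:
    """Downtime in minutes that an availability SLO tolerates over a period.

    Assumption: the error budget is (1 - SLO) multiplied by the period length.
    """
    if not 0.0 < slo <= 1.0:
        raise ValueError("slo must be in the range (0, 1]")
    total_minutes = period_days * 24 * 60
    return (1.0 - slo) * total_minutes


def budget_remaining_minutes(slo: float, period_days: int,
                             downtime_minutes: float) -> float:
    """Budget left after the downtime already observed in the period."""
    return error_budget_minutes(slo, period_days) - downtime_minutes
```

Under these assumptions, an SLO of 99.9% over a 30-day period leaves a budget of roughly 43.2 minutes of downtime. A negative result from `budget_remaining_minutes` would indicate that the budget has been exhausted.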

SLI and SLO differ from SLA, the service-level agreement. The SLA is an agreement between the supplier of a service and the end user...

Working with risk analysis in SRE

The basis of SRE is that reliability is something that you can design as part of the architecture of applications and systems, and something that you can measure. According to SRE, reliability is a measurable quality, and that quality can be influenced by design decisions. Engineers can take measures to decrease the detection, response, and repair time, and they can develop systems in such a way that changes can be executed safely without causing any downtime. Architects can design fault-tolerant systems; engineers can develop them.

The major issue is that it all comes at a cost, and whether systems really need to be fault-tolerant is a business decision, based on a business case. In Chapter 1, Introduction to Multi-Cloud, we already learned that business cases are driven by risks. Let’s go over risk management one more time.

The basic rule is that risk = probability x impact. Enterprises use risk...
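
The basic rule can be sketched in a few lines of Python. This is an illustrative example only: the `Risk` class, the `rank_risks` helper, and the scenario values are hypothetical and not taken from the book. Probability is assumed to be expressed as a fraction between 0 and 1, and impact as a cost in arbitrary units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    probability: float  # assumed likelihood per period, between 0 and 1
    impact: float       # assumed cost if the risk occurs, in arbitrary units

    @property
    def score(self) -> float:
        """risk = probability x impact"""
        return self.probability * self.impact


def rank_risks(risks: list[Risk]) -> list[Risk]:
    """Return the risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda risk: risk.score, reverse=True)


# Hypothetical scenarios, for illustration only.
risks = [
    Risk("region outage", probability=0.25, impact=1000.0),
    Risk("failed deployment", probability=0.5, impact=200.0),
]
for risk in rank_risks(risks):
    print(f"{risk.name}: {risk.score}")
```

Ranking risks by score in this way is one possible input for deciding which systems justify the cost of fault tolerance.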

Applying monitoring principles in SRE

Reliability is a measurable quality. To be able to measure the quality of the systems and their reliability, teams need real-time information on the status of these systems. As mentioned in the previous section, the time to detect (TTD) is a crucial driver in calculating risk and, subsequently, determining the SLO. Observability is therefore critical in SRE. However, SRE holds to the principle that monitoring needs to be as simple as possible. It uses the four golden signals:

Latency: The time that a system needs to return a response.
Traffic: The amount of traffic that is placed on the system.
Errors: The number of requests placed on a system that fail completely or partially.
Saturation: The utilization of the maximum load that a system can handle.
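
As a rough illustration of the four signals listed above, the following Python sketch derives them from a window of request samples. The `Request` record, the `golden_signals` function, and the choice of mean latency and an error ratio are assumptions made for this example; real monitoring stacks typically collect these values through their own agents and often prefer latency percentiles.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Request:
    latency_ms: float  # time the system needed to return a response
    failed: bool       # True if the request failed completely or partially


def golden_signals(requests: list[Request], window_seconds: float,
                   current_load: float, max_load: float) -> dict[str, float]:
    """Summarize latency, traffic, errors, and saturation for one window."""
    if not requests:
        raise ValueError("at least one request sample is required")
    if window_seconds <= 0 or max_load <= 0:
        raise ValueError("window_seconds and max_load must be positive")
    failed = sum(1 for request in requests if request.failed)
    return {
        "latency_ms": mean(request.latency_ms for request in requests),
        "traffic_rps": len(requests) / window_seconds,
        "errors": failed,
        "error_rate": failed / len(requests),
        "saturation": current_load / max_load,
    }
```

A call such as `golden_signals(samples, window_seconds=60, current_load=50, max_load=100)` would return one dictionary per window, which could then be compared against thresholds.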

Based on these signals, monitoring rules are defined. As the starting point in SRE is avoiding too much work for operations, known as toil, the monitoring rules follow the same philosophy...

Applying principles of SRE to multi-cloud—building and operating distributed systems

This book exists because a majority of enterprises are moving systems to, or developing them in, cloud environments. Today’s enterprises are in a constant transformation mode. This also means a big change in operations. To put it simply, operations have to keep up with the speed of change. Traditional operations can’t handle this. We need SRE in the future of multi-cloud. SRE teams create reliable systems in cloud environments.

There are a couple of important rules for SRE to enable this:

Automate everything: Automation leads to consistency, but automation also enables scaling. This requires a very well-thought-out architecture. Automation enables issues to be fixed faster since they only have to be fixed in one place: the code. Automation makes sure that the proper code is distributed over all systems involved. With large distributed systems spanning various cloud platforms, this...

Summary

Systems are getting more complex for many reasons. Customers constantly demand more functionality in applications, and at the same time, systems need to be available 24/7 without interruption. Cloud platforms are well suited to facilitate development at high speed, and thus we foresee cloud providers growing fast. In other words, the cloud will definitely grow. This comes with challenges for a lot of businesses. Throughout this book, we discovered that building and managing cloud environments can be complex.

The cloud will grow, and likely the complexity of the cloud will grow too. To ensure reliability, especially with systems that are truly multi-cloud and distributed across different platforms, we should adopt the principles of SRE. The most important principles of SRE have been discussed in this chapter. You should have an understanding of the methodology, based on determining the SLO, measuring the SLI, and working with error budgets.

We’ve learned that...

Questions

Risk analysis is important in SRE. What are the five risk strategies, often referred to as PRACT?
SRE mentions four golden signals in applying monitoring rules. Latency and traffic are two of them. Name the remaining two.
SRE has a specific term for manual work that is often repetitive and should be avoided. What’s that term?
Postmortem analysis is a key principle in SRE. True or false: Postmortem analysis is about finding the root cause and finding out who’s to blame for the error.

Jeroen Mulder is a certified enterprise and security architect, and he works with Fujitsu (Netherlands) as a Principal Business Consultant. Earlier, he was a Sr. Lead Architect, focusing on cloud and cloud native technology, at Fujitsu, and was later promoted to become the Head of Applications and Multi-Cloud Services. Jeroen is interested in cloud technology, architecture for cloud infrastructure, serverless and container technology, application development, and digital transformation using various DevOps methodologies and tools. He has previously authored “Multi-Cloud Architecture and Governance”, “Enterprise DevOps for Architects”, and “Transforming Healthcare with DevOps4Care”.