Chapter 10. Designing for High Availability and Scalability
This chapter covers advanced concepts such as high availability and scalability, along with the requirements that Kubernetes operators need to meet in order to run Kubernetes in production. We'll take a look at the Platform as a Service (PaaS) offerings from Google and Azure, and we'll apply the familiar principles of running production workloads in a cloud environment.
We'll cover the following topics in this chapter:
- Introduction to high availability
- High availability best practices
- Multi-region setups
- Security best practices
- Setting up high availability on the hosted Kubernetes PaaS
- Cluster life cycle events
- How to use admission controllers
- Getting involved with the workloads API
- What is a custom resource definition (CRD)?
You'll need to have access to your Google Cloud Platform account in order to explore some of these options. You can also use a local Minikube setup to test some of these features, but many of the principles and approaches we'll discuss here require servers in the cloud.
Introduction to high availability
In order to understand our goals for this chapter, we first need to talk about the more general terms of high availability and scalability. Let's look at each individually to understand how the pieces work together.
We'll discuss the required terminology and begin to understand the building blocks that we'll use to conceptualize, construct, and run a Kubernetes cluster in the cloud.
Let's dig into high availability, uptime, and downtime.
How do we measure availability?
High availability (HA) is the idea that your application is available, meaning reachable, to your end users. In order to create highly available applications, your application code and the frontend that users interact with need to be available the majority of the time. The term comes from the field of system design, which defines the architecture, interfaces, data, and modules of a system in order to satisfy a given set of requirements. There are many examples of system design in disciplines from...
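Availability is commonly quoted in "nines": 99.9% uptime, 99.99%, and so on. Each extra nine cuts the allowed downtime by a factor of ten. A short calculation makes this concrete (a simple uptime-percentage model, ignoring scheduled maintenance windows):

```python
# Convert an availability percentage into a yearly downtime budget.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000

def downtime_per_year(availability_pct: float) -> float:
    """Maximum allowed downtime, in seconds per year,
    for a given availability percentage."""
    return SECONDS_PER_YEAR * (1 - availability_pct / 100)

for nines in (99.0, 99.9, 99.99, 99.999):
    minutes = downtime_per_year(nines) / 60
    print(f"{nines}% availability -> {minutes:.1f} minutes of downtime per year")
```

Five nines (99.999%) leaves roughly five minutes of downtime per year, which is why it usually requires redundancy at every layer rather than just reliable components.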
In order to build HA Kubernetes systems, it's important to note that availability is as often a function of people and process as it is of technology failures. While hardware and software fail often, humans and their involvement in the process are a very predictable drag on the availability of all systems.
It's important to note that this book won't get into how to design a microservices architecture for failure, which is a huge part of coping with some (or all) system failures in a cluster scheduling and networking system such as Kubernetes.
There's another concept that's important to consider: graceful degradation.
Graceful degradation is the idea that you build functionality in layers and modules, so that even with the catastrophic failure of some pieces of the system, you're still able to provide some level of availability. There is a corresponding term, progressive enhancement, that's followed in web design, but we won't be using that pattern here. Graceful...
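To illustrate the layered idea, here is a minimal sketch of graceful degradation in application code. All of the names here (fetch_live, fetch_cached, the recommendation scenario) are hypothetical, not from the chapter: the point is only the shape of the fallback layers.

```python
# Graceful degradation sketch: try the primary service, fall back to a
# cache, and finally fall back to static content rather than failing.
STATIC_DEFAULT = ["top-seller-1", "top-seller-2"]  # minimal, generic content

def fetch_live(user_id):
    # Stand-in for a call to a live recommendation service; here it
    # always fails, to simulate an outage of that dependency.
    raise ConnectionError("recommendation service unreachable")

def fetch_cached(user_id, cache):
    # Stand-in for a cache lookup; raises if the user has no cached entry.
    if user_id not in cache:
        raise KeyError(user_id)
    return cache[user_id]

def recommendations(user_id, cache):
    """Degrade in layers instead of failing outright."""
    try:
        return fetch_live(user_id)           # full functionality
    except ConnectionError:
        pass
    try:
        return fetch_cached(user_id, cache)  # degraded: possibly stale data
    except KeyError:
        return STATIC_DEFAULT                # minimal: generic content
```

Even with the live service down, every user still gets a response; users with a cache entry get a slightly stale one, and everyone else gets the static default.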
There are a few more key items that we should cover so that you're armed with full knowledge about the items that can help you with creating highly available Kubernetes clusters. Let's discuss how you can use admission controllers, workloads, and custom resource definitions to extend your cluster.
Admission controllers are pieces of Kubernetes code that intercept calls to the Kubernetes API server after a request has been authenticated and authorized. There are standard admission controllers included with the core Kubernetes system, and you can also write your own. Two controllers are more important than the rest:
- The MutatingAdmissionWebhook is responsible for calling webhooks that mutate, in serial, a given request. This controller only runs during the mutating phase of cluster operation. You can use a controller like this to build business logic into your cluster and customize admission logic with operations such as CREATE...
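The webhooks that the MutatingAdmissionWebhook calls are HTTP endpoints that you run, which receive an AdmissionReview object and reply with an AdmissionReview response, optionally carrying a base64-encoded JSONPatch. As an illustration only (the label key example.com/reviewed is made up, and a real webhook would additionally need a TLS-serving endpoint registered in a MutatingWebhookConfiguration), the core response-building logic might look like this:

```python
import base64
import json

def mutate(review: dict) -> dict:
    """Build an AdmissionReview response that adds a hypothetical
    'example.com/reviewed: "true"' label to the incoming object."""
    uid = review["request"]["uid"]
    labels = review["request"]["object"]["metadata"].get("labels")
    if labels is None:
        # No labels map yet: create the whole map in one patch operation.
        patch = [{"op": "add", "path": "/metadata/labels",
                  "value": {"example.com/reviewed": "true"}}]
    else:
        # Labels map exists: add a single key. The '/' in the label key
        # must be escaped as '~1' per the JSON Pointer spec (RFC 6901).
        patch = [{"op": "add",
                  "path": "/metadata/labels/example.com~1reviewed",
                  "value": "true"}]
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": uid,            # must echo the request's uid
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": base64.b64encode(json.dumps(patch).encode()).decode(),
        },
    }
```

The API server applies the decoded patch to the object before it is persisted, which is what makes this controller suitable for enforcing conventions such as default labels or sidecar injection.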
Summary
In this chapter, we looked into the core components of HA. We explored the ideas of availability, uptime, and fragility, and we took those concepts and explored how we could achieve five nines of uptime.
Additionally, we explored the key components of a highly available cluster, the etcd and control plane nodes, and left room to imagine the other ways that we'd build HA into our clusters using hosted PaaS offerings from the major cloud providers.
Later, we looked at the cluster life cycle and dug into advanced capabilities with a number of key features of the Kubernetes system: admission controllers, the workloads API, and CRDs.
Lastly, we created a CRD on a GKE cluster within GCP in order to understand how to begin building these custom pieces of software.
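The specific CRD from that exercise isn't reproduced here, but the general shape of a minimal CustomResourceDefinition manifest is worth recalling. The sketch below builds one as a Python dict, using the hypothetical crontabs.example.com group and CronTab kind (the classic example from the upstream Kubernetes documentation, not the resource used on the GKE cluster); in practice you would serialize it to YAML and apply it with kubectl, or submit it via a Kubernetes client library.

```python
# Sketch of a minimal apiextensions.k8s.io/v1 CRD manifest. The group,
# kind, and field names under 'spec.properties' are illustrative only.
crd = {
    "apiVersion": "apiextensions.k8s.io/v1",
    "kind": "CustomResourceDefinition",
    # The metadata name must be <plural>.<group>.
    "metadata": {"name": "crontabs.example.com"},
    "spec": {
        "group": "example.com",
        "scope": "Namespaced",
        "names": {"plural": "crontabs", "singular": "crontab",
                  "kind": "CronTab"},
        "versions": [{
            "name": "v1",
            "served": True,    # this version is served by the API
            "storage": True,   # exactly one version is the storage version
            "schema": {"openAPIV3Schema": {
                "type": "object",
                "properties": {"spec": {
                    "type": "object",
                    "properties": {
                        "cronSpec": {"type": "string"},
                        "replicas": {"type": "integer"},
                    },
                }},
            }},
        }],
    },
}
```

Once a manifest like this is applied, the API server starts serving the new resource type, and clients can create CronTab objects just as they would built-in resources.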
If you'd like to read more about high availability and mastering Kubernetes, check out the following Packt resources: