You're reading from The DevOps 2.5 Toolkit

Product type: Book
Published in: Nov 2019
Publisher: Packt
ISBN-13: 9781838647513
Edition: 1st Edition
Author: Viktor Farcic
Viktor Farcic is a senior consultant at CloudBees, a member of the Docker Captains group, and an author. He codes using a plethora of languages, starting with Pascal (yes, he is old), Basic (before it got the Visual prefix), ASP (before it got the .NET suffix), C, C++, Perl, Python, ASP.NET, Visual Basic, C#, JavaScript, Java, Scala, and so on. He never worked with Fortran. His current favorite is Go. Viktor's big passions are Microservices, Continuous Deployment, and Test-Driven Development (TDD). He often speaks at community gatherings and conferences. Viktor wrote Test-Driven Java Development by Packt Publishing, and The DevOps 2.0 Toolkit. His random thoughts and tutorials can be found on his blog, Technology Conversations.
Auto-scaling Nodes of a Kubernetes Cluster

May I say that I have not thoroughly enjoyed serving with humans? I find their illogic and foolish emotions a constant irritant.

- Spock

Usage of HorizontalPodAutoscaler (HPA) is one of the most critical aspects of making a resilient, fault-tolerant, and highly-available system. However, it is of no use if there are no nodes with available resources. When Kubernetes cannot schedule new Pods because there's not enough available memory or CPU, those Pods become unschedulable and enter the Pending state. If we do not increase the capacity of our cluster, Pending Pods might stay in that state indefinitely. To make things more complicated, Kubernetes might start removing other Pods to make room for those that are Pending. That, as you might have guessed, might lead to worse problems than the issue of our applications not having...
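Pending Pods are easy to spot with kubectl. The commands below are a quick sketch, assuming a running cluster; the Pod name is a placeholder, not a name from this chapter.

```shell
# List Pods the Scheduler could not place (Pending phase), across all
# Namespaces. These are the Pods Cluster Autoscaler reacts to.
kubectl get pods --all-namespaces \
    --field-selector=status.phase=Pending

# Inspect the events of a specific Pod to confirm it is unschedulable;
# look for a FailedScheduling event (e.g., "Insufficient memory").
kubectl describe pod <pod-name>
```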

Creating a cluster

We'll continue using definitions from the vfarcic/k8s-specs (https://github.com/vfarcic/k8s-specs) repository. To be on the safe side, we'll pull the latest version first.

All the commands from this chapter are available in the 02-ca.sh (https://gist.github.com/vfarcic/a6b2a5132aad6ca05b8ff5033c61a88f) Gist.
cd k8s-specs

git pull

Next, we need a cluster. Please use the Gists below as inspiration to create a new cluster, or to validate that the one you already have fulfills all the requirements.

A note to AKS users
At the time of this writing (October 2018), Cluster Autoscaler does not (always) work in Azure Kubernetes Service (AKS). Please jump to the Setting up Cluster Autoscaler in AKS section for more info and a link to instructions on how to set it up.
  • gke-scale.sh: GKE with 3 n1-standard-1 worker nodes, with tiller, and with the --enable-autoscaling...

Setting up Cluster Autoscaling

We might need to install Cluster Autoscaler before we start using it. I said that we might, instead of saying that we have to, because some Kubernetes flavors come with Cluster Autoscaler baked in, while others don't. We'll go through each of the "big three" managed Kubernetes offerings. You might choose to explore all three of them, or to jump to the one you prefer. As a learning experience, I believe that it is beneficial to experience running Kubernetes with all three providers. Nevertheless, that might not be your view, and you might prefer using only one. The choice is yours.

Setting up Cluster Autoscaler in GKE

This will be the shortest section ever written. There's...

Scaling up the cluster

The objective is to scale the nodes of our cluster to meet the demand of our Pods. We want not only to increase the number of worker nodes when we need additional capacity, but also to remove them when they are underused. For now, we'll focus on the former, and explore the latter afterward.

Let's start by taking a look at how many nodes we have in the cluster.

kubectl get nodes

The output, from GKE, is as follows.

NAME             STATUS ROLES  AGE   VERSION
gke-devops25-... Ready  <none> 5m27s v1.9.7-gke.6
gke-devops25-... Ready  <none> 5m28s v1.9.7-gke.6
gke-devops25-... Ready  <none> 5m24s v1.9.7-gke.6

In your case, the number of nodes might differ. That's not important. What matters is to remember how many you have right now since that number will change soon.

Let's take a look at the definition of the go-demo...

The rules governing nodes scale-up

Cluster Autoscaler monitors Pods through a watch on Kube API. It checks every 10 seconds whether there are any unschedulable Pods (configurable through the --scan-interval flag). In that context, a Pod is unschedulable when the Kubernetes Scheduler is unable to find a node that can accommodate it. For example, a Pod can request more memory than what is available on any of the worker nodes.

Cluster Autoscaler assumes that the cluster is running on top of some kind of node groups. As an example, in the case of AWS, those groups are Auto Scaling groups (ASGs). When there is a need for additional nodes, Cluster Autoscaler creates a new node by increasing the size of a node group.
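One way to peek at how Cluster Autoscaler sees those node groups is its status ConfigMap. The ConfigMap name below assumes a default Cluster Autoscaler deployment in the kube-system Namespace; it is not taken from this chapter's setup.

```shell
# Cluster Autoscaler records its view of each node group (registered,
# ready, and target node counts) and recent scale-up/scale-down
# activity in this ConfigMap.
kubectl -n kube-system get configmap \
    cluster-autoscaler-status \
    -o yaml
```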

Cluster Autoscaler assumes that requested nodes will appear within 15 minutes (configurable through the --max-node-provision-time flag). If that period expires and a new...

Scaling down the cluster

Scaling up the cluster to meet the demand is essential since it allows us to host all the replicas we need to fulfill (some of) our SLAs. When the demand drops and our nodes become underutilized, we should scale down. That is not essential, given that our users will not experience problems caused by having too much hardware in our cluster. Nevertheless, we shouldn't keep underutilized nodes if we are to reduce expenses. Unused nodes result in wasted money. That is true in all situations, and especially when running in the Cloud and paying only for the resources we use. Even on-prem, where we have already purchased the hardware, it is essential to scale down and release resources so that they can be used by other clusters.

We'll simulate a decrease in demand by applying a new definition that will redefine the HPA's thresholds to 2 (minimum) and 5 (maximum) replicas.

kubectl...
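As a hedged illustration of such a change (the HPA and Deployment names, Namespace, and metric target below are assumptions for the sketch, not taken from the chapter's actual definition), an HPA with those thresholds could be applied like this:

```shell
# Apply an HPA limited to between 2 and 5 replicas. The autoscaling/v2beta1
# API matches the Kubernetes versions (1.9.x) shown earlier in the chapter.
kubectl apply -f - <<EOF
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: api
  namespace: go-demo-5
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 80
EOF
```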

The rules governing nodes scale-down

Cluster Autoscaler iterates every 10 seconds (configurable through the --scan-interval flag). If the conditions for scaling up are not met, it checks whether there are unneeded nodes.

It will consider a node eligible for removal when all of the following conditions are met.

  • The sum of CPU and memory requests of all Pods running on a node is less than 50% of the node's allocatable resources (configurable through the --scale-down-utilization-threshold flag).
  • All Pods running on the node can be moved to other nodes. The exceptions are Pods that run on every node, like those created through DaemonSets.
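To see the figures the first condition compares, we can inspect a node's allocatable capacity and the resource requests already scheduled on it (a sketch assuming a running cluster):

```shell
# The "Allocated resources" section of a node's description shows the
# sum of CPU and memory requests as a percentage of the node's
# allocatable capacity. That percentage is what Cluster Autoscaler
# measures against the 50% scale-down threshold.
kubectl describe nodes | grep -A 7 "Allocated resources"
```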

However, a Pod might not be eligible for rescheduling to a different node when one of the following conditions is met.

  • A Pod with affinity or anti-affinity rules that tie it to a specific node.
  • A Pod that uses local storage.
  • A Pod created...

Can we scale up too much or de-scale to zero nodes?

If we let Cluster Autoscaler do its "magic" without defining any thresholds, our cluster or our wallet might be at risk.

We might, for example, misconfigure HPA and end up scaling Deployments or StatefulSets to a huge number of replicas. Cluster Autoscaler would then add too many nodes to the cluster, and we could end up paying for hundreds of nodes even though we need far fewer. Luckily, AWS, Azure, and GCP limit how many nodes we can have, so we cannot scale to infinity. Nevertheless, we should not allow Cluster Autoscaler to go over some limits.

Similarly, there is a danger that Cluster Autoscaler will scale down to too few nodes. Having zero nodes is almost impossible since that would mean that we have no Pods in the cluster. Still, we should maintain a healthy minimum of nodes, even if that means...
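On GKE, for example, those upper and lower bounds are set per node pool. A sketch of such a command follows; the cluster name, node pool name, and limits are assumptions for illustration, not values from this chapter.

```shell
# Cap the node pool so Cluster Autoscaler can neither scale below one
# node nor above five, regardless of demand.
gcloud container clusters update devops25 \
    --enable-autoscaling \
    --min-nodes 1 \
    --max-nodes 5 \
    --node-pool default-pool
```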

Cluster Autoscaler compared in GKE, EKS, and AKS

Cluster Autoscaler is a prime example of the differences between different managed Kubernetes offerings. We'll use it to compare the three major Kubernetes-as-a-Service providers.

I'll limit the comparison between the vendors only to the topics related to Cluster Autoscaling.

GKE is a no-brainer for those who can use Google to host their cluster. It is the most mature and feature-rich platform. Google launched Google Kubernetes Engine (GKE) long before anyone else offered managed Kubernetes. When we combine that head start with the fact that Google is the major contributor to Kubernetes, and hence has the most experience, it comes as no surprise that its offering is well above the others.

When using GKE, everything is baked into the cluster. That includes Cluster Autoscaler. We do not have to execute any additional commands. It simply works out of the...

What now?

There's not much left to say about Cluster Autoscaler.

We finished exploring fundamental ways to auto-scale Pods and nodes. Soon we'll dive into more complicated subjects and explore things that are not "baked" into a Kubernetes cluster. We'll go beyond the core project and introduce a few new tools and processes.

This is the moment when you should destroy your cluster if you're not planning to move into the next chapter right away and if your cluster is disposable (for example, not on bare-metal). Otherwise, please delete the go-demo-5 Namespace to remove the resources we created in this chapter.

kubectl delete ns go-demo-5

Before you leave, you might want to go over the main points of this chapter.

  • Cluster Autoscaler has a single purpose to adjust the size of the cluster by adding or removing worker nodes. It adds new nodes when...
