Chapter 4. Updates, Gradual Rollouts, and Autoscaling

This chapter expands on the core concepts, showing you how to roll out updates and test new features of your application with minimal disruption to uptime. It covers the basics of application updates, gradual rollouts, and A/B testing. In addition, we will look at scaling the Kubernetes cluster itself.

This chapter will discuss the following topics:

  • Application scaling
  • Rolling updates
  • A/B testing
  • Application autoscaling
  • Scaling up your cluster

Since version 1.2, Kubernetes has included a Deployments API. Deployments are the recommended way to handle scaling and application updates going forward. However, Deployments are still considered beta at the time of writing this book, whereas rolling updates have been stable for several versions. We will explore rolling updates in this chapter as an introduction to the scaling concept, and then dive into the preferred method of using Deployments in the next chapter.

Example setup


Before we start exploring the various capabilities built into Kubernetes for scaling and updates, we will need a new example environment. We are going to use a variation of our previous container image with a blue background (refer to the v0.1 and v0.2 (side by side) image later in this chapter for a comparison). We have the following code:

apiVersion: v1 
kind: ReplicationController 
metadata: 
  name: node-js-scale 
  labels: 
    name: node-js-scale 
spec: 
  replicas: 1 
  selector: 
    name: node-js-scale 
  template: 
    metadata: 
      labels: 
        name: node-js-scale 
    spec: 
      containers: 
      - name: node-js-scale 
        image: jonbaier/pod-scaling:0.1 
        ports: 
        - containerPort: 80

Listing 4-1: pod-scaling-controller.yaml

apiVersion: v1 
kind: Service 
metadata: 
  name: node-js-scale 
  labels: 
    name: node-js-scale 
spec: 
  type: LoadBalancer 
  sessionAffinity: ClientIP 
  ports: 
  - port: 80 
  selector: 
    name: node-js-scale
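
Assuming the controller is saved as pod-scaling-controller.yaml (per Listing 4-1) and the service in a similarly named file (the pod-scaling-service.yaml name is assumed here), both resources can be created as follows:

$ kubectl create -f pod-scaling-controller.yaml
$ kubectl create -f pod-scaling-service.yaml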

Scaling up


Over time, as you run your applications in the Kubernetes cluster, you will find that some applications need more resources, whereas others can manage with fewer resources. Instead of removing the entire RC (and associated pods), we want a more seamless way to scale our application up and down.

Thankfully, Kubernetes includes a scale command, which is suited specifically to this purpose. The scale command works with both Replication Controllers and the new Deployments abstraction. For now, we will explore its usage with Replication Controllers. In our new example, we have only one replica running. You can check this with a get pods command:

$ kubectl get pods -l name=node-js-scale

Let's try scaling that up to three with the following command:

$ kubectl scale --replicas=3 rc/node-js-scale

If all goes well, you'll simply see the word scaled in the output of your terminal window.
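
To verify, we can list the pods again and confirm that three replicas are now running:

$ kubectl get pods -l name=node-js-scale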

Note

Optionally, you can specify the --current-replicas flag as a verification step. The scaling will only occur if the actual number of replicas currently running matches this count.
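
For example, the following sketch scales back down to one replica, but only if exactly three replicas are currently running:

$ kubectl scale --current-replicas=3 --replicas=1 rc/node-js-scale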

Smooth updates


Scaling our application up and down as resource demands change is useful for many production scenarios, but what about simple application updates? Any production system will have code updates, patches, and feature additions. These could be occurring monthly, weekly, or even daily. Making sure that we have a reliable way to push out these changes without interruption to our users is a paramount consideration.

Once again, we benefit from the years of experience the Kubernetes system is built on. There has been built-in support for rolling updates since version 1.0. The rolling-update command allows us to update entire RCs or just the underlying Docker image used by each replica. We can also specify an update interval, which allows us to update one pod at a time and wait before proceeding to the next.

Let's take our scaling example and perform a rolling update to the 0.2 version of our container image. We will use an update interval of 2 minutes, so we can watch the update as it progresses.
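
As a sketch of the invocation (the kubectl rolling-update command and its --image and --update-period flags are standard for this Kubernetes era, though the book's exact command may differ):

$ kubectl rolling-update node-js-scale --image=jonbaier/pod-scaling:0.2 --update-period="2m"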

Testing, releases, and cutovers


The rolling update feature can work well for a simple blue-green deployment scenario. However, in a real-world blue-green deployment with a stack of multiple applications, there can be a variety of interdependencies that require in-depth testing. The --update-period flag allows us to add a delay during which some testing can be done, but this will not always be satisfactory for testing purposes.

Similarly, you may want partial changes to persist for a longer time and all the way up to the load balancer or service level. For example, you may wish to run an A/B test on a new user interface feature with a portion of your users. Another example is running a canary release (a replica in this case) of your application on new infrastructure like a newly added cluster node.

Let's take a look at an A/B testing example. For this example, we will need to create a new service that uses sessionAffinity. We will set the affinity to ClientIP, which will allow us to forward requests from the same client IP to the same backend pod.
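
A sketch of such a service follows; the node-js-scale-ab name and the service label are illustrative assumptions rather than the book's exact listing:

apiVersion: v1
kind: Service
metadata:
  name: node-js-scale-ab
  labels:
    service: node-js-scale-ab
spec:
  type: LoadBalancer
  # Requests from the same client IP always reach the same backend pod,
  # so a given user consistently sees either the A or the B version.
  sessionAffinity: ClientIP
  ports:
  - port: 80
  selector:
    service: node-js-scale-ab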

Application autoscaling


A recent feature addition to Kubernetes is the Horizontal Pod Autoscaler. This resource type is really useful as it gives us a way to automatically set thresholds for scaling our application. Currently, that support is only for CPU, but there is alpha support for custom application metrics as well.

Let's use the node-js-scale Replication Controller from the beginning of the chapter and add an autoscaling component. Before we start, let's make sure we are scaled back down to one replica using the following command:

$ kubectl scale --replicas=1 rc/node-js-scale

Now we can create a Horizontal Pod Autoscaler with the following hpa definition:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: node-js-scale
spec:
  minReplicas: 1
  maxReplicas: 3
  scaleTargetRef:
    apiVersion: v1
    kind: ReplicationController
    name: node-js-scale
  targetCPUUtilizationPercentage: 20

Listing 4-6: node-js-scale-hpa.yaml

Go ahead and create this with the kubectl create command:
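
$ kubectl create -f node-js-scale-hpa.yaml

Alternatively, the same autoscaler can be created imperatively with kubectl autoscale; this one-liner is a sketch assuming the same target and thresholds as the listing:

$ kubectl autoscale rc/node-js-scale --min=1 --max=3 --cpu-percent=20

Either way, a kubectl get hpa should now show the new autoscaler.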

Scaling a cluster


All these techniques are great for scaling the application, but what about the cluster itself? At some point, you will pack the nodes full and need more resources to schedule new pods for your workloads.

Autoscaling

When you create your cluster, you can customize the starting number of nodes (minions) with the NUM_MINIONS environment variable. By default, it is set to 4.

Additionally, the Kubernetes team has started to build autoscaling capability into the cluster itself. Currently, this is only supported on GCE and GKE, but work is being done on other providers. This capability utilizes the KUBE_AUTOSCALER_MIN_NODES, KUBE_AUTOSCALER_MAX_NODES, and KUBE_ENABLE_CLUSTER_AUTOSCALER environment variables.

The following example shows how to set the environment variables for autoscaling before running kube-up.sh:

$ export NUM_MINIONS=5
$ export KUBE_AUTOSCALER_MIN_NODES=2
$ export KUBE_AUTOSCALER_MAX_NODES=5
$ export KUBE_ENABLE_CLUSTER_AUTOSCALER=true
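
With these variables exported, the cluster is brought up as usual; the script path here assumes the standard layout of a Kubernetes release:

$ cluster/kube-up.sh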

Also, bear in mind...

Summary


We should now be a bit more comfortable with the basics of application scaling in Kubernetes. We also looked at the built-in functions for rolling out updates, as well as a manual process for testing and slowly integrating updates. We took a look at how to scale the nodes of our underlying cluster and increase the overall capacity for our Kubernetes resources. Finally, we explored some of the new autoscaling concepts for both the cluster and our applications themselves.

In the next chapter, we will look at the latest techniques for scaling and updating applications with the new deployments resource type, as well as some of the other types of workloads we can run on Kubernetes.
