Chapter 4. Implementing Reliable Container-Native Applications
This chapter will cover the various types of workloads that Kubernetes supports. We will cover Deployments for long-running applications that are regularly updated, and we will revisit the topics of application updates and gradual rollouts using Deployments. In addition, we will look at Jobs, which are used for short-running tasks, and at DaemonSets, which allow programs to be run on every node in our Kubernetes cluster. As you may have noticed, we won't look into StatefulSets in this chapter; we'll investigate them in the next, when we look at storage and how Kubernetes helps you manage storage and stateful applications on your cluster.
The following topics will be covered in this chapter:
- Deployments
- Application scaling with Deployments
- Application updates with Deployments
- Jobs
- DaemonSets
How Kubernetes manages state
As discussed previously, Kubernetes makes an effort to enforce the desired state that the operator defines for a given cluster. Deployments give operators the ability to define an end state for stateless services, such as microservices, along with the mechanisms to effect change at a controlled rate. Since Kubernetes is a control and data plane that manages the metadata, current status, and specification of a set of objects, Deployments provide a deeper level of control for your applications. There are a few archetypal deployment patterns available: recreate, rolling update, blue/green via selectors, canary via replicas, and A/B testing via HTTP headers.
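The first two of these patterns map directly to a Deployment's strategy field. As a minimal sketch (the values shown are illustrative, not from the chapter's listings), a Deployment fragment declaring a rolling update might look like this:

```yaml
# Illustrative Deployment spec fragment showing the strategy field.
# type may be Recreate (terminate all old pods, then start new ones)
# or RollingUpdate (replace pods gradually; this is the default).
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one pod below the desired count during an update
      maxSurge: 1         # at most one extra pod above the desired count
```

Switching type to Recreate removes the gradual rollout in favor of a brief, full replacement of the running pods.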
In the previous chapter, we explored some of the core concepts for application updates using the old rolling-update method. Starting with version 1.2, Kubernetes added the Deployment construct, which improves on the basic mechanisms of rolling-update and ReplicationControllers. As the name suggests, it gives us finer control over the code deployment itself. Deployments allow us to pause and resume application rollouts via declarative definitions and updates to pods and ReplicaSets. Additionally, they keep a history of past deployments and allow the user to easily roll back to previous versions.
Note
It is no longer recommended to use ReplicationControllers. Instead, use a Deployment that configures a ReplicaSet in order to set up application availability for your stateless services or applications. Furthermore, do not directly manage the ReplicaSets that are created by your deployments; only do so through the Deployment API.
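To make this concrete, here is a minimal Deployment sketch; the name, labels, and image are illustrative assumptions, not taken from the chapter's listings:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-js-deploy        # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: node-js-deploy
  template:
    metadata:
      labels:
        app: node-js-deploy
    spec:
      containers:
      - name: node-js
        image: nginx:1.25     # illustrative image
        ports:
        - containerPort: 80
```

The Deployment creates and manages a ReplicaSet behind the scenes, which in turn keeps three matching pods running.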
We'll explore a number of typical...
Deployments and ReplicationControllers are a great way to ensure long-running applications are always up and able to tolerate a wide array of infrastructure failures. However, there are some use cases this does not address, specifically short-running, run-once tasks, as well as regularly scheduled tasks. In both cases, we need the tasks to run until completion and then terminate, and, for scheduled tasks, start again at the next scheduled interval.
To address this type of workload, Kubernetes added a batch API, which includes the Job type. This type will create 1 to n pods and ensure that they all run to completion with a successful exit. Based on restartPolicy, we can either allow pods to simply fail without retry (restartPolicy: Never) or retry when a pod exits without successful completion (restartPolicy: OnFailure). In this example, we will use the latter technique, as shown in the listing longtask.yaml:
apiVersion: batch/v1
kind: Job
metadata:
  name: long-task
spec:
  template:
    metadata:
...
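The listing above is truncated; for orientation, a complete Job of this general shape might look like the following sketch. The container, image, and command are illustrative assumptions, not the book's actual listing:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: long-task-example       # illustrative name, not the chapter's listing
spec:
  template:
    metadata:
      name: long-task-example
    spec:
      containers:
      - name: long-task-example
        image: busybox:1.36     # illustrative image
        # illustrative workload that runs to completion
        command: ["sh", "-c", "sleep 30 && echo done"]
      # OnFailure retries the pod if it exits unsuccessfully
      restartPolicy: OnFailure
```

Once every pod exits successfully, the Job is marked complete and its pods are not restarted.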
While ReplicationControllers and Deployments are great at making sure that a specific number of application instances are running, they do so on a best-fit basis. This means that the scheduler looks for nodes that meet resource requirements (available CPU, particular storage volumes, and so on) and tries to spread pods across nodes and zones.
This works well for creating highly available and fault tolerant applications, but what about cases where we need an agent to run on every single node in the cluster? While the default spread does attempt to use different nodes, it does not guarantee that every node will have a replica and, indeed, will only fill a number of nodes equivalent to the quantity specified in the ReplicationController or Deployment specification.
To ease this burden, Kubernetes introduced the DaemonSet, which simply defines a pod to run on every single node in the cluster, or on a defined subset of those nodes. This can be very useful for a number of production...
As mentioned previously, we can schedule DaemonSets to run on a subset of nodes as well. This can be achieved using something called nodeSelectors. These allow us to constrain the nodes a pod runs on, by looking for specific labels and metadata. They simply match key-value pairs on the labels for each node. We can add our own labels or use those that are assigned by default.
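For example, constraining a pod template to a labeled subset of nodes takes just a nodeSelector entry in the pod spec; the label key and value here are illustrative assumptions:

```yaml
# Pod template spec fragment: schedule only onto nodes labeled hardware=highmem.
spec:
  nodeSelector:
    hardware: highmem   # illustrative custom label; it can be applied with
                        # kubectl label nodes <node-name> hardware=highmem
```

Pods from this template will remain unscheduled until at least one node carries a matching label.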
The default labels are listed in the following table: