Chapter 7: Managing Storage and Stateful Applications

In the previous chapters, we learned how to provision and prepare our Kubernetes clusters for production workloads. A critical part of production readiness is configuring and fine-tuning day-zero tasks, including networking, security, monitoring, logging, observability, and scaling, before we bring our applications and data to Kubernetes. Kubernetes was originally designed mainly for stateless applications in order to keep containers portable. Therefore, data management and running stateful applications are still among the top challenges in the cloud-native space. There are many ways and a variety of solutions to address storage needs. New solutions emerge in the Kubernetes and cloud-native ecosystem every day; therefore, we will start with popular in-production solutions and also learn the approach and criteria to apply when evaluating future solutions.

In this chapter, we will learn the technical challenges...

Technical requirements

You should have the following tools installed from the previous chapters:

  • AWS CLI V2
  • AWS IAM Authenticator
  • kubectl

We will also need to install the following tools:

  • Helm
  • AWS EBS CSI driver

You also need a running Kubernetes cluster, provisioned as per the instructions in Chapter 3, Provisioning Kubernetes Clusters Using AWS and Terraform.

The code for this chapter is located at https://github.com/PacktPublishing/Kubernetes-Infrastructure-Best-Practices/tree/master/Chapter07.

Check out the following link to see the Code in Action video:

https://bit.ly/3jemcot

Installing the required tools

In this section, we will install the tools that we will use to provision applications using Helm charts and to dynamically provision volumes for the stateful applications in your Kubernetes infrastructure, both during this chapter and the upcoming ones. As a cloud and Kubernetes learner, you may be familiar with these tools from...

Implementation principles

In Chapter 1, Introduction to Kubernetes Infrastructure and Production-Readiness, we learned about the infrastructure design principles that we will follow throughout the book. I would like to start this chapter by highlighting the notable principles that influenced the cloud-native data management suggestions and the technical decisions in this chapter:

  • Simplification: In this chapter, we will retain our commitment to the simplification principle. Unless you are operating in a multi-cloud environment, it is not necessary to introduce new tools and complicate operations. On public clouds, we will use the native storage and data management technology stack provided and supported by your managed service vendor. Many stateful applications today are designed to tolerate failure and provide built-in high-availability functionality. We will identify different types of stateful applications and learn how to simplify data paths and fine-tune them for performance. We will also...

Understanding the challenges with stateful applications

Kubernetes was initially built for stateless applications in order to keep containers portable. Even when we run stateful applications, the applications themselves are very often stateless containers where the state is stored separately and mounted from a resource called a Persistent Volume (PV). We will learn about the different resource types used to maintain state while preserving some flexibility later, in the Understanding storage primitives in Kubernetes section.
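As a minimal illustration of this separation, the following sketch defines a static PV and a PVC that binds to it; the names, the hostPath backing, and the sizes are illustrative assumptions rather than the book's own example:

# Hypothetical static PV backed by a node-local hostPath, plus a PVC that binds to it
apiVersion: v1
kind: PersistentVolume
metadata:
  name: demo-pv                  # hypothetical name
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/data              # node-local path, for illustration only
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc                 # hypothetical name
spec:
  storageClassName: ""           # empty string disables dynamic provisioning, forcing a static bind
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

A pod would then reference demo-pvc through spec.volumes[].persistentVolumeClaim.claimName, which is what keeps the container image itself free of state.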

I would like to highlight the six notable stateful application challenges that we will try to address in this chapter:

  • Deployment challenges: Especially when running a mission-critical service in production, finding the ideal deployment method for a given stateful application can be challenging to start with. Should we use a YAML file we found in a blog article, open source repository examples, Helm charts, or an operator? Your choice will have...

Tuning Kubernetes storage

At some point, we have all experienced, and been frustrated by, storage performance issues and technical limitations. In this chapter, we will learn the fundamentals of Kubernetes storage, including storage primitives, creating static PVs, and using storage classes to provision dynamic PVs and simplify management.

Understanding containerized stateful applications requires us to get into the cloud-native mindset. Although these applications are referred to as stateful, the data used by their pods is either accessed remotely or orchestrated and stored in Kubernetes as separate resources. Therefore, some flexibility is maintained to schedule applications across worker nodes and update them when needed without losing data. Before we get into the tuning, let's understand some of the basic storage primitives in Kubernetes.
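To give a concrete feel for dynamic provisioning before we dig into the primitives, here is a minimal sketch of a storage class and a claim that uses it. It assumes the AWS EBS CSI driver from the Technical requirements section is installed, and the names, volume type, and size are illustrative rather than the book's own values:

# Hypothetical StorageClass for the AWS EBS CSI driver, plus a PVC that requests a dynamic volume
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3                           # hypothetical name
provisioner: ebs.csi.aws.com              # provisioner registered by the AWS EBS CSI driver
parameters:
  type: gp3                               # EBS volume type
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer   # delay volume creation until a pod is scheduled
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim                        # hypothetical name
spec:
  storageClassName: ebs-gp3
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

When a pod that mounts data-claim is scheduled, the CSI driver creates and attaches an EBS volume in the pod's availability zone, and Kubernetes binds the resulting PV to the claim automatically.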

Understanding storage primitives in Kubernetes

The beauty of Kubernetes is that every part of it is abstracted as an object that can be managed...

Choosing a persistent storage solution

Two of the biggest stateful application challenges in Kubernetes are storage orchestration and data management. There is a seemingly endless number of solutions out there. First, we will explain the main storage attributes and topologies we need to consider when evaluating storage alternatives. Let's review the topologies used by the most common storage systems:

  • Centralized: Traditional storage systems, also referred to as monolithic, are most often tightly coupled with proprietary hardware and internal communication protocols. They are usually associated with scale-up models, since it is difficult to scale out the tightly coupled components of the storage nodes.
  • Distributed: Distributed storage systems are more likely to be software-defined solutions, and they may be architected to favor availability, consistency, durability, performance, or scalability. Usually, distributed systems scale out better than others to support many storage...

Deploying stateful applications

Kubernetes provides a number of controller APIs to manage the deployment of pods within a Kubernetes cluster. Initially designed for stateless applications, these controllers are used to group pods based on need. In this section, we will briefly learn the differences between the following Kubernetes objects – pods, ReplicaSets, Deployments, and StatefulSets. In the event of a node failure, individual pods that are not managed by a controller will not be rescheduled on other nodes. Therefore, bare pods should be avoided when running stateful workloads.

ReplicaSets are used to maintain a stable set of replica pods, and Deployments are used on top of ReplicaSets when we need to roll out changes to those pods. Both ReplicaSets and Deployments are used when provisioning stateless applications. To learn more about Deployments, please check the official Kubernetes documentation here: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/.

StatefulSets are another controller that reached a General Availability (GA) milestone...
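To make the contrast with Deployments concrete, the following is a minimal StatefulSet sketch; the names, image, and storage class are illustrative assumptions (including the ebs-gp3 class from the earlier sketch), not the book's example. It also assumes a matching headless Service named web exists to give the pods stable network identities:

# Hypothetical StatefulSet where each replica gets its own PVC via volumeClaimTemplates
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                               # hypothetical name
spec:
  serviceName: web                        # headless Service assumed to exist
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.21
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:                   # unlike a Deployment, one PVC is created per replica
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: ebs-gp3         # hypothetical class from the earlier sketch
        resources:
          requests:
            storage: 1Gi

Each replica (web-0, web-1, web-2) keeps a stable name and reattaches to its own claim when rescheduled, which is exactly the behavior that bare pods, ReplicaSets, and Deployments do not provide for stateful workloads.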

Summary

In this chapter, we learned about stateful application challenges, the best practices to consider when choosing storage management solutions, both open source and commercial, and finally, the considerations for deploying stateful applications in production using Kubernetes' StatefulSet and Deployment objects.

We deployed the AWS EBS CSI driver and OpenEBS. We also created highly available, replicated storage using OpenEBS and deployed our application on OpenEBS volumes.

We gained a solid understanding of Kubernetes storage in this chapter, but you should perform a detailed evaluation of your cluster storage requirements and take further action to deploy any extra tools and configurations that may be required, including your storage provider's CSI driver.

In the next chapter, we will learn in detail about deploying seamless and reliable applications. We will also get to grips with containerization best practices that make it easier to scale our applications.

Further reading

You can refer to the following links for more information on the topics covered in this chapter:
