The Kubernetes Workshop

You're reading from  The Kubernetes Workshop

Product type: Book
Published: Sep 2020
Publisher: Packt
ISBN-13: 9781838820756
Pages: 780
Edition: 1st
Authors (6): Zachary Arnold, Sahil Dua, Wei Huang, Faisal Masood, Mélony Qin, Mohammed Abu Taleb

Table of Contents (20)

Preface
1. Introduction to Kubernetes and Containers
2. An Overview of Kubernetes
3. kubectl – Kubernetes Command Center
4. How to Communicate with Kubernetes (API Server)
5. Pods
6. Labels and Annotations
7. Kubernetes Controllers
8. Service Discovery
9. Storing and Reading Data on Disk
10. ConfigMaps and Secrets
11. Build Your Own HA Cluster
12. Your Application and HA
13. Runtime and Network Security in Kubernetes
14. Running Stateful Components in Kubernetes
15. Monitoring and Autoscaling in Kubernetes
16. Kubernetes Admission Controllers
17. Advanced Scheduling in Kubernetes
18. Upgrading Your Cluster without Downtime
19. Custom Resource Definitions in Kubernetes

15. Monitoring and Autoscaling in Kubernetes

Overview

This chapter will introduce you to how Kubernetes enables you to monitor your cluster and workloads, and how you can use the data collected to automatically drive certain decisions. You will learn about the Kubernetes Metrics Server, which aggregates cluster runtime information that you can then use to drive application scaling decisions. We will walk you through setting up monitoring with the Kubernetes Metrics Server and Prometheus, and then use Grafana to visualize those metrics. By the end of this chapter, you will also have learned how to automatically scale your application to fully utilize the resources on the provisioned infrastructure, as well as automatically scale your cluster infrastructure as needed.

Introduction

Let's take a moment to reflect on our progress through this series of chapters, beginning with Chapter 11, Build Your Own HA Cluster. We started by setting up a Kubernetes cluster using kops to configure AWS infrastructure in a highly available manner. Then, we used Terraform and some scripting to improve the stability of our cluster and deploy our simple counter app. After this, we began hardening the security and increasing the availability of our app using Kubernetes and cloud-native principles. Finally, we learned how to run a stateful database that uses transactions to ensure that our application always returns a series of increasing numbers.

In this chapter, we are going to explore how to leverage the data that Kubernetes already gathers about our applications to drive and automate scaling decisions, so that our applications are always the right size for our load. Because it takes time to observe application metrics, schedule...

Kubernetes Monitoring

Kubernetes has built-in support for providing useful monitoring information about infrastructure components as well as various Kubernetes objects. The Kubernetes Metrics Server is a component (not installed by default) that gathers these metrics and exposes them at an API endpoint on the API server. Kubernetes uses this data to manage the scaling of Pods, but the data can also be scraped by a third-party tool such as Prometheus for use by cluster operators. Prometheus has only very basic data visualization functions and primarily serves as a metric-gathering and storage tool, so you can pair it with a more powerful data visualization tool such as Grafana. Grafana allows cluster admins to create useful dashboards to monitor their clusters. You can learn more about how monitoring in Kubernetes is architected at this link: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/monitoring_architecture.md.
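To make this concrete, here is a minimal sketch (not from the chapter text) of how you could query the aggregated data once the Metrics Server is running; it assumes a working kubectl context, and kube-system is used only as an example namespace:

    # CPU and memory usage for each node, as reported by the Metrics Server
    kubectl top nodes

    # The same view for Pods in an example namespace
    kubectl top pods --namespace kube-system

    # Query the Metrics API that the Metrics Server registers with the API server
    # (returns raw JSON)
    kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"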

Here's...

Autoscaling in Kubernetes

Kubernetes allows you to automatically scale your workloads to adapt to changing demands on your applications. The information gathered by the Kubernetes Metrics Server is the data used to drive these scaling decisions. In this book, we will cover two types of scaling action: one that impacts the number of running pods in a Deployment and another that impacts the number of running nodes in a cluster. Both are examples of horizontal scaling. Let's briefly gain an intuition for what the horizontal scaling of pods and the horizontal scaling of nodes would each entail:

  • Pods: Assuming that you filled out the resources: section of the podTemplate when creating a Deployment in Kubernetes, each container in that pod will have requests and limits fields, set via the corresponding cpu and memory values (see the sketch after this list). When the resources needed to process a workload exceed what you have allocated, then by adding additional replicas...
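As a rough sketch (not taken from the chapter's own examples), the manifest below shows what such a resources: section and a matching HorizontalPodAutoscaler might look like. The Deployment name counter-app, the image, and the numbers are hypothetical, and the autoscaling/v2 API shown here may appear as autoscaling/v2beta2 on older clusters:

    # Hypothetical Deployment with CPU/memory requests and limits
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: counter-app
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: counter-app
      template:
        metadata:
          labels:
            app: counter-app
        spec:
          containers:
          - name: counter
            image: counter-app:latest   # hypothetical image
            resources:
              requests:
                cpu: 100m
                memory: 128Mi
              limits:
                cpu: 500m
                memory: 256Mi
    ---
    # HPA that adds or removes replicas to hold average CPU utilization near 50%
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: counter-app
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: counter-app
      minReplicas: 2
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 50

The HPA compares the utilization reported by the Metrics Server against the target and adjusts the replica count between minReplicas and maxReplicas accordingly.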

Summary

Let's reflect a bit on how far we've come since Chapter 11, Build Your Own HA Cluster, when we started to talk about running Kubernetes in a highly available manner. We covered how to set up a secure production cluster in the cloud using infrastructure-as-code tools such as Terraform, and how to secure the workloads it runs. We also looked at the modifications necessary for our applications to scale well, both for the stateful and stateless versions of the application.

Then, in this chapter, we looked at how to extend the management of our application runtimes using data, introducing Prometheus, Grafana, and the Kubernetes Metrics Server. We then used that information to leverage the HPA and the ClusterAutoscaler so that we can rest assured that our cluster is always appropriately sized and ready to respond to spikes in demand automatically, without having to pay for hardware that is overprovisioned...
