Kubernetes for Developers
Product type: Book
Published: Apr 2018
Publisher: Packt
ISBN-13: 9781788834759
Pages: 374
Edition: 1st
Author: Joseph Heck

Table of Contents (16 chapters)

Title Page
Packt Upsell
Contributors
Preface
1. Setting Up Kubernetes for Development
2. Packaging Your Code to Run in Kubernetes
3. Interacting with Your Code in Kubernetes
4. Declarative Infrastructure
5. Pod and Container Lifecycles
6. Background Processing in Kubernetes
7. Monitoring and Metrics
8. Logging and Tracing
9. Integration Testing
10. Troubleshooting Common Problems and Next Steps
Other Books You May Enjoy
Index

Chapter 8. Logging and Tracing

When we first started with containers and Kubernetes, we saw how to get the log output from any individual container using the kubectl logs command. As the number of containers we want information from grows, finding the relevant logs becomes increasingly difficult. In the previous chapter, we looked at how to collect and aggregate metrics; in this chapter, we extend the same concept, looking at how to aggregate logging and how to better understand how containers work together using distributed tracing.

Topics for this chapter include:

  • A Kubernetes concept—DaemonSet
  • Installing Elasticsearch, Fluentd, and Kibana
  • Viewing logs using Kibana
  • Distributed tracing with Jaeger
  • An example of adding tracing to your application

A Kubernetes concept – DaemonSet


A Kubernetes resource that we have already used (through Helm) is the DaemonSet. This resource is a wrapper around pods, very similar to a ReplicaSet, but with the purpose of running a pod on every node in a cluster. When we installed Prometheus using Helm, it created a DaemonSet to run the node exporter on each node within the Kubernetes cluster.

There are two common patterns for running software in a support role with your application: the first is the side-car pattern, and the second is a DaemonSet. A side-car is a container you include within your pod whose sole purpose is to run alongside the primary application and provide some supporting, but external, role. An example of a useful side-car might be a cache, or a proxy service of some form. Running a side-car application obviously increases the resources needed for a pod, and if the number of pods is relatively low, or they are sparse compared to the size of the cluster, this would be the most efficient...
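As a sketch, a DaemonSet manifest looks very much like a ReplicaSet's, minus the replica count, because the scheduler places one pod on each node. The names and image below are illustrative only, not taken from any chart used in this book:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd:v1.2
        volumeMounts:
        - name: varlog
          mountPath: /var/log    # read the node's logs from the host
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
```

Because the pod template mounts a hostPath volume, each copy of the pod sees the logs of the node it runs on, which is exactly the per-node access a log collector needs.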

Installing and using Elasticsearch, Fluentd, and Kibana


Fluentd is software that's frequently used to collect and aggregate logging. Hosted at https://www.fluentd.org, like Prometheus it is open source software managed under the umbrella of the Cloud Native Computing Foundation (CNCF). The problem of aggregating logs long predates containers, and ELK was a frequent acronym for one solution: the combination of Elasticsearch, Logstash, and Kibana. With containers, the number of log sources expands, making the problem of collecting all the logs even larger. Fluentd evolved to serve the same space as Logstash, focusing on structured logging in a JSON format, routing it, and supporting plugins to process the logs. Fluentd is written in Ruby and C, with the intention of being faster and more efficient than Logstash, and the same pattern continues with Fluent Bit (http://fluentbit.io), which has an even smaller memory footprint....

Viewing logs using Kibana


For this book, we will explore how to use Kibana, taking advantage of it as an add-on to Minikube. After you have enabled the add-on, and once the pods are fully available and reporting as Ready, you can access Kibana with this command:

minikube service kibana-logging -n kube-system

This will bring up a web page that is backed by the kibana-logging service. When it is first accessed, the web page will ask you to specify a default index pattern, which Kibana uses to select the Elasticsearch indices to search.

Click on Create, accepting the defaults that are provided. The default index pattern of logstash-* doesn't mean the data has to come from Logstash as a source; the data that has already been sent to Elasticsearch from Fluentd will all be directly accessible.

Once you have defined a default index, the next page that is displayed will show you all the fields that have been added into Elasticsearch as Fluentd has taken the data from the container logs and Kubernetes metadata.

You can browse through...
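As an illustration, the Kubernetes metadata attached to each record lets you narrow a search in Kibana's query bar. The field names below are those typically added by Fluentd's Kubernetes metadata filter; adjust them to match the fields you actually see listed in your index:

```
kubernetes.namespace_name: "default" AND kubernetes.container_name: "flask"
```

A query like this restricts the results to log lines from a single container across every node, which is the point of aggregating the logs in one place.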

Distributed tracing with Jaeger


As you decompose your services into multiple containers, one of the hardest things to understand is the flow and path of requests, and how containers are interacting. As you expand and use more containers to support components within your system, knowing which containers are which and how they're contributing to the performance of a request becomes a significant challenge. For simple systems, you can often add logging and get a view through the log files. As you move into dozens, or even hundreds, of different containers making up a service, that process becomes far less tenable.

One solution to this problem is called distributed tracing, which is a means of tracking the path of requests between containers, much like a profiler can track requests within a single application. This involves using libraries or frameworks that support tracing to create and pass along the information, as well as a system external to your application to collect that information...
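To make the idea concrete, here is a minimal, hand-rolled sketch of span creation and context propagation. It is purely illustrative: all of the names are invented for this example, and it stands in for what a real tracing client library (such as a Jaeger client) does for you, including shipping the spans off to an external collector.

```python
import time
import uuid

class Span:
    """One timed operation within a trace."""
    def __init__(self, trace_id, operation, parent_id=None):
        self.trace_id = trace_id          # shared by every span in the request
        self.span_id = uuid.uuid4().hex[:8]
        self.parent_id = parent_id        # links child work back to its caller
        self.operation = operation
        self.start = time.time()
        self.duration = None

    def finish(self):
        self.duration = time.time() - self.start

collected = []  # stands in for the external trace collector

def start_span(operation, parent=None):
    # A new request mints a trace ID; downstream work inherits it.
    trace_id = parent.trace_id if parent else uuid.uuid4().hex[:8]
    span = Span(trace_id, operation, parent.span_id if parent else None)
    collected.append(span)
    return span

# One request flowing through two "services":
root = start_span("frontend: GET /products")
child = start_span("backend: query products", parent=root)
child.finish()
root.finish()

for span in collected:
    print(span.trace_id, span.operation, "parent:", span.parent_id)
```

Because every span carries the same trace ID, a collector can reassemble the full request path and timings across containers, which is exactly what the log files alone cannot show you.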

Example – adding tracing to your application


There are several things we will need to do to enable tracing from our example applications:

  • Add the libraries and code to generate traces
  • Add a tracing collector side-car to your pod

Let's look at enabling the tracing side-car first, using the Python Flask example that we have been building throughout the book.

The code for this example is online at the GitHub project at https://github.com/kubernetes-for-developers/kfd-flask, and the branch for this addition is 0.6.0. You can get the code for this project locally using the following command:

git clone https://github.com/kubernetes-for-developers/kfd-flask -b 0.6.0

Adding a tracing collector to your pod

The libraries that implement OpenTracing typically use a very lightweight network connection, in this case UDP, to send traces from our code. UDP is connectionless and does not guarantee delivery, which means that trace information can be lost if the network becomes too congested. OpenTracing and Jaeger...
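As a sketch of what this looks like in a pod specification, a Jaeger agent side-car listens on UDP port 6831 for spans sent by the client library over localhost. The application image, collector address, and agent flag below are placeholders to adjust for your own deployment (the flag name reflects early Jaeger 1.x releases):

```yaml
spec:
  containers:
  - name: flask
    image: my-flask-app:latest        # your application container (placeholder)
  - name: jaeger-agent
    image: jaegertracing/jaeger-agent:1.6
    ports:
    - containerPort: 6831             # spans arrive here over UDP
      protocol: UDP
    args:
    - --collector.host-port=jaeger-collector.default:14267
```

Because containers in a pod share a network namespace, the application can send spans to localhost:6831 and let the agent handle batching and forwarding to the collector.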

Summary


In this chapter, we introduced logging and tracing with Fluentd and Jaeger. We showed how to deploy and use them, capturing and aggregating data from your code as it runs at scale. We walked through how to use an Elasticsearch query to find data. We also looked at how to view Jaeger traces and how to add tracing to your code.

In the next chapter, we will look at ways of using Kubernetes to support and run integration testing, as well as using it with continuous integration.
