Collecting and Querying Logs

In critical moments, men sometimes see exactly what they wish to see.

- Spock

So far, our primary focus has been on metrics. We used them in different forms and for different purposes. In some cases, we used metrics to scale Pods and nodes. In others, we used them to create alerts that notify us when there is an issue that cannot be fixed automatically. We also created a few dashboards.

However, metrics are often not enough. That is especially true when dealing with issues that require manual intervention. When metrics alone are insufficient, we usually need to consult logs, hoping that they will reveal the cause of the problem.

Logging is often misunderstood or, to be more precise, confused with metrics. For many, the line between logs and metrics is blurred. Some extract metrics from logs. Others treat metrics and logs as the same...

Creating a cluster

You know the drill. We'll move into the directory with the vfarcic/k8s-specs (https://github.com/vfarcic/k8s-specs) repository, pull the latest version of the code just in case I pushed something recently, and create a new cluster unless you already have one at hand.

All the commands from this chapter are available in the 07-logging.sh (https://gist.github.com/vfarcic/74774240545e638b6cf0e01460894f34) Gist.
cd k8s-specs

git pull

This time, the requirements for the cluster have changed. We need much more memory than before. The main culprit is Elasticsearch, which is very resource-hungry.

If you're using Docker for Desktop or minikube, you'll need to increase the memory dedicated to the cluster to 10 GB. If that's too much for your laptop, you might choose to read the Exploring Centralized Logging Through Elasticsearch...
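If you're going the minikube route, the adjustment might look like the sketch below. The 10 GB of memory comes from the requirement above, while the CPU count is an assumption on my part; Docker for Desktop users would change the equivalent values through its Preferences instead.

# Sketch only: --memory is expressed in MB, and 4 CPUs is an assumption, not a
# value prescribed by this chapter. The flags take effect when the cluster is
# created, hence the delete first.
minikube delete

minikube start --memory 10240 --cpus 4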

Exploring logs through kubectl

The first contact most people have with logs in Kubernetes is through kubectl. Using it is almost unavoidable.

As we're learning how to tame the Kubernetes beast, we are bound to check logs when we get stuck. In Kubernetes, the term "logs" is reserved for the output produced by our own and third-party applications running inside a cluster. However, that excludes the events generated by different Kubernetes resources. Even though many would call them logs as well, Kubernetes separates them from logs and calls them events. I'm sure that you already know how to retrieve logs from applications and how to see Kubernetes events. Nevertheless, we'll explore them briefly here as well since that will add relevance to the discussion we'll have later on. I promise to keep it short, and you are free to skip this section...
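As a quick reminder, and only as a sketch (the Namespace, Pod, and container names below are placeholders rather than resources created in this chapter), retrieving logs and events typically boils down to commands like these.

# Placeholders only: replace go-demo-5, db-0, and db with your own resources.
kubectl -n go-demo-5 logs db-0 --follow

# Logs of a specific container in a multi-container Pod
kubectl -n go-demo-5 logs db-0 -c db

# Kubernetes events are retrieved separately from logs
kubectl -n go-demo-5 get events

kubectl -n go-demo-5 describe pod db-0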

Choosing a centralized logging solution

The first thing we need to do is to find a place to store logs. Given that we want the ability to filter log entries, storing them in plain files should be ruled out from the start. What we need is a database of sorts. Speed matters more than transactional guarantees, so we are most likely looking at an in-memory database. But, before we take a look at the choices, we should discuss the location of our database. Should we run it inside our cluster, or should we use a service? Instead of making that decision right away, we'll explore both options before making a choice.

Logging-as-a-service solutions fall into two major groups. If we are running our cluster with one of the Cloud providers, an obvious choice might be to use the logging solution they provide. EKS has AWS CloudWatch, GKE has GCP...

Exploring logs collection and shipping

For a long time now, there have been two major contenders for the "logs collection and shipping" throne: Logstash (https://www.elastic.co/products/logstash) and Fluentd (https://www.fluentd.org/). Both are open source, and both are widely accepted and actively maintained. While each has its pros and cons, Fluentd turned out to have an edge with cloud-native distributed systems. It consumes fewer resources and, more importantly, it is not tied to a single destination (Elasticsearch). While Logstash can push logs to many different targets, it is primarily designed to work with Elasticsearch. For that reason, other logging solutions adopted Fluentd.

As of today, no matter which logging product you embrace, the chances are that it will support Fluentd. The culmination of that adoption can be seen in Fluentd's entry into...

Exploring centralized logging through Papertrail

The first centralized logging solution we'll explore is Papertrail (https://papertrailapp.com/). We'll use it as a representative of a logging-as-a-service solution that can save us from installing and, more importantly, maintaining a self-hosted alternative.

Papertrail features live trailing, filtering by timestamps, powerful search queries, pretty colors, and quite a few other things that might (or might not) be essential when skimming through logs produced inside our clusters.

The first thing we need to do is to register or, if this is not the first time you've tried Papertrail, to log in.

open "https://papertrailapp.com/"

Please follow the instructions to register, or to log in if you already have an account in their system.

You will be glad to find out that Papertrail provides a free plan that allows storage...

Combining GCP Stackdriver with a GKE cluster

If you're using a GKE cluster, logging is already set up, even though you might not know about it. Every GKE cluster comes by default with a Fluentd DaemonSet that is configured to forward logs to GCP Stackdriver. It runs in the kube-system Namespace.

Let's describe GKE's Fluentd DaemonSet and see whether there is any useful information we might find.

kubectl -n kube-system \
  describe ds -l k8s-app=fluentd-gcp

The output, limited to the relevant parts, is as follows.

...
Pod Template:
  Labels:     k8s-app=fluentd-gcp
              kubernetes.io/cluster-service=true
              version=v3.1.0
...
  Containers:
   fluentd-gcp:
    Image: gcr.io/stackdriver-agents/stackdriver-logging-agent:0.3-1.5.34-1-k8s-1
    ...

We can see that, among other things, the DaemonSet's Pod Template has the label...
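If you'd like to confirm that the DaemonSet indeed runs a Fluentd Pod on every node, a label-based query like the one below should do. Treat it as a sketch rather than a step from this chapter.

# Sketch: list the Fluentd Pods created by the DaemonSet, one per node
kubectl -n kube-system get pods \
  -l k8s-app=fluentd-gcp \
  -o wide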

Combining AWS CloudWatch with an EKS cluster

Unlike GKE, which has a logging solution baked into the cluster, EKS requires us to set one up ourselves. It does provide the CloudWatch service, but we need to ensure that logs are shipped there from our cluster.

Just as before, we'll use Fluentd to collect logs and ship them to CloudWatch. Or, to be more precise, we'll use a Fluentd tag built specifically for CloudWatch. As you probably already know, we'll also need an IAM policy that will allow Fluentd to communicate with CloudWatch.

All in all, the setup we are about to make will be very similar to the one we did with Papertrail, except that we'll store the logs in CloudWatch, and that we'll have to put some effort into creating AWS permissions.

Before we proceed, I'll assume that you still have the environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY...
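To give a feel for the kind of permissions involved, the sketch below creates a policy that lets the nodes push log events and attaches it to the worker nodes' role. The policy document, the role name, and the policy name are all hypothetical, not the exact ones used later in the chapter.

# Hypothetical sketch: the role name, the policy name, and the policy contents
# are assumptions, not the chapter's actual steps.
echo '{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:PutLogEvents",
            "logs:DescribeLogStreams"
        ],
        "Resource": "*"
    }]
}' | tee logs-policy.json

aws iam put-role-policy \
  --role-name devops25-worker-nodes \
  --policy-name logs \
  --policy-document file://logs-policy.json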

Combining Azure Log Analytics with an AKS cluster

Just like GKE (and unlike EKS), AKS comes with an integrated logging solution. All we have to do is enable one of the AKS addons. To be more precise, we'll enable the monitoring addon. As the name indicates, the addon does not only collect logs; it also handles metrics. However, we are interested only in logs. I believe that nothing beats Prometheus for metrics, especially since it integrates with the HorizontalPodAutoscaler. Still, you should explore AKS metrics as well and reach your own conclusion. For now, we'll explore only the logging part of the addon.

az aks enable-addons \
  -a monitoring \
  -n devops25-cluster \
  -g devops25-group

The output is a rather big JSON with all the information about the newly enabled monitoring addon. There's nothing exciting in it.
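If you'd like to confirm that the addon is enabled before moving on, a query along the following lines should do. The addonProfiles.omsagent path is my assumption about how AKS names the monitoring addon profile, so treat the command as a sketch.

# Sketch: the addonProfiles.omsagent key is an assumption about AKS naming
az aks show \
  -n devops25-cluster \
  -g devops25-group \
  --query addonProfiles.omsagent.enabled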

It's...

Exploring centralized logging through Elasticsearch, Fluentd, and Kibana

Elasticsearch is probably the most commonly used in-memory database. At least, if we narrow the scope to self-hosted databases. It is designed for many other scenarios, and it can be used to store (almost) any type of data. As such, it is almost perfect for storing logs, which can come in many different formats. Given its flexibility, some use it for metrics as well, and in that role Elasticsearch competes with Prometheus. We'll leave metrics aside for now and focus only on logs.

The EFK (Elasticsearch, Fluentd, and Kibana) stack consists of three components. Data is stored in Elasticsearch; logs are collected, transformed, and pushed to the database by Fluentd; and Kibana serves as the UI through which we explore the data stored in Elasticsearch. If you are used to ELK (Logstash instead of Fluentd), the setup...
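If you're wondering how such a stack is typically brought up, a Helm-based sketch (Helm 2 syntax) might look as follows. The chart names come from the now-archived stable repository and are assumptions on my part, not the installation steps this chapter actually uses.

# Sketch only: the chart names and the logging Namespace are assumptions,
# not the chapter's actual installation steps.
helm install stable/elasticsearch \
  --name elasticsearch \
  --namespace logging

helm install stable/fluentd-elasticsearch \
  --name fluentd \
  --namespace logging

# Kibana still needs to be pointed at the Elasticsearch Service through the
# chart's values; consult the chart's documentation for the exact key.
helm install stable/kibana \
  --name kibana \
  --namespace logging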

Switching to Elasticsearch for storing metrics

Now that we have Elasticsearch running in our cluster and know that it can handle almost any data type, a logical question is whether we can use it to store our metrics as well as our logs. If you explore elastic.co (https://www.elastic.co/), you'll see that metrics are indeed something they advertise. If it could replace Prometheus, it would undoubtedly be beneficial to have a single tool that handles not only logs but also metrics. On top of that, we could ditch Grafana and keep Kibana as a single UI for both data types.

Nevertheless, I would strongly advise against using Elasticsearch for metrics. It is a general-purpose free-text NoSQL database. That means it can handle almost any data but, at the same time, it does not excel at any specific format. Prometheus, on the other hand, is designed to store time-series...

What should we expect from centralized logging?

We explored several products that can be used to centralize logging. As you saw, all are very similar, and we can assume that most of the other solutions follow the same principles. We need to collect logs across the cluster. We used Fluentd for that; it is the most widely accepted solution, and you will likely use it no matter which database receives those logs (Azure being an exception).

Log entries collected with Fluentd are shipped to a database which, in our case, is Papertrail, Elasticsearch, or one of the solutions provided by hosting vendors. Finally, all solutions offer a UI that allows us to explore the logs.

I usually provide a single solution for a problem but, in this case, there are quite a few candidates for your centralized logging needs. Which one should you choose? Will it be Papertrail, Elasticsearch-Fluentd...

What now?

You know what to do. Destroy the cluster if you created it specifically for this chapter.
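How exactly depends on how the cluster was created. The commands below are only a sketch; the cluster names and the GKE zone are assumptions, so use whichever command matches your setup.

# Sketch only: the names and the zone are assumptions; pick the command that
# matches the way you created the cluster.
minikube delete

eksctl delete cluster -n devops25

az group delete -n devops25-group --yes

gcloud container clusters delete devops25 --zone us-east1-b --quiet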

Before you leave, you might want to go over the main points of this chapter.

  • For any but the smallest systems, going from one resource to another and from one node to another to find the cause of an issue is anything but practical, reliable, or fast.
  • More often than not, the kubectl logs command does not provide us with enough options to perform anything but the simplest retrieval of logs.
  • Elasticsearch is excellent, but it does too much. Its lack of focus makes it inferior to Prometheus for storing and querying metrics, as well as sending alerts based on such data.
  • Logs themselves are too expensive to parse, and most of the time they do not provide enough data to act as metrics.
  • We need logs centralized in a single location so that we can explore logs from any part of the system.
  • We...