You're reading from Hands-On Infrastructure Monitoring with Prometheus

Product type: Book
Published in: May 2019
Publisher: Packt
ISBN-13: 9781789612349
Edition: 1st Edition
Authors (2):
Joel Bastos

Joel Bastos is an open source supporter and contributor, with a background in infrastructure security and automation. He is always striving for the standardization of processes, code maintainability, and code reusability. He has defined, led, and implemented critical, highly available, and fault-tolerant enterprise and web-scale infrastructures in several organizations, with Prometheus as the cornerstone. He has worked at two unicorn companies in Portugal and at one of the largest transaction-oriented gaming companies in the world. Previously, he has supported several governmental entities with projects such as the Public Key Infrastructure for the Portuguese citizen card. You can find his blogs at kintoandar and on Twitter with the handle @kintoandar.

Pedro Araújo

Pedro Araújo is a site reliability and automation engineer and has defined and implemented several standards for monitoring at scale. His contributions have been fundamental in connecting development teams to infrastructure. He is highly knowledgeable about infrastructure, but his passion is in the automation and management of large-scale, highly-transactional systems. Pedro has contributed to several open source projects, such as Riemann, OpenTSDB, Sensu, Prometheus, and Thanos. You can find him on Twitter with the handle @phcrva.

Exporters and Integrations

Even though first-party exporters cover the basics pretty well, the Prometheus ecosystem provides a wide variety of third-party exporters that cover everything else. In this chapter, we will be introduced to some of the most useful exporters available, ranging from operating system (OS) metrics and Internet Control Message Protocol (ICMP) probing to generating metrics from logs and collecting information from short-lived processes, such as batch jobs.

In brief, the following topics will be covered in this chapter:

  • Test environments for this chapter
  • Operating system exporter
  • Container exporter
  • From logs to metrics
  • Blackbox monitoring
  • Pushing metrics
  • More exporters

Test environments for this chapter

In this chapter, we'll be using two test environments: one based on virtual machines (VMs) that mimic traditional static infrastructure and one based on Kubernetes for modern workflows. The following topics will guide you through the automated setup procedure for both of them, but will gloss over the details of each exporter, which are explained in depth in their own sections.

Static infrastructure test environment

This method will abstract all the deployment and configuration details, allowing you to have a fully provisioned test environment with a couple of commands. You'll still be able to connect to each of the guest instances and tinker with the example configurations...

Operating system exporter

When monitoring infrastructure, the most common place to start looking is at the OS level. Metrics for resources such as CPU, memory, and storage devices, as well as kernel operating counters and statistics, provide valuable insight to assess a system's performance characteristics. For a Prometheus server to collect these types of metrics, an OS-level exporter is needed on the target hosts to expose them on an HTTP endpoint. The Prometheus project provides such an exporter for Unix-like systems, called the Node Exporter, and the community also maintains an equivalent exporter for Microsoft Windows systems, called the WMI exporter.
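To make the idea of exposing OS metrics on an HTTP endpoint more concrete, here is a minimal sketch in Python. It is not how the Node Exporter is implemented; it only illustrates the general shape of an exporter. The metric names (demo_load1 and so on), the port, and the helper names are all assumptions made up for this illustration, and os.getloadavg() is only available on Unix-like systems.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples):
    """Render {name: (help_text, value)} in the Prometheus text exposition format."""
    lines = []
    for name, (help_text, value) in sorted(samples.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {float(value)}")
    return "\n".join(lines) + "\n"


def collect_load():
    """Read the 1, 5 and 15 minute load averages (Unix-like systems only)."""
    load1, load5, load15 = os.getloadavg()
    # Hypothetical metric names, chosen for this sketch only.
    return {
        "demo_load1": ("1m load average.", load1),
        "demo_load5": ("5m load average.", load5),
        "demo_load15": ("15m load average.", load15),
    }


class MetricsHandler(BaseHTTPRequestHandler):
    """Answer GET /metrics with freshly collected values on every scrape."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect_load()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving metrics on the given (hypothetical) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() and pointing a scrape job at the chosen port would be enough to try it out. Note that values are collected at scrape time rather than cached, which is the usual pattern for exporters.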

The Node Exporter

The Node Exporter is the most well-known...

Container exporter

In the constant pursuit of workload isolation and resource optimization, we witnessed the move from physical to virtualized machines using hypervisors. Using virtualization implies a certain degree of resource usage inefficiency, as the storage, CPU, and memory need to be allocated to each running VM whether it uses them or not. A lot of work has been done in this area to mitigate such inefficiencies but, in the end, fully taking advantage of system resources is still a difficult problem.

With the rise of operating-system-level virtualization on Linux (that is, the use of containers), the mindset changed. We no longer want a full copy of an OS for each workload, but instead, only properly isolated processes to do the desired work. To achieve this, and focusing specifically on Linux containers, a set of kernel features responsible for isolating hardware resources...

From logs to metrics

In a perfect world, all applications and services would have been properly instrumented and we would only be required to collect metrics to gain visibility. External exporters are a stop-gap approach that simplifies our work, but not every service exposes its internal state through a neat API. Older daemon software, such as Postfix or ntpd, makes use of logging to relay its inner workings. For these cases, we're left with two options: either instrument the service ourselves (which isn't possible for closed source software) or rely on logs to gather the metrics we require. The next topics go over the available options for extracting metrics from logs.
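As a rough illustration of the second option, the following Python sketch turns log lines into counters. It is not tied to any of the tools covered in this chapter. The log format (a status=... field somewhere in each line), the metric name, and the function names are hypothetical and exist only for this example.

```python
import re
from collections import Counter

# Hypothetical log format: any line that carries a "status=<word>" field.
STATUS_RE = re.compile(r"\bstatus=(?P<status>\w+)")


def count_statuses(lines):
    """Count how many log lines were seen for each status value."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group("status")] += 1
    return counts


def render_counter(name, counts):
    """Render the counts as one labelled counter in the text exposition format."""
    lines = [f"# TYPE {name} counter"]
    for status, value in sorted(counts.items()):
        lines.append(f'{name}{{status="{status}"}} {value}')
    return "\n".join(lines) + "\n"
```

A real log-to-metrics tool would additionally follow the file as it grows and survive log rotation. This sketch deliberately leaves that out to keep the extraction step visible.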

mtail

Developed by Google, mtail is a very...

Blackbox monitoring

Introspection is invaluable to gather data about a system, but sometimes we're required to measure from the point of view of a user of that system. In such cases, probing is a good option to simulate user interaction. As probing is made from the outside and without knowledge regarding the inner workings of the system, this is classified as blackbox monitoring, as discussed in Chapter 1, Monitoring Fundamentals.
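A probe can be as simple as trying to open a connection and recording whether it worked and how long it took. The Python sketch below shows that idea for a plain TCP check. It assumes nothing about the target beyond a host and port, and the function name and result keys are made up for this illustration rather than taken from any exporter.

```python
import socket
import time


def probe_tcp(host, port, timeout=2.0):
    """Try to open a TCP connection and report success and elapsed time."""
    start = time.monotonic()
    try:
        # The connection is closed again as soon as it has been established.
        with socket.create_connection((host, port), timeout=timeout):
            success = True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        success = False
    return {"success": success, "duration_seconds": time.monotonic() - start}
```

The probe knows nothing about what runs behind the port, which is precisely what makes it blackbox: a failed probe says that a user would have been affected, not why.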

Blackbox exporter

blackbox_exporter is one of the most peculiar of all the currently available exporters in the Prometheus ecosystem. Its usage pattern is ingenious and, usually, newcomers are puzzled by it. We're going to dive into this exporter with the hope of making its use as straightforward...

Pushing metrics

Despite the intense debate regarding push versus pull and the deliberate decision of using pull in the Prometheus server design, there are some legitimate situations where push is more appropriate.

One of those situations is batch jobs. For this statement to truly make sense, though, we need to clearly define what is considered a batch job. In this scope, a service-level batch job is a processing workload not tied to a particular instance, executed infrequently or on a schedule, and therefore not always running. If instrumented, this kind of job is very hard to scrape successfully, which, as discussed previously in Chapter 5, Running a Prometheus Server, results in metric staleness, even if the job runs for long enough to be scraped occasionally.
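To give a feel for what pushing looks like from a batch job's point of view, here is a minimal Python sketch that sends metrics to a Pushgateway over HTTP, using only the standard library. It assumes a Pushgateway reachable at a URL of our choosing and a job name without slashes or other characters that need escaping. The function names, the job name, and the metric shown in the usage note are hypothetical.

```python
import urllib.request


def build_push_request(gateway_url, job, metrics_text):
    """Build a PUT request that replaces all metrics pushed for this job."""
    # Assumption: `job` is a simple name that is safe to place in a URL path.
    url = f"{gateway_url.rstrip('/')}/metrics/job/{job}"
    return urllib.request.Request(
        url,
        data=metrics_text.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


def push(gateway_url, job, metrics_text, timeout=5.0):
    """Send the metrics and return the HTTP status code of the response."""
    request = build_push_request(gateway_url, job, metrics_text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A job would typically call something like push("http://localhost:9091", "nightly_backup", "demo_last_success_timestamp_seconds 1557000000\n") as its last step, so that the value stays available for scraping after the process has exited.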

There are alternatives to relying on pushing metrics; for example, by using the textfile collector from node_exporter...
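For the textfile collector route, the job only has to leave a file in the directory that node_exporter is configured to read (set with --collector.textfile.directory), using a .prom extension. The Python sketch below shows one way a job might do that. The function name, filename, and metric in the test-like usage are hypothetical, and the write-then-rename approach is an assumption about good practice on our part, used here so that a scrape never sees a half-written file.

```python
import os
import tempfile


def write_prom_file(directory, filename, metrics_text):
    """Atomically write metrics_text to directory/filename and return the path."""
    final_path = os.path.join(directory, filename)
    # Write to a temporary file in the same directory so the rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(metrics_text)
        # mkstemp creates the file as 0600; make it readable by the exporter.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

Because the metrics end up attached to the host's own exporter, this approach fits jobs that are specific to one machine rather than service-level batch jobs.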

More exporters

The Prometheus community has produced a great number of exporters for just about anything you might need. However, making an intentional choice to deploy a new piece of software in your infrastructure has an indirect price to pay upfront. That price translates into the deployment automation code to be written, the packaging, the metrics to be collected and alerting to be created, the logging configuration, the security concerns, the upgrades, and other things we sometimes take for granted. When choosing an open source exporter, or any other open source project for that matter, there are a few indicators to keep in mind.

We should validate the community behind the project, the general health of contributions, whether issues are being addressed, whether pull requests are being managed in a timely manner, and whether the maintainers are open to discussing and interacting with the community. Technically...

Summary

In this chapter, we had the opportunity to discover some of the most used Prometheus exporters available. Using test environments, we were able to interact with operating-system-level exporters running on VMs and container-specific exporters running on Kubernetes. We found that sometimes we need to rely on logs to obtain metrics and went through the current best options to achieve this. Then, we explored blackbox probing with the help of blackbox_exporter and validated its unique workflow. We also experimented with pushing metrics instead of using the standard pull approach from Prometheus, while making clear why sometimes this method does indeed make sense.

All these exporters enable you to gain visibility without having to natively instrument code, which sometimes is much more costly than relying on community-driven exporters.

With so many sources of metrics, now is...

Questions

  1. How would you collect custom metrics with the Node Exporter?
  2. What resources does cAdvisor consult to generate metrics?
  3. kube-state-metrics exposes numerous API objects. Is there a way to restrict that number?
  4. How could you debug a blackbox_exporter probe?
  5. If an application does not expose metrics, in Prometheus format or otherwise, what would be an option for monitoring it?
  6. What are the downsides of using Pushgateway?
  7. If a particular batch job is host specific, is there any alternative to the use of Pushgateway?