You're reading from Hands-On Infrastructure Monitoring with Prometheus

Product type: Book
Published in: May 2019
Publisher: Packt
ISBN-13: 9781789612349
Edition: 1st Edition
Authors (2):
Joel Bastos

Joel Bastos is an open source supporter and contributor, with a background in infrastructure security and automation. He is always striving for the standardization of processes, code maintainability, and code reusability. He has defined, led, and implemented critical, highly available, and fault-tolerant enterprise and web-scale infrastructures in several organizations, with Prometheus as the cornerstone. He has worked at two unicorn companies in Portugal and at one of the largest transaction-oriented gaming companies in the world. Previously, he has supported several governmental entities with projects such as the Public Key Infrastructure for the Portuguese citizen card. You can find his blogs at kintoandar and on Twitter with the handle @kintoandar.

Pedro Araújo

Pedro Araújo is a site reliability and automation engineer and has defined and implemented several standards for monitoring at scale. His contributions have been fundamental in connecting development teams to infrastructure. He is highly knowledgeable about infrastructure, but his passion is in the automation and management of large-scale, highly-transactional systems. Pedro has contributed to several open source projects, such as Riemann, OpenTSDB, Sensu, Prometheus, and Thanos. You can find him on Twitter with the handle @phcrva.

An Overview of the Prometheus Ecosystem

With such a vast collection of components available for use, it can be daunting to choose the ones that are required to solve a given monitoring gap. In this chapter, we will go over the Prometheus ecosystem, look at which component performs which job, and see how everything works together logically.

Striving for simplicity and having a clear understanding of all the moving parts of a Prometheus stack is invaluable for keeping things manageable and reliable.

In brief, the following topics will be covered in this chapter:

  • Metrics collection with Prometheus
  • Exposing internal state with exporters
  • Alert routing and management with Alertmanager
  • Visualizing your data

Metrics collection with Prometheus

Prometheus is a time series-based, open source monitoring system. It collects data by sending HTTP requests to hosts and services on metrics endpoints, which it then makes available for analysis and alerting using a powerful query language.
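To make the pull model a little more concrete, here is a minimal sketch of what a collector could do with the body returned by a metrics endpoint. It is not how Prometheus itself is implemented: the `Sample` type, the `parse_exposition` function, and the `SCRAPED_BODY` payload are all hypothetical, and the parser assumes a simplified subset of the text format (no timestamps, and no commas, spaces, or escaped quotes inside label values).

```python
from typing import Dict, List, NamedTuple


class Sample(NamedTuple):
    """One data point read from a metrics endpoint (hypothetical type)."""
    name: str
    labels: Dict[str, str]
    value: float


def parse_exposition(text: str) -> List[Sample]:
    """Parse a simplified subset of the Prometheus text exposition format."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: the value is the last space-separated field (no timestamp).
        series, raw_value = line.rsplit(" ", 1)
        name, labels = series, {}
        if "{" in series:
            name, raw_labels = series.rstrip("}").split("{", 1)
            for pair in raw_labels.split(","):
                if pair:  # tolerate a trailing comma inside the braces
                    key, val = pair.split("=", 1)
                    labels[key.strip()] = val.strip().strip('"')
        samples.append(Sample(name, labels, float(raw_value)))
    return samples


# Made-up example of a response body served by a metrics endpoint.
SCRAPED_BODY = """\
# HELP http_requests_total Total HTTP requests handled.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="500"} 3
process_open_fds 12
"""

samples = parse_exposition(SCRAPED_BODY)
for sample in samples:
    print(sample.name, sample.labels, sample.value)
```

Each parsed sample corresponds to one time series at the moment of the scrape; repeating the request at a regular interval is what turns these values into time series.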

Even though Prometheus has graduated within the Cloud Native Computing Foundation (CNCF) by demonstrating stability, maturity, and solid governance, it is still evolving at a very rapid pace. At the time of writing, the current stable version of Prometheus is 2.9.2, and every component or feature discussed in this book is based on this version. While there should be no major architectural changes within version 2, take care when applying specific configuration learned from this book to earlier or later versions.

...

Exposing internal state with exporters

Not all applications are built with Prometheus-compatible instrumentation. Sometimes, no metrics are exposed at all. In these cases, we can rely on exporters. The following diagram shows how they work:

Figure 2.2: A high-level overview of an exporter

An exporter is nothing more than a piece of software that collects data from a service or application and exposes it via HTTP in the Prometheus format. Each exporter usually targets a specific service or application and, as such, exporters are typically deployed in a one-to-one relationship with the service they cover.

Nowadays, you can find exporters for pretty much any service you need, and if a particular third-party service doesn't have an exporter available, it's quite simple to build your own.
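As a rough illustration of how small a custom exporter can be, the following sketch uses only the Python standard library. Everything in it is an assumption made for the example: `fetch_service_stats` stands in for whatever native status API the monitored service offers, the `myservice` prefix and metric names are invented, and every value is treated as a gauge. A production exporter would more likely be built on an official client library.

```python
from http.server import BaseHTTPRequestHandler
from typing import Dict, Mapping


def fetch_service_stats() -> Dict[str, float]:
    """Hypothetical stand-in for querying the service's own status API."""
    return {"connections": 42, "queue_depth": 7}


def render_metrics(stats: Mapping[str, float], prefix: str = "myservice") -> str:
    """Translate raw service stats into the Prometheus text format."""
    lines = []
    for key in sorted(stats):
        metric = f"{prefix}_{key}"
        lines.append(f"# HELP {metric} Value of {key} reported by the service.")
        lines.append(f"# TYPE {metric} gauge")  # assumption: all values are gauges
        lines.append(f"{metric} {stats[key]}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves fresh metrics on every GET to /metrics."""

    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        # Data is collected at request time, so each scrape sees current state.
        body = render_metrics(fetch_service_stats()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


body = render_metrics(fetch_service_stats())
print(body)
```

To try it out, the handler could be served with `http.server.HTTPServer(("", 8000), MetricsHandler).serve_forever()`, where port 8000 is an arbitrary choice for this example.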

Exporter fundamentals

...

Alert routing and management with Alertmanager

Alertmanager is the component of the Prometheus ecosystem that's responsible for sending the notifications triggered by alerts generated on the Prometheus server. As such, its availability is essential, and its design choices reflect this need. It's the only component that's truly conceived to work in a highly available cluster setup, and it uses gossip as the communication protocol:

Figure 2.3: A high-level overview of Alertmanager

At a very high level, Alertmanager is a service that receives alerts from Prometheus servers as HTTP POST requests to its API, deduplicates them, and then acts on them by following a predefined set of routes.
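The deduplicate-then-route idea can be sketched in a few lines. This is a toy model under stated assumptions, not Alertmanager's actual implementation: an alert is reduced to its label set, identical label sets are considered the same alert, and the `ROUTES` table, receiver names, and helper functions are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Alert = Dict[str, str]  # simplification: an alert is just its label set

# Hypothetical routing table: (labels that must all match, receiver name).
# The first matching entry wins; unmatched alerts go to the default receiver.
ROUTES: List[Tuple[Dict[str, str], str]] = [
    ({"severity": "critical"}, "pager"),
    ({"team": "db"}, "db-email"),
]
DEFAULT_RECEIVER = "default-email"


def fingerprint(alert: Alert) -> FrozenSet[Tuple[str, str]]:
    """Identify an alert by its complete label set."""
    return frozenset(alert.items())


def deduplicate(alerts: List[Alert]) -> List[Alert]:
    """Keep the first occurrence of each distinct label set."""
    seen: Dict[FrozenSet[Tuple[str, str]], Alert] = {}
    for alert in alerts:
        seen.setdefault(fingerprint(alert), alert)
    return list(seen.values())


def route(alert: Alert) -> str:
    """Return the receiver of the first route whose labels all match."""
    for matchers, receiver in ROUTES:
        if all(alert.get(key) == value for key, value in matchers.items()):
            return receiver
    return DEFAULT_RECEIVER


# Two Prometheus servers watching the same target report the same alert.
incoming = [
    {"alertname": "InstanceDown", "severity": "critical", "instance": "db1"},
    {"alertname": "InstanceDown", "severity": "critical", "instance": "db1"},
    {"alertname": "DiskFilling", "severity": "warning", "instance": "web1"},
]

unique = deduplicate(incoming)
decisions = [(alert["alertname"], route(alert)) for alert in unique]
print(decisions)
```

In this model, the duplicate `InstanceDown` alert produces a single routing decision, which mirrors why redundant Prometheus servers do not result in redundant notifications.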

Alertmanager also exposes a web interface to allow, for instance, the visualization and silencing of firing alerts or applying inhibition rules for them.

One of the core design...

Visualizing your data

Data visualization is one of the simplest ways to produce or consume information. Prometheus exposes a well-defined API, where PromQL queries can produce raw data for visualizations.
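As a small sketch of how a visualization tool might consume this API, the example below builds an instant query URL for the `/api/v1/query` endpoint and extracts the values from a response. The base URL (`http://localhost:9090`), the helper names, and the sample response are assumptions made for illustration; no request is actually sent, and only the `vector` result type is handled.

```python
import json
from typing import Any, Dict, List, Tuple
from urllib.parse import urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Build the URL for an instant query against the Prometheus HTTP API."""
    params = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def extract_points(payload: Dict[str, Any]) -> List[Tuple[Dict[str, str], float]]:
    """Turn an instant-vector response into (labels, value) pairs."""
    if payload.get("status") != "success":
        raise ValueError("query did not succeed")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("this sketch only handles instant vectors")
    # Each result carries its labels and a [timestamp, "value"] pair.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


# Assumed address of a local Prometheus server.
url = build_query_url("http://localhost:9090", 'up{job="prometheus"}')

# Made-up response body in the shape returned for an instant query.
SAMPLE_RESPONSE = json.loads("""
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {"__name__": "up", "job": "prometheus", "instance": "localhost:9090"},
        "value": [1557000000.0, "1"]
      }
    ]
  }
}
""")

points = extract_points(SAMPLE_RESPONSE)
print(url)
print(points)
```

Sample values arrive as strings in the JSON response, which is why the sketch converts them to floats before handing them to whatever draws the chart.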

Currently, the best external software for visualization is Grafana, which we will explain thoroughly in Chapter 10, Discovering and Creating Grafana Dashboards. The Grafana team has made its integration with Prometheus seamless, and the result is a delightful user experience.

The Prometheus server also ships with two internal visualization components:

  • Expression browser: Here, you can run PromQL directly to quickly query data and visualize it instantly:
Figure 2.4: The Prometheus expression browser interface
  • Consoles: These are web pages that are built using the Go templating language and are served by the Prometheus server itself. This approach allows you to have...

Summary

To better understand the Prometheus philosophy, it is essential to have some insight, even if only at a high level, into the main components of the Prometheus ecosystem, from data collection via exporters to reliable alerting using Alertmanager, as well as the available visualization options. This is what we covered in this chapter.

In the next chapter, we'll start building a test environment, so that all the concepts we've discussed so far can start to materialize.

Questions

  1. What are the main components of the Prometheus ecosystem?
  2. Which components are essential and which are optional for a Prometheus deployment?
  3. Why are out-of-process exporters needed?
  4. When an HTTP GET request hits the metrics endpoint of an exporter, what ensues?
  5. What happens to a triggering alert in an Alertmanager cluster if a network partition occurs?
  6. You realize you need to integrate Alertmanager with a custom-made API. What would be your quickest option?
  7. What visualization options are included in a standard Prometheus server installation?
