You're reading from  Hands-On Infrastructure Monitoring with Prometheus

Product type: Book
Published in: May 2019
Publisher: Packt
ISBN-13: 9781789612349
Edition: 1st Edition
Authors (2):
Joel Bastos

Joel Bastos is an open source supporter and contributor, with a background in infrastructure security and automation. He is always striving for the standardization of processes, code maintainability, and code reusability. He has defined, led, and implemented critical, highly available, and fault-tolerant enterprise and web-scale infrastructures in several organizations, with Prometheus as the cornerstone. He has worked at two unicorn companies in Portugal and at one of the largest transaction-oriented gaming companies in the world. Previously, he supported several governmental entities with projects such as the Public Key Infrastructure for the Portuguese citizen card. You can find his blog at kintoandar and find him on Twitter with the handle @kintoandar.

Pedro Araújo

Pedro Araújo is a site reliability and automation engineer and has defined and implemented several standards for monitoring at scale. His contributions have been fundamental in connecting development teams to infrastructure. He is highly knowledgeable about infrastructure, but his passion is in the automation and management of large-scale, highly transactional systems. Pedro has contributed to several open source projects, such as Riemann, OpenTSDB, Sensu, Prometheus, and Thanos. You can find him on Twitter with the handle @phcrva.

Monitoring Fundamentals

This chapter lays the foundation for several key concepts that will be used throughout this book. Starting with the definition of monitoring, we will explore the different viewpoints and factors that explain why systematic analysis carries different levels of importance and impact across organizations. You will learn about the advantages and disadvantages of different monitoring mechanics, taking a closer look at the Prometheus approach to collecting metrics. Finally, we will discuss some of the controversial decisions that were vital for the design and architecture of the Prometheus stack and why you should take them into account when designing your own monitoring system.

We will be covering the following topics in this chapter:

  • Definition of monitoring
  • Whitebox versus blackbox monitoring
  • Understanding metrics collection
...

Definition of monitoring

A consensual definition of monitoring is hard to come by because it quickly shifts between industry- or even job-specific contexts. The diversity of viewpoints, the components comprising the monitoring system, and even how the data is collected or used are all factors that contribute to the struggle of reaching a clear definition.

Without common ground, it is difficult to sustain a discussion, and expectations usually end up mismatched. Therefore, in the following topics, we will establish a baseline and work toward a definition of monitoring that will guide us throughout this book.

The value of monitoring

With the growing complexity of infrastructures, exponentially driven by the adoption of...

Whitebox versus blackbox monitoring

There are many ways we could go about monitoring, but they largely fall into two main categories: blackbox and whitebox monitoring.

In blackbox monitoring, the application or host is observed from the outside and, consequently, this approach can be fairly limited. Checks are made to assess whether the system under observation responds to probes in a known way:

  • Does the host respond to Internet Control Message Protocol (ICMP) echo requests (more commonly known as ping)?
  • Is a given TCP port open?
  • Does the application respond with the correct data and status code when it receives a specific HTTP request?
  • Is the process for a specific application running on its host?
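
Two of the probes listed above, the TCP port check and the HTTP status check, can be sketched in a few lines of Python. This is only an illustration using the standard library, not how Prometheus or any of its exporters implements probing; the function names `tcp_port_open` and `http_status_ok`, the default timeouts, and the expected status code are assumptions made for this example.

```python
import socket
import urllib.error
import urllib.request


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Blackbox probe: can a TCP connection to host:port be established?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def http_status_ok(url: str, expected_status: int = 200, timeout: float = 2.0) -> bool:
    """Blackbox probe: does an HTTP GET on url return the expected status code?"""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == expected_status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status is still observable.
        return err.code == expected_status
    except OSError:
        # URLError is a subclass of OSError: the probe could not get a response.
        return False
```

Both functions only see what an external client would see, which is exactly the limitation of blackbox monitoring: a passing probe says the service answers, not why it might be slow or what it is doing internally.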

On the other hand, in whitebox monitoring, the system under observation surfaces data about its internal state and the performance of critical sections. This type of introspection...

Understanding metrics collection

The process by which metrics are collected by monitoring systems can generally be divided into two approaches: push and pull. As we'll see in the following topics, both approaches are valid and have their pros and cons, which we will thoroughly discuss. Nonetheless, it is essential to have a solid grasp of how they differ to understand and fully utilize Prometheus. After understanding how collecting metrics works, we will delve into what should be collected. There are several proven methods to achieve this, and we will give an overview of each one.
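
To make the pull side of this distinction concrete, here is a minimal sketch, assuming a toy application that exposes a single counter over HTTP and a collector that fetches it on demand. It uses only the Python standard library; the metric name `app_requests_total`, the `COUNTERS` dictionary, and the helpers `start_target` and `scrape` are hypothetical names chosen for this example and are not part of any Prometheus client library. In a push model the direction would be inverted: the application would open the connection and send its values to the collector.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process metric store for the monitored application.
COUNTERS = {"app_requests_total": 0}


def render_metrics() -> str:
    """Render the counters as plain text, one 'name value' sample per line."""
    lines = []
    for name, value in sorted(COUNTERS.items()):
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the current metric values on /metrics; the target never initiates."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_target(port: int = 0) -> HTTPServer:
    """Start the monitored target in a background thread (port 0 = any free port)."""
    server = HTTPServer(("127.0.0.1", port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def scrape(url: str, timeout: float = 2.0) -> dict:
    """Pull side: the collector connects to the target and parses its samples."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    samples = {}
    for line in text.splitlines():
        if line and not line.startswith("#"):
            name, value = line.rsplit(" ", 1)
            samples[name] = float(value)
    return samples
```

A collector would call `scrape("http://127.0.0.1:<port>/metrics")` on a schedule of its own choosing, so a failed scrape is itself a signal that the target is unreachable.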

An overview of the two collection approaches

In push-based monitoring systems, emitted metrics or events are sent either directly from the producing...

Summary

In this chapter, we examined the value of monitoring and how the term should be understood in a specific context, including the one used in this book. This should help you avoid misunderstandings and make clear where the book stands on the topic. We also went through different aspects of monitoring, such as metrics, logging, tracing, alerting, and visualizations, and introduced observability and the benefits it brings. We addressed whitebox and blackbox monitoring, which provide the basis for understanding the benefits of using metrics. Armed with this knowledge about metrics, we went through the mechanics of push and pull and the arguments for each, before ending with which metrics to track on the systems you manage.

In the next chapter, we will look at an overview of the Prometheus ecosystem, and...

Questions

  1. Why is monitoring so hard to define clearly?
  2. Does a high latency of metrics impact the work of a system administrator who's focused on fixing a live incident?
  3. What are the monitoring requirements to properly do capacity planning?
  4. Is logging considered monitoring?
  5. Regarding the available strategies for metrics collection, what are the downsides of using the push-based approach?
  6. If you had to choose three basic metrics from a generic web service to focus on, which would they be?
  7. When a check verifies whether a given process is running on a host by way of listing the running processes in said host, is that whitebox or blackbox monitoring?
