You're reading from Mastering Prometheus

Product type: Book
Published in: Apr 2024
Publisher: Packt
ISBN-13: 9781805125662
Edition: 1st Edition
Author: William Hegedus

William Hegedus has worked in tech for over a decade in a variety of roles, culminating in site reliability engineering. He developed a keen interest in Prometheus and observability technologies during his time managing a 24/7 NOC environment and eventually became the first SRE at Linode, one of the foremost independent cloud providers. Linode was acquired by Akamai Technologies in 2022, and now Will manages a team of SREs focused on building the internal observability platform for Akamai's Connected Cloud. His team is responsible for a global fleet of Prometheus servers spanning over two dozen data centers and ingesting millions of data points every second, in addition to operating a suite of other observability tools. Will is an open source advocate and contributor who has contributed code to Prometheus, Thanos, and many other CNCF projects related to Kubernetes and observability. He lives in central Virginia with his wonderful wife, four kids, three cats, two dogs, and a bearded dragon.

What this book covers

Chapter 1, Observability, Monitoring, and Prometheus, gives a brief overview of the history of modern monitoring systems, establishes common observability terminology and concepts, and looks at Prometheus’ role within observability.

Chapter 2, Deploying Prometheus, goes through the process of deploying Prometheus to Kubernetes and provides the lab environment that we will use throughout the rest of the book.

Chapter 3, The Prometheus Data Model and PromQL, dives deep into the technical specifics of how Prometheus works, especially its time series database (TSDB), and gives an overview of the Prometheus Query Language (PromQL).

Chapter 4, Using Service Discovery, goes into the details of how to use dynamic service discovery in Prometheus, including how to build your own service discovery providers.

Chapter 5, Effective Alerting with Prometheus, focuses on making Prometheus alerting reliable and testable, along with making the most of the Alertmanager.

Chapter 6, Advancing Prometheus: Sharding, Federation, and HA, is where we begin to look at scaling Prometheus past a single Prometheus server and into reliable, distributed deployments.

Chapter 7, Optimizing and Debugging Prometheus, explores how to leverage Go tools to debug the Prometheus application and how to tune Prometheus for optimum performance.

Chapter 8, Enabling Systems Monitoring with the Node Exporter, looks in depth at the most commonly deployed Prometheus exporter to understand all that it can do.

Chapter 9, Utilizing Remote Storage Systems with Prometheus, examines Grafana Mimir and VictoriaMetrics as two options that Prometheus can send data to for long-term storage, global query view, multi-tenancy support, and more.

Chapter 10, Extending Prometheus Globally with Thanos, comprehensively explores all components of the Thanos project to see how they can be used to extend the functionality of Prometheus, enable high availability, and provide for nearly unlimited retention of metrics.

Chapter 11, Jsonnet and Monitoring Mixins, introduces the Jsonnet programming language as a tool to simplify the management of Prometheus rules at scale. Additionally, we see how the Monitoring Mixins project from various Prometheus maintainers and contributors makes use of Jsonnet to provide configurable, reusable alerts and dashboards for various systems and software.

Chapter 12, Utilizing Continuous Integration (CI) Pipelines with Prometheus, takes a practical look at how you can manage your Prometheus configuration and alerts in Git and run a variety of automated tests on them to ensure they are valid and conform to expectations.

Chapter 13, Defining and Alerting on SLOs, explores how Prometheus can be used to define, measure, and alert on Service Level Objectives (SLOs), including through the use of open source tools such as Pyrra and Sloth that make it easy to implement best-practice SLO alerting.

Chapter 14, Integrating Prometheus with OpenTelemetry, takes a look at the OpenTelemetry project – its history, its future, and how Prometheus integrates with it.

Chapter 15, Beyond Prometheus, brings us full circle back to our initial discussion about observability and provides ideas on where to go next in building out your observability suite.
