You're reading from Mastering Prometheus

Product type: Book

Published in: Apr 2024

Publisher: Packt

ISBN-13: 9781805125662

Edition: 1st Edition

Concepts: DevOps

Author: William Hegedus

Preface

Since the Prometheus project was first announced to the world in January 2015, it has rapidly become the de facto modern monitoring solution. Open source projects such as Kubernetes expose Prometheus metrics by default, cloud providers sell “managed” Prometheus services, and it even has its own yearly conference. However, in my personal journey to learn and understand Prometheus more deeply, I came to a saddening realization. There is a plethora of blog posts, books, and tutorials focused on the basics of Prometheus, but few, if any, readily available resources that cover running Prometheus at scale.

For that information, I found myself needing to turn to conference talks or trying to extrapolate how others do it by reading through GitHub issues on the Prometheus repository, questions on the Prometheus mailing list, and conversations in the official Prometheus Slack channel. Those learnings – coupled with years of personal experience – have gone into this book as an endeavor to begin filling that void.

Perhaps one of the limiting factors of existing content is that it tends to focus purely on Prometheus itself, omitting the larger ecosystem surrounding it. However, to run Prometheus at scale, it quickly becomes necessary to build on top of Prometheus to extend it. Instead of Prometheus being the destination, it provides a foundation.

We will still cover Prometheus itself in depth. We will see how to get the most out of it, implement best practices, and – perhaps most critically – develop a deeper understanding of how Prometheus’s internals work.

In addition to Prometheus itself, though, we’ll also look at how to operate and scale Prometheus. We’ll learn how to debug Prometheus through Go’s developer tools, how to manage hundreds (or thousands) of Prometheus rules without losing your mind, how to connect Prometheus to remote storage solutions, how to run dozens of highly available Prometheus instances while maintaining a global query view, and much, much more.

Who this book is for

This book is primarily intended for readers with a preexisting, basic knowledge of Prometheus. You’re probably already running Prometheus, know what the Node Exporter and Alertmanager are, and can write PromQL queries without too much help.

If this is your first introduction to Prometheus, that’s OK, too! The content should still be generally accessible, but you may – at times – wish to seek out additional resources to explain some of the basics that are glossed over.

Regardless of your experience level, this book is targeted at operators of Prometheus. If you’re a developer who uses a Prometheus environment operated by someone else, there are still chapters that will be of interest to you, but there are others that are unlikely to be applicable to your responsibilities.

What this book covers

Chapter 1, Observability, Monitoring, and Prometheus, gives a brief overview of the history of modern monitoring systems, establishes common observability terminology and concepts, and looks at Prometheus’s role within observability.

Chapter 2, Deploying Prometheus, goes through the process of deploying Prometheus to Kubernetes and provides the lab environment that we will use throughout the rest of the book.

Chapter 3, The Prometheus Data Model and PromQL, dives deep into the technical specifics of how Prometheus – and especially its Time Series DataBase (TSDB) – works, along with an overview of how the Prometheus Query Language (PromQL) works.

Chapter 4, Using Service Discovery, goes into the details of how to use dynamic service discovery in Prometheus, including how to build your own service discovery providers.

Chapter 5, Effective Alerting with Prometheus, focuses on making Prometheus alerting reliable and testable, along with making the most of the Alertmanager.

Chapter 6, Advancing Prometheus: Sharding, Federation, and HA, is where we begin to look at scaling Prometheus past a single Prometheus server and into reliable, distributed deployments.

Chapter 7, Optimizing and Debugging Prometheus, explores how to leverage Go tools to debug the Prometheus application and how to tune Prometheus for optimum performance.

Chapter 8, Enabling Systems Monitoring with the Node Exporter, looks in depth at the most commonly deployed Prometheus exporter to understand all that it can do.

Chapter 9, Utilizing Remote Storage Systems with Prometheus, examines Grafana Mimir and VictoriaMetrics as two options that Prometheus can send data to for long-term storage, global query view, multi-tenancy support, and more.

Chapter 10, Extending Prometheus Globally with Thanos, comprehensively explores all components of the Thanos project to see how they can be used to extend the functionality of Prometheus, enable high availability, and provide for nearly unlimited retention of metrics.

Chapter 11, Jsonnet and Monitoring Mixins, introduces the Jsonnet programming language as a tool to simplify the management of Prometheus rules at scale. Additionally, we see how the Monitoring Mixins project from various Prometheus maintainers and contributors makes use of Jsonnet to provide configurable, reusable alerts and dashboards for various systems and software.

Chapter 12, Utilizing Continuous Integration (CI) Pipelines with Prometheus, takes a practical look at how you can manage your Prometheus configuration and alerts in Git and run a variety of automated tests on them to ensure they are valid and conform to expectations.

Chapter 13, Defining and Alerting on SLOs, explores how Prometheus can be used to define, measure, and alert on Service Level Objectives (SLOs), including through the use of open source tools such as Pyrra and Sloth that make it easy to implement best-practice SLO alerting.

Chapter 14, Integrating Prometheus with OpenTelemetry, takes a look at the OpenTelemetry project – its history, its future, and how Prometheus integrates with it.

Chapter 15, Beyond Prometheus, brings us full circle back to our initial discussion about observability and provides ideas on where to go next in building out your observability suite.

To get the most out of this book

You will need to have some level of hands-on experience with Prometheus, a basic understanding of server administration, and – ideally – some prior experience with Kubernetes. The lab environment used throughout this book is based on Kubernetes, but all commands used to interact with Kubernetes are explicitly stated. Therefore, prior Kubernetes knowledge is not required, but it will certainly aid in deeper understanding and enable further experimentation outside of what is covered in the book.

Software/hardware covered in the book	Operating system requirements
Prometheus	Windows, macOS, or Linux
Thanos	Windows, macOS, or Linux
OpenTelemetry	Windows, macOS, or Linux
Kubernetes	Windows, macOS, or Linux

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Mastering-Prometheus. If there’s an update to the code, it will be updated in the GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “Mount the downloaded WebStorm-10*.dmg disk image file as another disk in your system.”

A block of code is set as follows:

>>> hashB = int(md5(SEPARATOR.join(targetB).encode("utf-8")).hexdigest(), 16)
>>> hashB
139861250730998106692854767707986305935
>>> print(f"{targetA} % {MOD} = ", hashA % MOD)
['app=nginx', 'instance=node2'] % 2 =  0
>>> print(f"{targetB} % {MOD} = ", hashB % MOD)
['app=nginx', 'instance=node23'] % 2 =  1

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

alerting:
  alert_relabel_configs:
    - regex: prometheus_replica
      action: labeldrop

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: “Select System info from the Administration panel.”

Tips or important notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at customercare@packtpub.com and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Download a free PDF copy of this book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere?

Is your eBook purchase not compatible with the device of your choice?

Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.

Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.

The perks don’t stop there: you can also get exclusive access to discounts, newsletters, and great free content in your inbox daily.

Follow these simple steps to get the benefits:

Scan the QR code or visit the link below:

https://packt.link/free-ebook/978-1-80512-566-2

Submit your proof of purchase.

That’s it! We’ll send your free PDF and other benefits to your email directly.


Author (1)

William Hegedus

William Hegedus has worked in tech for over a decade in a variety of roles, culminating in site reliability engineering. He developed a keen interest in Prometheus and observability technologies during his time managing a 24/7 NOC environment and eventually became the first SRE at Linode, one of the foremost independent cloud providers. Linode was acquired by Akamai Technologies in 2022, and now Will manages a team of SREs focused on building the internal observability platform for Akamai's Connected Cloud. His team is responsible for a global fleet of Prometheus servers spanning over two dozen data centers and ingesting millions of data points every second, in addition to operating a suite of other observability tools. Will is an open source advocate and contributor who has contributed code to Prometheus, Thanos, and many other CNCF projects related to Kubernetes and observability. He lives in central Virginia with his wonderful wife, four kids, three cats, two dogs, and a bearded dragon.
