You're reading from Mastering Prometheus

Product type: Book
Published in: Apr 2024
Publisher: Packt
ISBN-13: 9781805125662
Edition: 1st Edition
Author: William Hegedus

William Hegedus has worked in tech for over a decade in a variety of roles, culminating in site reliability engineering. He developed a keen interest in Prometheus and observability technologies during his time managing a 24/7 NOC environment and eventually became the first SRE at Linode, one of the foremost independent cloud providers. Linode was acquired by Akamai Technologies in 2022, and now Will manages a team of SREs focused on building the internal observability platform for Akamai's Connected Cloud. His team is responsible for a global fleet of Prometheus servers spanning over two dozen data centers and ingesting millions of data points every second, in addition to operating a suite of other observability tools. Will is an open source advocate and contributor who has contributed code to Prometheus, Thanos, and many other CNCF projects related to Kubernetes and observability. He lives in central Virginia with his wonderful wife, four kids, three cats, two dogs, and a bearded dragon.

Beyond Prometheus

If you’ve made it this far, congratulations! By the power vested in me, I hereby declare you a master of Prometheus. Go forth into the world and observe all the things.

Oh. Are you still here? Did you come back when you remembered that Prometheus by itself doesn’t make a system fully observable? Either way, I want to make the most of the rest of our time together to discuss the question, “What next?”

Hopefully, by now, you’ve come to see how Prometheus (and, by extension, any metrics system) only accounts for a subset of observability. If you accept the idea that the three core observability signals are metrics, logs, and traces, then using Prometheus alone only gets you one-third of the way toward a fully observable system.

Using Prometheus as our starting point, we need to see what other pieces we can add to make our systems more observable. This chapter will take a fresh look at the observability concepts we discussed...

Technical requirements

The code used in this chapter is available at https://github.com/PacktPublishing/Mastering-Prometheus.

Extending observability past Prometheus

Setting aside the obvious nuance that “no system is truly and fully observable,” let’s focus on how we can get as close as possible to a fully observable system. Observability is about being able to account for “unknown unknowns.” In other words, you shouldn’t need to know in advance the various ways that a system can break in order to monitor and observe it effectively.

Knowing what you don’t know

The concept of “unknown unknowns” was popularized by former United States Secretary of Defense Donald Rumsfeld. It has to do with the idea of not knowing what you don’t know, or your ignorance of the extent of your ignorance. In contrast, a “known unknown” would be something that you are aware of but do not know. When relating this concept to Prometheus, an “unknown unknown” would be a metric that doesn’t exist, but a known...

Connecting the dots across observability systems

Prometheus is quickly becoming many people’s first introduction to the world of observability technologies. Its popularity and seeming ubiquity have consequently led it to influence many other observability projects that seek to replicate its ease of use and simple data model. This has the added benefit of making it easier to jump between observability services without having to change the filters we use to select data.

The most meaningful way to connect the dots across our various observability systems is to ensure consistency in the metadata that is attached to the telemetry coming from the system being observed. In the context of Prometheus, that means that our label keys and values for ServiceX should be the same in Prometheus as they are in the rest of the systems we use. For example, if your Prometheus metrics for ServiceX have a label app=servicex on them, your logs for ServiceX should not have a label of app=service-x on them...
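
One way to catch this kind of drift is to compare the labels each system attaches to the same service. The following is only a minimal sketch: the function name find_label_mismatches, the system names, and the env label are hypothetical and chosen for illustration, and the label sets are hard-coded rather than fetched from real backends.

```python
def find_label_mismatches(labels_by_system):
    """Return {label_key: {system: value}} for labels that are not identical everywhere.

    labels_by_system maps a system name (hypothetical, e.g. "prometheus") to the
    label dict attached to one service's telemetry in that system.
    """
    # Collect every label key used by any system.
    all_keys = set()
    for labels in labels_by_system.values():
        all_keys.update(labels)

    mismatches = {}
    for key in sorted(all_keys):
        # None marks a system that does not set this label at all.
        per_system = {
            system: labels.get(key)
            for system, labels in labels_by_system.items()
        }
        if len(set(per_system.values())) > 1:
            mismatches[key] = per_system
    return mismatches


# Illustrative data only: ServiceX as seen by three telemetry systems.
servicex_labels = {
    "prometheus": {"app": "servicex", "env": "prod"},
    "logs": {"app": "service-x", "env": "prod"},
    "traces": {"app": "servicex", "env": "prod"},
}

mismatches = find_label_mismatches(servicex_labels)
for key, per_system in mismatches.items():
    print(f"label {key!r} differs: {per_system}")
```

With the sample data, only the app label is reported, because the logs use service-x where the metrics and traces use servicex. A check like this could plausibly run in CI against whatever configuration defines your labels, although how you obtain the label sets depends entirely on your tooling.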

Summary

We’ve come to the end of our time together. I hope that you’ve enjoyed it as much as I have and that you’ve learned as much reading this book as I did writing it. Even with all the love I have for Prometheus, it is not the be-all and end-all of observability. It is but a piece in a much larger picture.

What precisely that picture looks like will vary from company to company and over time. As time progresses, new telemetry signals may become popular and be considered “core” observability signals. However, I feel confident in saying that metrics, logs, and traces are all here to stay. If you can build out your observability platform to cover all three of them, you’ll be in a better position to monitor and observe your systems than the majority of people in our industry.

So, go forth with your new-found knowledge and use Prometheus as the cornerstone of your observability platform. Build it, scale it, tweak it, and tune...

Further reading

To learn more about the topics that were covered in this chapter, take a look at the following resources:

  • OpenTelemetry specification: https://opentelemetry.io/docs/specs/otel/
  • OpenTelemetry Protocol (OTLP) specification: https://opentelemetry.io/docs/specs/otlp/
  • OpenTelemetry semantic conventions: https://opentelemetry.io/docs/specs/semconv/
  • Recommended books on OpenTelemetry:
    • Boten, Alex. Cloud-Native Observability With OpenTelemetry: Learn to Gain Visibility Into Systems by Combining Tracing, Metrics, and Logging With OpenTelemetry. Packt Publishing, 2022.
    • Parker, Austin, and Ted Young. Learning OpenTelemetry: Setting up and Operating a Modern Observability System. O’Reilly Media, 2024.
    • Majors, Charity, et al. Observability Engineering. O’Reilly Media, 2022.