- During a network partition, each side of the partition sends notifications for the alerts it is aware of. In a clustering failure scenario, it is better to receive duplicate notifications for an issue than to receive none at all.
- Setting continue to true on a route makes the matching process keep going through the routing tree after that route matches, instead of stopping there, until the next match. This allows multiple receivers to be triggered for the same alert.
- The group_interval configuration defines how long to wait before sending an updated notification when new alerts arrive in an alert group (defined by group_by) that has already been notified; repeat_interval defines how long to wait before resending a notification for an alert group when nothing has changed.
- The top-level route, also known as the catch-all or fallback route, will trigger a default...
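The routing and timing answers above can be made concrete with a small model. What follows is a minimal Python sketch, not Alertmanager's actual code: the `Route` class, the `resolve`, `receivers_for` and `notification_times` helpers, the receiver names, and the label values are all hypothetical and exist only for illustration. It assumes equality-only matching, ignores group_wait, and treats every group_interval tick as the only moment at which a notification can be sent.

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    """Simplified routing node (hypothetical model, equality matchers only)."""
    receiver: str
    match: dict = field(default_factory=dict)  # empty dict matches every alert
    continue_: bool = False                    # models the `continue` setting
    routes: list = field(default_factory=list)

    def matches(self, labels):
        return all(labels.get(k) == v for k, v in self.match.items())


def resolve(route, labels):
    """Return the routes that should handle an alert with these labels."""
    if not route.matches(labels):
        return []
    matched = []
    for child in route.routes:
        hits = resolve(child, labels)
        matched.extend(hits)
        # Stop at the first matching child unless it sets continue.
        if hits and not child.continue_:
            break
    # No child matched: this node acts as the fallback for its subtree.
    if not matched:
        matched.append(route)
    return matched


def receivers_for(root, labels):
    return [r.receiver for r in resolve(root, labels)]


def notification_times(change_times, group_interval, repeat_interval, horizon):
    """Times (in seconds) at which a group is notified, assuming a first
    notification at t=0 and one evaluation per group_interval tick."""
    sent = [0]
    t = group_interval
    while t <= horizon:
        last = sent[-1]
        changed = any(last < c <= t for c in change_times)
        # Notify on a change in the group, or once repeat_interval has elapsed.
        if changed or t - last >= repeat_interval:
            sent.append(t)
        t += group_interval
    return sent


# Hypothetical routing tree: the top-level route is the catch-all.
ROOT = Route(receiver="default", routes=[
    Route(receiver="pager", match={"severity": "critical"}, continue_=True),
    Route(receiver="database-email", match={"team": "database"}),
])

if __name__ == "__main__":
    print(receivers_for(ROOT, {"severity": "critical", "team": "database"}))
    print(receivers_for(ROOT, {"severity": "warning", "team": "frontend"}))
    print(notification_times([400], 300, 3600, 7200))
```

In this model, a critical database alert reaches both `pager` and `database-email` because the first route sets continue, while an alert that matches no child route falls back to the top-level receiver. The timing helper shows the two intervals interacting: a new alert at 400 seconds is announced at the next group_interval tick, and an unchanged group is only re-notified once repeat_interval has passed.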
You're reading from Hands-On Infrastructure Monitoring with Prometheus
Joel Bastos is an open source supporter and contributor, with a background in infrastructure security and automation. He is always striving for the standardization of processes, code maintainability, and code reusability. He has defined, led, and implemented critical, highly available, and fault-tolerant enterprise and web-scale infrastructures in several organizations, with Prometheus as the cornerstone. He has worked at two unicorn companies in Portugal and at one of the largest transaction-oriented gaming companies in the world. Previously, he has supported several governmental entities with projects such as the Public Key Infrastructure for the Portuguese citizen card. You can find his blogs at kintoandar and on Twitter with the handle @kintoandar.
Pedro Araújo is a site reliability and automation engineer and has defined and implemented several standards for monitoring at scale. His contributions have been fundamental in connecting development teams to infrastructure. He is highly knowledgeable about infrastructure, but his passion is in the automation and management of large-scale, highly-transactional systems. Pedro has contributed to several open source projects, such as Riemann, OpenTSDB, Sensu, Prometheus, and Thanos. You can find him on Twitter with the handle @phcrva.