
You're reading from Practical Site Reliability Engineering

Product type: Book
Published in: Nov 2018
Publisher: Packt
ISBN-13: 9781788839563
Edition: 1st Edition
Authors (3):
Pethuru Raj Chelliah

Pethuru Raj Chelliah (PhD) works as the chief architect at the Site Reliability Engineering Center of Excellence, Reliance Jio Infocomm Ltd. (RJIL), Bangalore. Previously, he worked for four years as a cloud infrastructure architect at the IBM Global Cloud Center of Excellence, IBM India, Bangalore. He also had an extended stint as a TOGAF-certified enterprise architecture consultant in the Wipro Consulting Services division and as a lead architect in the corporate research division of Robert Bosch, Bangalore. He has more than 17 years of IT industry experience.

Shreyash Naithani

Shreyash Naithani is currently a site reliability engineer at Microsoft R&D. Prior to Microsoft, he worked with both start-ups and mid-level companies. He completed his PG Diploma at the Centre for Development of Advanced Computing, Bengaluru, India, and is a computer science graduate of Punjab Technical University, India. In a short span of time, he has had the opportunity to work as a DevOps engineer with Python/C#, and as a tools developer, site/service reliability engineer, and Unix system administrator. In his leisure time, he loves to travel and binge-watch series.

Shailender Singh

Shailender Singh is a principal site reliability engineer and solution architect with around 11 years of IT experience. He holds two master's degrees, in IT and in computer applications. He has worked as a C developer on the Linux platform and has had exposure to almost all infrastructure technologies, from hybrid to cloud-hosted environments. In the past, he has worked with companies including McKinsey, HP, HCL, Revionics, and Avalara, and these days he tends to use AWS, K8s, Terraform, Packer, Jenkins, Ansible, and OpenShift.

Summary


Monitoring is not a one-time task. We should regularly measure what is happening in our Kubernetes pods and microservices. Monitoring plays a crucial role in a microservice system because every endpoint of every microservice needs to be watched. To deliver a higher-quality product, we should be able to detect failures before our customers do. That means enabling anomaly detection, notifying the operations team so that they can troubleshoot the problem, and setting up the necessary monitoring and alerts on both the infrastructure side and the application side. In this chapter, we saw how to use Prometheus and Grafana to turn metrics into powerful dashboards and alerts.
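To make the idea of anomaly detection concrete, here is a minimal sketch in plain Python. It is not the approach used in this chapter, which relies on Prometheus and Grafana alerting, and every name in it (`find_anomalies`, `window`, `threshold`, the sample latencies) is a hypothetical choice for illustration. It flags a sample when it deviates from the mean of the preceding window by more than a chosen number of standard deviations.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return the indices of samples that look anomalous.

    A sample is flagged when it lies more than `threshold` standard
    deviations away from the mean of the `window` samples before it.
    Both parameters are illustrative defaults, not recommendations.
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: the z-score is undefined, so flag any change.
            if samples[i] != history[-1]:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical request latencies in milliseconds, with one obvious spike.
latencies_ms = [100, 102, 98, 101, 99, 100, 500, 101]
spikes = find_anomalies(latencies_ms)
```

In a real setup, the flagged indices would feed a notification to the operations team rather than sit in a list. A trailing window keeps the check cheap, but a single spike also inflates the window's standard deviation for a while, which can hide later anomalies.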

In the next chapter, we will discuss post-production activities and best practices for ensuring and enhancing IT reliability.
