The DevOps 2.2 Toolkit: Self-Sufficient Docker Clusters

By Viktor Farcic

Book | Mar 2018 | 360 pages | 1st Edition

eBook : $39.99 $27.98
Print : $34.99
Subscription : $15.99 Monthly

What do you get with eBook?

  • Instant access to your Digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want

The DevOps 2.2 Toolkit

Introduction to Self-Adapting and Self-Healing Systems

Microservices, microservices, microservices. We are all rewriting, or planning to rewrite, our monoliths into microservices. Some of us have already done it. We are putting them into containers and deploying them through one of the schedulers. We are marching into a glorious future. There's nothing that can stop us now. Except... We, as an industry, are not yet ready for microservices. It is one thing to design our services so that they are stateless, fault tolerant, scalable, and so on. It is quite another to incorporate those services into the system as a whole.

Unless you just started a new project, the chances are that you have not yet reached "microservices nirvana" and that quite a few legacy services are still floating around. However, for the sake of brevity, and in the urge to get to the point, I will assume that all the services you control are truly microservices. Does that mean the whole system has reached that nirvana state? Is the deployment of a service (no matter who wrote it) entirely independent of the rest of the system? Most likely it isn't.

You are practicing continuous deployment, aren't you? I will assume you are. Now, let's say that you just finished the first release of your new service. That first version is the first commit to your code repository. Your CD tool of choice detected the change in your code repository and started the CD pipeline. At the end of it, the service is deployed to production. I can see a smile on your face. It's that expression of happiness that can be seen only after a child is born, or a service is deployed to production for the first time. That smile should not be long-lasting, since deploying a service is only the beginning. It needs to be integrated with the rest of the system. The proxy needs to be reconfigured. The log parser needs to be updated with the format produced by the new service. The monitoring system needs to become aware of the new service. Alerts need to be created that send warning and error notifications when the state of the service reaches certain thresholds. The whole system has to adapt to the new service and incorporate the new variables introduced with the commit we made a few moments ago.

How do we adapt the system so that it takes the new service into account? How do we make that service an integral part of the system?

Unless you are writing everything yourself (in which case you must be Google), your system consists of a mixture of services developed by you and services written and maintained by others. You probably use a third-party proxy (hopefully that's Docker Flow Proxy - https://proxy.dockerflow.com/). You might have chosen the ELK stack or Splunk for centralized logging. How about monitoring? Maybe it's Nagios, or it might be Prometheus. No matter the choices you made, you are not in control of the architecture of the whole system. Heck, you're probably not even in control of all the services you wrote.

Most third-party services are not designed to work in a highly dynamic cluster. When you deployed that first release of the service, you might have had to configure the proxy manually. You might have had to add a few parsing rules to your LogStash config. Your Prometheus targets had to be updated. New alerting rules had to be added. And so on, and so forth. Even if all those tasks were automated, the CD pipeline would become too big, and the process would be too flaky.

I will try to be an optimist and assume that you survived the hurdle of configuring all your third-party tools to work seamlessly with the new service. There will be no time to rest since that same service (or some other) will soon be updated. Someone will make a change that results in a higher memory threshold. That means that, for example, the monitoring tool needs to be reconfigured. You might say that's OK since it happens only occasionally, but that would not be true either. If we adopted microservices and continuous deployment, "occasionally" might mean "on any of the frequent commits." Remember, teams are small, and they are independent. A change that affects the rest of the system might come at any moment, and we need to be ready for it.

Figure 1-1: Traditional deployment where the source of truth about a service is scattered across many different places

Most third-party services were designed in an era when clusters were a collection of static servers. Only a handful of those were designed to work well with containers, and even fewer were truly adapted to work with schedulers (for example, Swarm, Kubernetes, or Mesos/Marathon).

One of the major limitations of third-party services is their reliance on static configuration. Take Prometheus as an example. It is, potentially, in charge of monitoring all of your services as well as hardware, networking, and whatnot. Each of the targets it observes might have a different set of metrics and a different set of conditions that fire alerts. Every time we want to add a new target, we need to modify the Prometheus configuration and reload it. That means that, for fault tolerance, we have to store that configuration file on a network drive, have some templating mechanism that updates it with every new service and, potentially, with every update of an existing service. So, we would deploy our fancy new service, update the template that generates the Prometheus config, create a new config, overwrite the one stored on the network drive, and reload Prometheus. Even that is not enough, because the data that fuels those templates needs to be stored somewhere, meaning that we need to register every service in a service registry (or use the one baked into Docker) and make sure that the templating solution reads from it.
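
The templating dance described here can be sketched in a few lines. This is a minimal illustration, not Prometheus's own tooling: the registry contents, label names, and job template are all assumptions made for the example.

```python
# Sketch: regenerate the whole Prometheus config from a (hypothetical) service
# registry. Every registered service contributes one scrape job; any change to
# any service forces a full re-render, write to the shared drive, and reload.

SCRAPE_JOB_TEMPLATE = """\
  - job_name: "{name}"
    metrics_path: "{path}"
    static_configs:
      - targets: ["{name}:{port}"]
"""

def render_prometheus_config(registry):
    """Render scrape jobs for every service in the registry."""
    jobs = "".join(
        SCRAPE_JOB_TEMPLATE.format(
            name=svc["name"],
            path=svc.get("metricsPath", "/metrics"),
            port=svc["scrapePort"],
        )
        for svc in registry
    )
    return "scrape_configs:\n" + jobs

# Illustrative registry contents; in a real cluster this data would come from
# a service registry or from Docker itself.
registry = [
    {"name": "go-demo", "scrapePort": 8080},
    {"name": "shopping-cart", "scrapePort": 8081, "metricsPath": "/prom"},
]

config = render_prometheus_config(registry)
print(config)
```

After writing the rendered file to the network drive, Prometheus would still need to be reloaded, which is exactly the operational overhead the text is complaining about.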

Part of the mess could be avoided if Prometheus were configurable through its API. Still, a configuration API would remove the need for templates but would not eliminate the need for a network drive. Its configuration is its state, and it has to be preserved.

This line of thinking is historical. We are used to monolith-based systems where information is scattered all over the place. We are slowly moving towards a different model. The system is broken into many smaller services, and each of them is the complete source of truth for the problem domain it solves. If you need information about a service, ask it for that information, or have a mechanism that pushes it to you. A service does not know, nor should it care, who uses it and how.

The service itself should contain all the data that describes it. If it should reconfigure a proxy, that info should be part of the service. It should contain the pattern it uses to output logs. It should have the addresses of the targets that a monitoring tool should scrape. It should have the info that will be used to launch alerts. In other words, everything that a service needs should be defined in that service, not somewhere else. The origin of the data we need to adapt a system to a new service should not be distributed across multiple locations, but kept inside the service we're deploying. Since we are all using containers (aren't we?), the best place to define all that info is in service labels.

If your service should be accessible on the path /v1/my-fancy-service, define a label with the argument --label servicePath=/v1/my-fancy-service. If Prometheus should scrape its metrics on port 8080, define the label --label scrapePort=8080. And so on and so forth.
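
To sketch the idea, those two labels are already enough for other parts of the system to derive their own configuration. A minimal illustration, assuming the labels dict mirrors what `docker service inspect` would report; the helper functions are hypothetical:

```python
# Sketch: derive per-tool configuration from a service's own labels. The
# labels dict stands in for the output of `docker service inspect`; the label
# names follow the examples in the text.

def proxy_route(labels):
    """Derive the reverse-proxy path for a service from its labels."""
    return labels.get("servicePath")

def scrape_target(service_name, labels):
    """Derive the Prometheus scrape target, if the service exposes metrics."""
    port = labels.get("scrapePort")
    return f"{service_name}:{port}" if port else None

labels = {"servicePath": "/v1/my-fancy-service", "scrapePort": "8080"}

assert proxy_route(labels) == "/v1/my-fancy-service"
assert scrape_target("my-fancy-service", labels) == "my-fancy-service:8080"
```

The point is that neither helper needs any source of truth other than the service itself.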

Why is all that significant? Among other reasons, when we define all the data a service needs inside that service, we have a single place that contains the complete truth about it. That makes configuration easier, makes the team in charge of a service self-sufficient, makes deployments more manageable and less error-prone, and so on and so forth.

Figure 1-2: A service is the single source of truth that, often through an intermediary, announces to the rest of the system its existence

Defining all the info of a service we're developing inside that same service is not a problem. The problem is that most of the third-party services we're using are not designed to leverage that info. Remember, the data about a service needs to be distributed across the cluster; it needs to reach all the other services that work in conjunction with the ones we're developing and deploying. We do not want to define that info in multiple locations, since that increases maintenance costs and introduces potential problems caused by human error. Instead, we want to define everything inside the service we're deploying and propagate that information throughout the cluster.

We do not want to define and maintain the same info in multiple locations, and we do want to keep that info at the source, but the third-party services are incapable of obtaining that data from the source. If we discard the option of modifying third-party services, the only choice left is to extend them so that they can pull or receive the data they need.

What we truly need are third-party services capable of discovering information from the services we are deploying. That discovery can be pull based (a service pulls info from another service) or push based (a service acts as a middleman and pushes data from one service to another). No matter whether discovery relies on push or pull, a service that receives data needs to be able to reconfigure itself. All that needs to be combined with a system that can detect that a service was deployed or updated and notify all interested parties.
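
The push-based variant can be sketched as a tiny event flow: a middleman detects a deployment and notifies every interested party, and each party reconfigures itself from the service's labels. All class and label names here are illustrative, not an actual Docker Flow component:

```python
# Sketch: push-based discovery. A middleman receives deploy events and pushes
# each service's data to all registered listeners; every listener knows how to
# reconfigure itself from the labels it receives.

class Middleman:
    def __init__(self):
        self.listeners = []

    def register(self, listener):
        self.listeners.append(listener)

    def service_deployed(self, name, labels):
        # Notify every interested party about the new or updated service.
        for listener in self.listeners:
            listener.reconfigure(name, labels)

class Proxy:
    def __init__(self):
        self.routes = {}

    def reconfigure(self, name, labels):
        if "servicePath" in labels:
            self.routes[labels["servicePath"]] = name

class Monitor:
    def __init__(self):
        self.targets = []

    def reconfigure(self, name, labels):
        if "scrapePort" in labels:
            self.targets.append(f"{name}:{labels['scrapePort']}")

bus = Middleman()
proxy, monitor = Proxy(), Monitor()
bus.register(proxy)
bus.register(monitor)
bus.service_deployed(
    "my-fancy-service",
    {"servicePath": "/v1/my-fancy-service", "scrapePort": "8080"},
)
```

A pull-based variant would invert the flow: each listener would periodically ask the cluster for the current set of services and their labels.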

The ultimate goal is to design a system that is capable of adapting to any service we throw at it, as well as to changed conditions in a cluster. The final objective is to have a self-adapting and self-healing system that continues operating efficiently even while we are on vacation.

What is a self-adaptive system?

A self-adaptive system is a system that adapts to changing conditions. That was evident, wasn't it? In practical terms, when operating a cluster and deploying services, it means that the system should adapt whenever a new service is deployed or an existing one is updated. When conditions in a cluster change, the whole system should change by adapting to those conditions. If a new service is deployed, the monitoring solution should get the information about that service and change its configuration. The logging system should start processing the logs of that service and parse them correctly. The number of nodes in the cluster should adjust. And so on and so forth. The most important requirement for a system to self-adapt is to create it in such a way that human intervention is not needed. Otherwise, we might just as well change the name from a self-adaptive to a John-adapts-it system.

What is a self-healing system?

A self-healing system needs to be adaptive. Without the capability to adapt to changes in the environment, we cannot self-heal. While adaptation is permanent, or at least long-lasting, healing is a temporary action. Take the number of requests as an example. Let's imagine that it increased permanently because we now have more users, or because the new design of the UI is so good that users are spending more time using our frontend. As a result of such an increase, our system needs to adapt and permanently (or, at least, for a longer period) increase the number of replicas of our services. That increase should match the minimum expected load. Maybe we run five replicas of our shopping cart, and that was enough in most circumstances, but, since our number of users increased, the number of instances of the shopping cart needs to increase to, let's say, ten replicas. It does not need to be a fixed number; it can, for example, vary from seven (lowest expected load) to twelve (highest expected load).
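
That bounded adaptation can be expressed as a small function. The numbers follow the shopping-cart example above; the proportional-scaling rule itself is an assumption made for illustration:

```python
import math

# Sketch: scale replicas with load, but keep the count between the lowest and
# highest expected bounds (seven and twelve in the shopping-cart example).
# The proportional rule is illustrative, not a prescribed algorithm.

def desired_replicas(current, requests_per_replica, target_per_replica,
                     min_replicas=7, max_replicas=12):
    """Scale proportionally to load, clamped to the expected range."""
    wanted = math.ceil(current * requests_per_replica / target_per_replica)
    return max(min_replicas, min(max_replicas, wanted))

# Five replicas handling double the target load per replica -> ten replicas.
assert desired_replicas(5, 400, 200) == 10
# Light load never drops below the floor of seven.
assert desired_replicas(5, 100, 200) == 7
# Extreme load is capped at the ceiling of twelve.
assert desired_replicas(5, 2000, 200) == 12
```

The clamp is what distinguishes adaptation from thrashing: the system moves within an expected envelope rather than chasing every spike.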

Self-healing is a reaction to the unexpected, and it has a temporary nature. Take us (humans) as an example. When a virus attacks us, our body reacts and fights back. Once the virus is annihilated, the state of emergency ceases, and we go back to our normal state. It started with a virus entering the body and ended once the virus was removed. A side effect is that we might adapt during the process and permanently build a better immune system. We can apply the same logic to our clusters. We can create processes that react to external threats and execute reactive measures. Some of those measures will be removed as soon as the threat is gone, while others might result in permanent changes to the system.

Self-healing does not always work. Both we (humans) and software systems sometimes need external help. If all else fails, and we cannot heal ourselves and eliminate the problem internally, we go to a doctor. Similarly, if a cluster cannot fix itself, it should send a notification to an operator who will, hopefully, be able to fix the problem, write a post-mortem, and improve the system so that the next time the same problem occurs it can heal itself.

This need for external help outlines an effective way to build a self-healing system. We cannot predict all the combinations that might occur in a system. What we can do, however, is make sure that when the unexpected happens, it does not stay unexpected for long. A good engineer tries to make himself obsolete. He tries to perform any given action only once, and the only way to accomplish that is through an ever-increasing level of automation. Everything that is expected should be scripted and folded into the self-adapting and self-healing processes executed by the system. We should react only when the unexpected happens.

What now?

Let us start building a self-adaptive and self-healing system. The first thing we'll need is metrics. Without metrics, neither the system nor we can make decisions. We'll start by choosing the right tool for the job.


Key benefits

  • Viktor Farcic shows you all aspects of creating self-adapting and self-healing systems through a practical, hands-on approach.
  • Learn how to choose a solution for metrics storage and query, including InfluxDB, Nagios and Sensu, Prometheus, and Graphite.
  • Discover how to integrate Docker Flow Monitor with Docker Flow Proxy.
  • Learn how to apply self-healing and self-adaptation to both services and infrastructure.

Description

Building on The DevOps 2.0 Toolkit and The DevOps 2.1 Toolkit: Docker Swarm, Viktor Farcic continues his exploration of Docker as he records his journey into self-adaptive and self-healing systems. The DevOps 2.2 Toolkit: Self-Sufficient Docker Clusters is the latest book in Viktor Farcic's series that helps you build a full DevOps toolkit. This entry looks at Docker, the tool designed to make it easier to create and run applications using containers. Viktor combines theory with a hands-on approach to guide you through the process of creating self-adaptive and self-healing systems. Within this book, he covers a wide range of topics, including what exactly self-adaptive and self-healing systems are, how to choose a solution for metrics storage and query, how to create cluster-wide alerts, and what a successful self-sufficient system blueprint looks like. Work with Viktor and dive into the creation of self-adaptive and self-healing systems within Docker.

What you will learn

  • Explore all aspects of creating self-adapting and self-healing systems through a practical, hands-on approach.
  • Learn how to choose a solution for metrics storage and query, including InfluxDB, Nagios and Sensu, Prometheus, and Graphite.
  • Understand how to integrate Docker Flow Monitor with Docker Flow Proxy.
  • Create cluster-wide alerts based on metrics.
  • Apply self-healing and self-adaptation to both services and infrastructure.

Product Details

Publication date : Mar 13, 2018
Length : 360 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781788991278


Table of Contents

18 Chapters
Preface
1. Introduction to Self-Adapting and Self-Healing Systems
2. Choosing a Solution for Metrics Storage and Query
3. Deploying and Configuring Prometheus
4. Scraping Metrics
5. Defining Cluster-Wide Alerts
6. Alerting Humans
7. Alerting the System
8. Self-Healing Applied to Services
9. Self-Adaptation Applied to Services
10. Painting the Big Picture – The Self-Sufficient System Thus Far
11. Instrumenting Services
12. Self-Adaptation Applied to Instrumented Services
13. Setting Up a Production Cluster
14. Self-Healing Applied to Infrastructure
15. Self-Adaptation Applied to Infrastructure
16. Blueprint of a Self-Sufficient System
17. Other Books You May Enjoy


FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe Reader installed, then clicking on the link will download and open the PDF file directly. If you don't, save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the ebook to be usable for you the reader with our needs to protect the rights of us as Publishers and of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else
How can I make a purchase on your website?

If you want to purchase a video course, eBook, or Bundle (Print+eBook), please follow the steps below:

  1. Register on our website using your email address and a password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title. 
  5. Proceed with the checkout process (payment can be made using Credit Card, Debit Card, or PayPal).
Where can I access support around an eBook?
  • If you experience a problem using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us
What eBook formats does Packt support?

Our eBooks are currently available in a variety of formats such as PDF and ePub. In the future, this may well change with trends and developments in technology, but please note that our PDFs are not in Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?
  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower in price than print
  • They save resources and space
What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.