
The DevOps 2.3 Toolkit

By Viktor Farcic
  1. Free Chapter
    How Did We Get Here?
About this book
Building on The DevOps 2.0 Toolkit, The DevOps 2.1 Toolkit: Docker Swarm, and The DevOps 2.2 Toolkit: Self-Sufficient Docker Clusters, Viktor Farcic continues his exploration of the DevOps toolkit by taking you on a journey through the features of Kubernetes. The DevOps 2.3 Toolkit: Kubernetes is the next book in the series that helps you build a full DevOps toolkit. It looks at Kubernetes, the tool designed to, among other roles, make it easier to create and deploy highly available and fault-tolerant applications at scale, with zero downtime. Viktor covers a wide range of emerging topics, including what exactly Kubernetes is, how to use both first-party and third-party add-ons in your projects, and how to gain the skills to call yourself a "Kubernetes ninja." Work with Viktor and dive into the creation and exploration of Kubernetes clusters through a series of hands-on guides.
Publication date: September 2018
Publisher: Packt
Pages: 418
ISBN: 9781789135503

 

How Did We Get Here?

A small percentage of companies live in the present. Most of us are stuck in the past, with obsolete technology and outdated processes. If we stay in the past for too long, we might lose our chance to come back to the present. We might move into an alternate timeline and cease to exist.

Every company is a software company. That applies even to those that do not yet realize it. We are all running and continuously increasing our speed. It's a race without a finish line. There are no winners, only those who fall and do not get up. We live in an era of ever-increasing speed of change. Companies are created and destroyed overnight. No one is safe. No one can afford the status quo.

Technology is changing so fast that it is very hard, if not impossible, to follow. The moment we learn about a new technology, it is already obsolete and replaced with something else. Take containers as an example. Docker appeared only a few years ago, and everyone is already using it for a myriad of scenarios. Still, even though it is a very young product, it has changed many times over. Just when we learned how to use docker run, we were told that it was obsolete and should be replaced with docker-compose up. We started converting all our docker run commands into the Compose YAML format. The moment we finished the conversion, we learned that containers should not be run directly. We should use a container scheduler instead. To make things more complicated, we had to choose between Mesos and Marathon, Docker Swarm, and Kubernetes.
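To make that shift concrete, here is a minimal sketch of such a conversion. The image and ports are placeholders chosen for illustration, not examples from this book.

# Running a container directly with docker run.
docker run -d --name web -p 8080:80 nginx:alpine

# The same service expressed as Compose YAML (docker-compose.yml).
cat > docker-compose.yml <<'EOF'
version: "3"
services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"
EOF

# Bringing the service up through Compose instead of docker run.
docker-compose up -d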

We can choose to ignore the trends, but that would mean falling behind the competition. There is no alternative to a constant struggle to be competitive. Once we drop our guard and stop learning and improving, the competition will take over our business. Everyone is under pressure to improve, even in highly regulated industries. Innovation is impossible until we manage to get to the present tense. Only once we master what others are doing today can we move forward and come up with something new. Today, container schedulers are the norm. They are not a thing of the future. They are the present. They are here to stay, even though it is likely that they will change a lot in the coming months and years. Understanding container schedulers is paramount. Among them, Kubernetes is the most widely used, with a massive community behind it.

Before we dive into Kubernetes, it might be worthwhile going through some history in an attempt to understand some of the problems we were trying to solve, as well as some of the challenges we were facing.

 

A glimpse from the past

Picture a young boy. He just finished a few months' worth of work. He's proud of what he accomplished but, at the same time, fearful of whether it will work. He has not yet tried it out on a "real" server. This will be the first time he delivers the fruits of his work.

He takes a floppy disk out of a drawer, inserts it into his computer, and copies the files he compiled previously. He feels fortunate that perforated cards are a thing of the past.

He gets up from his desk, exits the office, and walks towards his car. It will take him over two hours to get to the building with servers. He's not happy with the prospect of having to drive for two hours, but there is no better alternative. He could have sent the floppy with a messenger, but that would do no good since he wants to install the software himself. He needs to be there. There is no remote option.

A while later, he enters the room with the servers, inserts the floppy disk, and copies and installs the software. Fifteen minutes later, his face shows signs of stress. Something is not working as expected. There is an unforeseen problem. He's collecting outputs and writing notes. He's doing his best to stay calm and gather as much info as he can. He's dreading a long ride back to his computer and days, maybe even weeks, until he figures out what caused the problem and fixes it. He'll be back and install the fix. Perhaps it will work the second time. More likely it won't.

 

A short history of infrastructure management

A long time ago in a galaxy far, far away...

We would order servers and wait for months until they arrived. To make our misery worse, even after they came, we'd wait for weeks, sometimes even months, until they were placed in racks and provisioned. Most of the time, we were waiting for something to happen. Wait for servers, wait until they are provisioned, wait until you get approval to deploy, then wait some more. Only patient people could be software engineers. And yet, that was the time after perforated cards and floppy disks. We had the internet or some other way to connect to machines remotely. Still, everything required a lot of waiting.

Given how long it took to get a fully functioning server, it came as no surprise that only a select few had access to them. If someone did something that should not have been done, we could face extended downtime. On top of that, nobody knew what was running on those servers. Since everything was done manually, after a while, those servers would become a dumping ground. Things accumulated over time. No matter how much effort was put into documentation, given enough time, the state of the servers would always diverge from the documentation. That is the nature of manual provisioning and installations. The sysadmin became a god-like figure. He was the only one who knew everything or, more likely, faked that he did. He was the dungeon keeper. He had the keys to the kingdom. Everyone was replaceable but him.

Then came configuration management tools. We got CFEngine. It was based on promise theory and was capable of putting a server into the desired state, no matter what its actual state was. At least, that was the theory. Even with its shortcomings, CFEngine fulfilled its primary objective. It allowed us to specify the state of static infrastructure and have a reasonable guarantee that it would be achieved. Beyond its main goal, it was an advance towards documented server setups. Instead of manual, hocus-pocus types of actions, which often resulted in significant discrepancies between documentation and the actual state, CFEngine allowed us to have a specification that (almost) entirely matched the actual state. Another big advantage it provided was the ability to have, more or less, the same setup in different environments. Servers dedicated to testing could be (almost) the same as those assigned to production. Unfortunately, usage of CFEngine and similar tools was not yet widespread. We had to wait for virtual machines before automated configuration management became the norm. However, CFEngine was not designed for virtual machines. It was meant to work with static, bare-metal servers. Still, CFEngine was a massive contribution to the industry, even though it failed to gain widespread adoption.

After CFEngine came Chef, Puppet, Ansible, Salt, and other similar tools. Life was good until virtual machines came into being or, to be more precise, became widely used. We'll go back to those tools soon. For now, let's turn to the next evolutionary improvement.

Besides forcing us to be patient, physical servers were a massive waste of resources. They came in predefined sizes and, since waiting time was considerable, we often opted for big ones. The bigger, the better. That meant that an application or a service usually required less CPU and memory than the server offered. Unless costs were of no concern, we'd deploy multiple applications to a single server. The result was a dependency nightmare. We had to choose between freedom and standardization.

Freedom meant that different applications could use different runtime dependencies. One service could require JDK3 while the other might need JDK4. A third one might be compiled with C. You probably understand where this is going. The more applications we host on a single server, the more dependencies there are. More often than not, those dependencies were conflicting and would produce side effects no one expected. Thanks to our inherent need to convert any expertise into a separate department, those in charge of infrastructure were quick to dismiss freedom in favour of reliability. That translates into "the easier it is for me, the more reliable it is for you." Freedom lost, standardization won.

Standardization starts with systems architects deciding the only right way to develop and deploy something. They are a curious bunch of people. At the risk of putting everyone in the same group and ridiculing the profession, I'll describe an average systems architect as a (probably experienced) coder who decided to climb his company's ladder. While on the subject of ladders, there are often two of them. One is the management ladder, which requires extensive knowledge of Microsoft Word and Excel. Expert knowledge of all MS Office tools is a bonus. Those who mastered MS Project were considered the ultimate experts. Oh, I forgot about email skills. They had to be capable of sending at least fifteen emails a day asking for status reports.

Most expert coders (old-timers) would not choose that path. Many preferred to remain technical. That meant taking over the systems architect role. The problem is that the "technical path" was often a deceit. Architects would still have to master all the management skills (for example, Word, Excel, and email) with the additional ability to draw diagrams. That wasn't easy. A systems architect had to know how to draw a rectangle, a circle, and a triangle. He had to be proficient in coloring them as well as in connecting them with lines. There were dotted and full lines. Some had to end with an arrow. Choosing the direction of an arrow was a challenge in itself, so the lines would often end up with arrows at both ends.

The important part of being an architect was that drawing diagrams and writing countless pages of Word documents was so time-consuming that coding stopped being something they did. They stopped learning and exploring beyond Google searches and comparative tables. The net result was that the architecture would reflect the knowledge an architect had before jumping to the new position.

Why am I talking about architects? The reason is simple. They were in charge of the standardization demanded by sysadmins. They would draw their diagrams and choose the stack developers would use. Whatever that stack was, it was to be considered the Bible and followed to the letter. Sysadmins were happy since there was a standard and a predefined way to set up a server. Architects were thrilled because their diagrams served a purpose. Since those stacks were supposed to last forever, developers were excited because there was no need for them to learn anything new. Standardization killed innovation, but everyone was happy. Happiness is necessary, isn't it? Why do we need Java 6 if JDK2 works great? It's been proven by countless diagrams.

Then came virtual machines and broke everyone's happiness.

Virtual machines (VMs) were a massive improvement over bare-metal infrastructure. They allowed us to be more precise with hardware requirements. They could be created and destroyed quickly. They could differ. One could host a Java application, and the other could be dedicated to Ruby on Rails. We could get them in a matter of minutes, instead of waiting for months. Still, it took quite a while until "could" became "can". Even though the advantages brought by VMs were numerous, years passed until they were widely adopted. Even then, the adoption was usually wrong. Companies often moved the same practices used with bare-metal servers into virtual machines. That is not to say that adopting VMs did not bring immediate value. Waiting time for servers dropped from months to weeks. If it weren't for administrative tasks, manual operations, and operational bottlenecks, the waiting time could have been reduced to minutes. Still, waiting for weeks was better than waiting for months. Another benefit was that we could have identical servers in different environments. Companies started copying VMs. While that was much better than before, it did not solve the problems of missing documentation and the inability to create VMs from scratch. Still, multiple identical environments are better than one, even if we didn't know what was inside them.

As the adoption of VMs increased, so did the number of configuration management tools. We got Chef, Puppet, Ansible, Salt, and so on. Some of them might have existed before VMs. Still, virtual machines made them popular. They helped spread the adoption of "infrastructure as code" principles. However, those tools were based on the same principles as CFEngine. That means they were designed with static infrastructure in mind. On the other hand, VMs opened the door to dynamic infrastructure, where VMs are continuously created and destroyed. Mutability and constant creation and destruction were clashing. A mutable approach is well suited to static infrastructure. It does not respond well to the challenges brought by the dynamic nature of modern data centers. Mutability had to give way to immutability.

When the ideas behind immutable infrastructure started getting traction, people began combining them with the concepts behind configuration management. However, the tools available at the time were not fit for the job. They (Chef, Puppet, Ansible, and the like) were designed with the idea that servers are brought into the desired state at runtime. Immutable processes, on the other hand, assume that (almost) nothing is changeable at runtime. Artifacts were supposed to be created as immutable images. In the case of infrastructure, that meant that VMs were created from images and not changed at runtime. If an upgrade was needed, a new image should be created, followed by a replacement of the old VMs with new ones based on that image. Such processes brought speed and reliability. With proper tests in place, immutable is always more reliable than mutable.

Hence, we got tools capable of building VM images. Today, they are ruled by Packer. Configuration management tools quickly jumped on board, and their vendors told us that they work equally well for configuring images as for configuring servers at runtime. However, that was not the case due to the logic behind those tools. They are designed to put a server that is in an unknown state into the desired state. They assume that we are not sure what the current state is. VM images, on the other hand, are always based on an image with a known state. If, for example, we choose Ubuntu as a base image, we know what's inside it. Adding additional packages and configurations is easy. There is no need for things like "if this then that, otherwise something else." A simple shell script is as good as any configuration management tool when the current state is known. Creating a VM image is reasonably straightforward with Packer alone. Still, not all was lost for configuration management tools. We could still use them to orchestrate the creation of VMs based on images and, potentially, do some runtime configuration that couldn't be baked in. Right?
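As a rough sketch of that idea, and not an example from this book, a Packer template that relies on nothing more than a plain shell provisioner might look like this. The region, base image filter, and installed packages are placeholder choices.

cat > image.json <<'EOF'
{
  "builders": [{
    "type": "amazon-ebs",
    "region": "us-east-1",
    "source_ami_filter": {
      "filters": {
        "name": "ubuntu/images/*ubuntu-xenial-16.04-amd64-server-*"
      },
      "owners": ["099720109477"],
      "most_recent": true
    },
    "instance_type": "t2.micro",
    "ssh_username": "ubuntu",
    "ami_name": "base-image-{{timestamp}}"
  }],
  "provisioners": [{
    "type": "shell",
    "inline": [
      "sudo apt-get update",
      "sudo apt-get install -y nginx"
    ]
  }]
}
EOF

# Build the immutable VM image; a plain shell script does the provisioning.
packer build image.json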

The way we orchestrate infrastructure had to change as well. A higher level of dynamism and elasticity was required. That became especially evident with the emergence of cloud hosting providers like Amazon Web Services (AWS) and, later on, Azure and GCE. They showed us what could be done. While some companies embraced the cloud, others went into defensive positions. "We can build an internal cloud", "AWS is too expensive", "I would, but I can't because of legislation", and "our market is different" are only a few ill-conceived excuses often given by people who are desperately trying to maintain the status quo. That is not to say that there is no truth in those statements, but, more often than not, they are used as excuses rather than for real reasons.

Still, the cloud did manage to become the way to do things, and companies moved their infrastructure to one of the providers. Or, at least, started thinking about it. The number of companies that are abandoning on-premise infrastructure is continuously increasing, and we can safely predict that the trend will continue. Still, the question remains. How do we manage infrastructure in the cloud with all the benefits it gives us? How do we handle its highly dynamic nature? The answer comes in the form of vendor-specific tools like CloudFormation or agnostic solutions like Terraform. When combined with tools that allow us to create images, they represent a new generation of configuration management. We are talking about full automation backed by immutability.
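To give a flavor of what that looks like with Terraform, and only as a sketch with placeholder values rather than a prescription from this book, the entire definition of a server can live in a version-controlled file:

cat > main.tf <<'EOF'
provider "aws" {
  region = "us-east-1"
}

# A server defined as code; the AMI would typically be one baked with Packer.
resource "aws_instance" "web" {
  ami           = "ami-0123456789abcdef0"  # placeholder image ID
  instance_type = "t2.micro"
}
EOF

# Terraform works out what needs to be created, changed, or destroyed.
terraform init
terraform plan
terraform apply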

We're living in an era without the need to SSH into servers.

Today, modern infrastructure is created from immutable images. Any upgrade is performed by building new images and performing rolling updates that will replace VMs one by one. Infrastructure dependencies are never changed at runtime. Tools like Packer, Terraform, CloudFormation, and the like are the answer to today's problems.

One of the inherent benefits of immutability is a clear division between infrastructure and deployments. Until not long ago, the two were meshed together into an inseparable process. With infrastructure becoming a service, deployment processes can be clearly separated, thus allowing different teams, individuals, and areas of expertise to take control.

We'll need to go back in time once more and discuss the history of deployments. Did they change as much as infrastructure?

 

A short history of deployment processes

In the beginning, there were no package managers. There were no JAR, WAR, RPM, DEB, or other package formats. At best, we could zip the files that formed a release. More likely, we'd manually copy files from one place to another. When this practice was combined with bare-metal servers that were intended to last forever, the result was a living hell. After some time, no one knew what was installed on the servers. Constant overwrites, reconfigurations, package installations, and other mutable types of actions resulted in unstable, unreliable, and undocumented software running on top of countless OS patches.

The emergence of configuration management tools (for example, CFEngine, Chef, Puppet, and so on) helped to decrease the mess. Still, they improved OS setup and maintenance more than they improved deployments of new releases. They were never designed to do that, even though the companies behind them quickly realized that it would be financially beneficial to extend their scope.

Even with configuration management tools, the problems with having multiple services running on the same server persisted. Different services might have different needs, and sometimes those needs clash. One might need JDK6 and the other JDK7. A new release of the first one might require JDK to be upgraded to a new version, but that might affect some other service on the same server. Conflicts and operational complexity were so common that many companies chose to standardize. As we discussed, standardization is an innovation killer. The more we standardize, the less room there is for coming up with better solutions. Even if that's not a problem, standardization without clear isolation means that it is very complicated to upgrade anything. The effects could be unforeseen, and the sheer amount of work involved in upgrading everything at once is so significant that many choose not to upgrade for a long time (if ever). Many end up stuck with old stacks for years.

We needed process isolation that did not require a separate VM for each service. At the same time, we had to come up with an immutable way to deploy software. Mutability was distracting us from our goal of having reliable environments. With the emergence of virtual machines, immutability became feasible. Instead of deploying releases by doing updates at runtime, we could create new VMs with not only the OS and patches but also our own software baked in. Each time we wanted to release something, we could create a new image and instantiate as many VMs as we needed. We could do immutable rolling updates. Still, not many of us did that. It was too expensive, in terms of both resources and time. The process was too long. Even if that did not matter, having a separate VM for each service would result in too much unused CPU and memory.

Fortunately, Linux got namespaces, cgroups, and other features that are together known as containers. They were lightweight, fast, and cheap. They provided process isolation and quite a few other benefits. Unfortunately, they were not easy to use. Even though they had been around for a while, only a handful of companies had the know-how required to use them effectively. We had to wait for Docker to emerge to make containers easy to use and thus accessible to all.

Today, containers are the preferred way to package and deploy services. They are the answer to the immutability we were so desperately trying to implement. They provide the necessary isolation of processes, optimized resource utilization, and quite a few other benefits. And yet, we have already realized that we need much more. It's not enough to run containers. We need to be able to scale them, to make them fault tolerant, to provide transparent communication across a cluster, and many other things. Containers are only a low-level piece of the puzzle. The real benefits are obtained with tools that sit on top of containers. Those tools are today known as container schedulers. They are our interface. We do not manage containers; they do.

In case you are not already using one of the container schedulers, you might be wondering what they are.

 

What is a container scheduler?

Picture me as a young teenager. After school, we'd go to a courtyard and play soccer. That was an exciting sight. A random number of us running around the yard without any orchestration. There was no offense and no defense. We'd just run after the ball. Everyone moves forward towards the ball, someone kicks it to the left, and we move in that direction, only to start running back because someone kicked the ball again. The strategy was simple. Run towards the ball, kick it if you can, wherever you can, and repeat. To this day, I do not understand how anyone managed to score. It was complete randomness applied to a bunch of kids. There was no strategy, no plan, and no understanding that winning required coordination. Even the goalkeeper would be in random locations on the field. If he caught the ball around the goal he was guarding, he'd continue running with the ball in front of him. Most of the goals were scored by shooting at an empty goal. It was an "every man for himself" type of ambition. Each one of us hoped to score and bring glory to his or her name. Fortunately, the main objective was to have fun, so winning as a team did not matter that much. If we were a "real" team, we'd need a coach. We'd need someone to tell us what the strategy was, who should do what, and when to go on the offense or fall back to defend the goal. We'd need someone to orchestrate us. The field (a cluster) had a random number of people (services) with a common goal (to win). Since anyone could join the game at any time, the number of people (services) was continually changing.

Someone would get injured and have to be replaced or, when there was no replacement, the rest of us would have to take over his tasks (self-healing). Those games can easily be translated into clusters. Just as we needed someone to tell us what to do (a coach), clusters need something to orchestrate all the services and resources. Both need not only to make up-front decisions, but also to continuously watch the game/cluster and adapt the strategy/scheduling depending on internal and external influences. We needed a coach, and clusters need a scheduler. They need a framework that will decide where a service should be deployed and make sure that it maintains the desired runtime specification.

A cluster scheduler has quite a few goals. It makes sure that resources are used efficiently and within constraints. It makes sure that services are (almost) always running. It provides fault tolerance and high availability. It makes sure that the specified number of replicas is deployed. The list can go on for a while and varies from one solution to another. Still, no matter the exact list of a cluster scheduler's responsibilities, they can be summarized through the primary goal: a scheduler makes sure that the desired state of a service or a node is (almost) always fulfilled. Instead of using imperative methods to achieve our goals, with schedulers we can be declarative. We can tell a scheduler what the desired state is, and it will do its best to ensure that our desire is (almost) always fulfilled. For example, instead of executing a deployment process five times, hoping that we'll end up with five replicas of a service, we can tell a scheduler that our desired state is to have the service running with five replicas.
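Getting slightly ahead of ourselves, this is roughly how such a desired state is expressed in Kubernetes. The names and image below are placeholders used only to sketch the idea.

cat > go-demo.yml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: go-demo
spec:
  replicas: 5                # the desired state: five replicas, (almost) always
  selector:
    matchLabels:
      app: go-demo
  template:
    metadata:
      labels:
        app: go-demo
    spec:
      containers:
      - name: go-demo
        image: nginx:alpine  # placeholder image
        ports:
        - containerPort: 80
EOF

# Declare the desired state; the scheduler keeps reconciling towards it.
kubectl apply -f go-demo.yml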

The difference between imperative and declarative methods might seem subtle but, in fact, is enormous. With a declarative expression of the desired state, a scheduler can monitor a cluster and perform actions whenever the actual state does not match the desired one. Compare that to the execution of a deployment script. Both will deploy a service and produce the same initial result. However, the script will not make sure that the result is maintained over time. If, an hour later, one of the replicas fails, our system will be compromised. Traditionally, we solved that problem with a combination of alerts and manual interventions. An operator would receive a notification that a replica had failed, he'd log in to the server, and restart the process. If the whole server was down, the operator might choose to create a new one, or he might deploy the failed replica to one of the other servers. But, before doing that, he'd need to check which server has enough available memory and CPU. All that, and much more, is done by schedulers without human intervention. Think of schedulers as operators who are continually monitoring the system and fixing discrepancies between the desired and the actual state. The difference is that schedulers are infinitely faster and more accurate. They do not get tired, they do not need to go to the bathroom, and they do not require paychecks. They are machines or, to be more precise, software running on top of them.

That leads us to container schedulers. How do they differ from schedulers in general?

Container schedulers are based on the same principles as schedulers in general. The significant difference is that they use containers as the deployment unit. They deploy services packaged as container images. They try to co-locate them depending on desired memory and CPU specifications. They make sure that the desired number of replicas is (almost) always running. All in all, they do what other schedulers do, but with containers as the lowest and the only packaging unit. And that gives them a distinct advantage. They do not care what's inside. From a scheduler's point of view, all containers are the same.

Containers provide benefits that other deployment mechanisms do not. Services deployed as containers are isolated and immutable. Isolation provides reliability. Isolation helps with networking and volume management. It avoids conflicts. It allows us to deploy anything, anywhere, without worrying whether that something will clash with other processes running on the same server. Schedulers, combined with containers and virtual machines, provide the ultimate cluster management nirvana. That will change in the future but, for now, container schedulers are the peak of engineering accomplishments. They allow us to combine the developer's necessity for rapid and frequent deployments with a sysadmin's goals of stability and reproducibility. And that leads us to Kubernetes.

 

What is Kubernetes?

To understand Kubernetes, it is important to realize that running containers directly is a bad option for most use cases. Containers are low-level entities that require a framework on top. They need something that will provide all the additional features we expect from services deployed to clusters. In other words, containers are handy, but they are not supposed to be run directly. The reason is simple. Containers, by themselves, do not provide fault tolerance. They cannot be deployed easily to the optimum spot in a cluster and, to cut a long story short, are not operator-friendly. That does not mean that containers by themselves are not useful. They are, but they require much more if we are to harness their real power. If we need to operate containers at scale, make them fault tolerant and self-healing, and give them the other features we expect from modern clusters, we need more. We need at least a scheduler, probably more.

Kubernetes was first developed by a team at Google. It is based on their experience of running containers at scale for years. Later on, it was donated to the Cloud Native Computing Foundation (CNCF) (https://www.cncf.io/). It is a true open source project, with probably the highest velocity in history.

Kubernetes is a container scheduler and quite a lot more. We can use it to deploy our services, to roll out new releases without downtime, and to scale (or de-scale) those services. It is portable. It can run on a public or private cloud. It can run on-premise or in a hybrid environment. Kubernetes, in a way, makes your infrastructure vendor-agnostic. We can move a Kubernetes cluster from one hosting vendor to another without changing (almost) any of the deployment and management processes. Kubernetes can easily be extended to serve nearly any need. We can choose which modules we'll use, and we can develop additional features ourselves and plug them in.
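As a hedged illustration of those capabilities, and not a walkthrough from this chapter, the day-to-day interaction looks something like the commands below. The deployment name and images are placeholders.

# Deploy a service.
kubectl create deployment api --image=nginx:1.15-alpine

# Roll out a new release without downtime.
kubectl set image deployment/api nginx=nginx:1.16-alpine
kubectl rollout status deployment/api

# Scale (or de-scale) the service.
kubectl scale deployment api --replicas=5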

If we choose to use Kubernetes, we decide to relinquish control. Kubernetes will decide where to run something and how to accomplish the state we specify. Such control allows Kubernetes to place replicas of a service on the most appropriate server, to restart them when needed, to replicate them, and to scale them. We can say that self-healing is a feature included in its design from the start. On the other hand, self-adaptation is coming as well. At the time of this writing, it is still in its infancy. Soon it will be an integral part of the system.

Zero-downtime deployments, fault tolerance, high availability, scaling, scheduling, and self-healing should be more than enough to see the value in Kubernetes. Yet, that is only a fraction of what it provides. We can use it to mount volumes for stateful applications. It allows us to store confidential information as secrets. We can use it to validate the health of our services. It can load balance requests and monitor resources. It provides service discovery and easy access to logs. And so on and so forth. The list of what Kubernetes does is long and rapidly growing. Together with Docker, it is becoming a platform that envelops the whole software development and deployment lifecycle.

The Kubernetes project has just started. It is in its infancy, and we can expect vast improvements and new features soon. Still, do not be fooled by "infancy". Even though the project is young, it has one of the biggest communities behind it and is used in some of the biggest clusters in the world. Do not wait. Adopt it now!

About the Author
  • Viktor Farcic

    Viktor Farcic is a senior consultant at CloudBees, a member of the Docker Captains group, and an author. He codes using a plethora of languages, starting with Pascal (yes, he is old), Basic (before it got the Visual prefix), ASP (before it got the .NET suffix), C, C++, Perl, Python, ASP.NET, Visual Basic, C#, JavaScript, Java, Scala, and so on. He never worked with Fortran. His current favorite is Go. Viktor's big passions are microservices, continuous deployment, and test-driven development (TDD). He often speaks at community gatherings and conferences. Viktor wrote Test-Driven Java Development (Packt Publishing) and The DevOps 2.0 Toolkit. His random thoughts and tutorials can be found on his blog, Technology Conversations.
