In this chapter, we will discuss the following topics:
Why Docker has been so widely accepted by the entire industry
What does a typical container's life cycle look like?
What plugins and third-party tools will be covered in the upcoming chapters?
What will you need for the remainder of the chapters?
Not very often does a technology come along that is adopted so widely across an entire industry. Since its first public release in March 2013, Docker has not only gained the support of both end users, like you and I, but also industry leaders such as Amazon, Microsoft, and Google.
Docker is currently using the following sentence on their website to describe why you would want to use it:
"Docker provides an integrated technology suite that enables development and IT operations teams to build, ship, and run distributed applications anywhere."
There is a meme, based on the disaster girl photo, which sums up why such a seemingly simple explanation is actually quite important:
So as simple as Docker's description sounds, it's actually a been utopia for most developers and IT operations teams for a number of years to have tool that can ensure that an application can consistently work across the following three main stages of an application's life cycle:
Staging and Preproduction
To illustrate why this used to be a problem before Docker arrived at the scene, let's look at how the services were traditionally configured and deployed. People tended to typically use a mixture of dedicated machines and virtual machines. So let's look at these in more detail.
While this is possible using configuration management tools, such as Puppet, or orchestration tools, such as Ansible, to maintain consistency between server environments, it is difficult to enforce these across both servers and a developer's workstation.
Traditionally, these are a single piece of hardware that have been configured to run your application, while the applications have direct access to the hardware, you are constrained by the binaries and libraries you can install on a dedicated machine, as they have to be shared across the entire machine.
To illustrate one potential problem Docker has fixed, let's say you had a single dedicated server that was running your PHP application. When you initially deployed the dedicated machine, all three of the applications, which make up your e-commerce website, worked with PHP 5.6, so there was no problem with compatibility.
Your development team has been slowly working through the three PHP applications. You have deployed it on your host to make them work with PHP 7, as this will give them a good boost in performance. However, there is a single bug that they have not been able to resolve with App2, which means that it will not run under PHP 7 without crashing when a user adds an item to their shopping cart.
If you have a single host running your three applications, you will not be able to upgrade from PHP 5.6 to PHP 7 until your development team has resolved the bug with App2, unless you do one of the following:
Deploy a new host running PHP 7 and migrate App1 and App3 to it; this could be both time consuming and expensive
Deploy a new host running PHP 5.6 and migrate App2 to it; again this could be both time consuming and expensive
Wait until the bug has been fixed; the performance improvements that the upgrade from PHP 5.6 to PHP 7 bring to the application could increase the sales and there is no ETA for the fix
If you go for the first two options, you also need to ensure that the new dedicated machine either matches the developer's PHP 7 environment or that a new dedicated machine is configured in exactly the same way as your existing environment; after all, you don't want to introduce further problems by having a poorly configured machine.
VMware vSphere: http://www.vmware.com/uk/products/vsphere-hypervisor/
Going back to the scenario given in the dedicated machine section, you will be able to upgrade to PHP 7 on the virtual machines with App1 and App2 installed, while leaving App2 untouched and functional while the development work on the fix.
Great, so what is the catch? From the developer's view, there is none as they have their applications running with the PHP versions, which work best for them; however, from an IT operations point of view:
More CPU, RAM, and disk space: Each of the virtual machines will require additional resources as the overhead of running three guest OS, as well as the three applications have to be taken into account
More management: IT operations now need to patch, monitor, and maintain four machines, the dedicated host machine along with three virtual machines, where as before they only had a single dedicated host.
As earlier, you also need to ensure that the configuration of the three virtual machines that are hosting your applications match the configuration that the developers have been using during the development process; again, you do not want to introduce additional problems due to configuration and process drift between departments.
As you can see, the biggest differences between the two are quite clear. You are making a trade-off between resource utilization and being able to run your applications using different binaries/libraries.
Back to our scenario of the three applications running on a single host machine. Installing Docker on the host and then deploying each of the applications as a container on this host gives you the benefits of the virtual machine, while vastly reducing the footprint, that is, removing the need for the hypervisor and guest operating system completely, and replacing them with a SlimLine interface directly into the host machines kernel.
The advantages this gives both the IT operations and development teams are as follows:
Low overhead: As mentioned already, the resource and management for the IT operations team is lower
Development provide the containers: Rather than relying on the IT operations team to configure each of the three applications environments to machine the development environment, they can simply pass their containers to be put into production
As you can see from the following diagram, the layers between the application and host operating system have been reduced:
All of this means that the need to use the disaster girl meme at the beginning of this chapter should be now redundant as the development team are shipping the application to the operations in a container with all the configuration, binaries, and libraries intact, which means that if it works in development, it will work in production.
This may seem too good to be true, and to be honest, there is a "but". For most web applications or applications that are pre-compiled static binaries, you shouldn't have a problem.
However, as Docker shares resources with the underlying host machine, such as the Kernel version, if your application needs to be compiled or have a reliance on certain libraries that are only compatible with the shared resources, then you will have to deploy your containers on a like-for-like operating system, and in some cases, hardware.
Docker has tried to address this issue with the acquisition of a company called Unikernel Systems in January 2016. At the time of writing this book, not a lot is known about how Docker is planning to integrate this technology into their core product, if at all. You can find out more about this technology at https://blog.docker.com/2016/01/unikernel/.
So, is it really that simple, should everyone stop using virtual machines and use containers instead?
In July 2014, Wes Felter, Alexandre Ferreira, Ram Rajamony, and Juan Rubio published an IBM research report titled An Updated Performance Comparison of Virtual Machines and Linux Containers and concluded:
"Both VMs and containers are mature technology that have benefited from a decade of incremental hardware and software optimizations. In general, Docker equals or exceeds KVM performance in every case we tested. Our results show that both KVM and Docker introduce negligible overhead for CPU and memory performance (except in extreme cases). For I/O intensive workloads, both forms of virtualization should be used carefully."
It then goes on to say the following:
"Although containers themselves have almost no overhead, Docker is not without performance gotchas. Docker volumes have noticeably better performance than files stored in AUFS. Docker's NAT also introduces overhead for workloads with high packet rates. These features represent a tradeoff between ease of management and performance and should be considered on a case-by-case basis."
The full 12-page report, which is an interesting comparison to the traditional technologies we have discussed and containers, can be downloaded from the following URL:
Less than a year after the IBM research report was published, Docker introduced plugins for its ecosystem. One of the best descriptions I came across was from a Docker software engineer, Jessica Frazelle, who described the release as having batteries included, but replaceable, meaning that the core functionality can be easily replaced with third-party tools that can then be used to address the conclusions of the IBM research report.
At the time of writing this book, Docker currently supports volume and network driver plugins. Additional plugin types to expose more of the Docker core to third parties will be added in the future.
Using the example from the previous section, let's launch the official PHP 5.6 container and then replace it with the official PHP 7.0 one.
In the following chapter, we will be getting into bootstrapping our Docker environments using Docker Machine; however, for now, let's perform a quick installation of Docker on a cloud server.
The following instructions will work on Ubuntu 14.04 LTS or CentOS 7 instances hosted on any of the public clouds, such as the following:
You can also try a local virtual machine running locally using the follow
I am going to be using a CentOS 7 server hosted in Digital Ocean as it is convenient to quickly launch a machine and then terminate it.
Once you have your server up and running, you can install Docker from the official Yum or APT repositories by running the following command:
curl -sSL https://get.docker.com/ | sh
If, like me, you are running a CentOS 7 server, you will need to ensure that the service is running. To do this, type the following command:
systemctl start docker
Once installed, you should be able to check whether everything worked as expected by running the Docker
hello-world container by entering the following command:
docker run hello-world
Once you have Docker installed and confirmed that it runs as expected, you can download the latest builds of the official PHP 5.6 and PHP 7.0 images by running the following command:
docker pull php:5.6-apache && docker pull php:7.0-apache
For more information on the official PHP images, refer to the Docker Hub page at https://hub.docker.com/_/php/.
Now that we have the images downloaded, it's time to deploy our application as we are keeping it really simple; all we going to be deploying is a
phpinfo page, this will confirm the version of PHP we are running along with details on the rest of the containers environment:
mkdir app1 && cd app1 echo "<?php phpinfo(); ?>" > index.php
Now the index.php file is in place. Let's start the PHP 5.6 container by running the following command:
docker run --name app1 -d -p 80:80 -it -v "$PWD":/var/www/html php:5.6-apache
This will have launch an
app1 container. If you enter the IP address of your server instance or a domain which resolves to, you should see a page that shows that you are running PHP 5.6:
Now that you have PHP 5.6 up and running, let's upgrade it to PHP 7. Traditionally, this would mean installing a new set of packages using either third-party YUM or APT repositories; speaking from experience, this process can be a little hit and miss, depending on the compatibility with the packages for the previous versions of PHP that you have installed.
Luckily in our case, we are using Docker, so all we have to do is terminate our PHP 5.6 container and replace with one running PHP 7. At any time during this process, you can check the containers that are running using the following command:
This will print a list of the running containers to the screen (as seen in the screenshot at the end of this section). To stop and remove the PHP 5.6 container, run the following command:
docker rm -f app1
Once the container has terminated, run the following command to launch a PHP 7 container:
docker run --name app1 -d -p 80:80 -it -v "$PWD":/var/www/html php:7.0-apache
If you return to the
phpinfo page in your browser, you will see that it is now running PHP 7:
docker rm -f app1
A full copy of the preceding terminal session can be found in the following screenshot:
This example probably shows the biggest advantage of Docker, being able to quickly and consistently launch containers on top of code bases that are stored on your local storage. There are, however, some limits.
So, in the previous example, we launched two containers, each running different versions of PHP on top of our (extremely simple) codebase. While it demonstrated how simple it is to launch containers, it also exposed some potential problems and single points of failure.
To start with, our codebase is stored on the host machines filesystem, which means that we can only run the container on our single-host machine. What if it goes down for any reason?
There are a few ways we could get around this with a vanilla Docker installation. The first is use the official PHP container as a base to build our own custom image so that we can ship our code along with PHP. To do this, add
Dockerfile to the
app1 directory that contains the following content:
### Dockerfile FROM php:5.6-apache MAINTAINER Russ McKendrick <email@example.com> ADD index.php /var/www/html/index.php
We can also build our custom image using the following command:
docker build -t app1:php-5.6 .
When you run the build command, you will see the following output:
Once you have your image built, you could push it as a private image to the Docker Hub or your own self-hosted private registry; another option is to export the custom image as a
.tar file and then copy it to each of the instances that need to run your custom PHP container.
docker save app1:php-5.6 > ~/app1-php-56.tar
This will make a copy of our custom image, as you can see from the following terminal output, the image should be around a
482M tar file:
Now that we have a copy of the image as a tar file, we can copy it to our other host machines. Once you have copied the tar file, you will need to run the Docker load command to import it onto our second host:
docker load < ~/app1-php-56.tar
Then we can launch a container that has our code baked in by running the following command:
docker run --name app1 -d -p 80:80 -it app1:php-5.6
The following terminal output gives you an idea of what you should see when importing and running our custom container:
So far so good? Well, yes and no.
The official Docker Hub
Our own private registry
Exporting the image as a tar file and copying it across our other hosts
However, what about containers that are processing data that is changing all the time, such as a database? What are our options for a database?
Consider that we are running the official MySQL container from https://hub.docker.com/_/mysql/, we could mount the folder where our databases are stored (that is,
/var/lib/mysql/) from the host machine, but that could cause us permissions issues with the files once they are mounted within the container.
To get around this, we could create a data volume that contains a copy of our
/var/lib/mysql/ directory, this means that we are keeping our databases separate from our container so that we can stop, start, and even replace the MySQL container without destroying our data.
This approach, however, binds us to running our MySQL container on a single host, which is a big single point of failure.
If we have the resources available, we could make sure that the host where we are hosting our MySQL container has multiple redundancies, such as a number of hard drives in RAID configuration that allows us to weather more than one drive failure. We can have multiple power supply units (PSU) being fed by different power feeds, so if we have any problems with the power from one of our feeds, the host machine stays online.
We can also have the same with the networking on the host machine, NICs plugged into different switches being fed by different power feeds and network providers.
While this does leave us with a lot of redundancy, we are still left with a single host machine, which is now getting quite expensive as all of this redundancy with multiple drives, networking, and power feeds are additional costs on top of what we are already paying for our host machine.
So, what's the solution?
This is where extending Docker comes in, while Docker, out of the box, does not support the moving of volumes between host servers, we can plug in a filesystem extension that allows us to migrate volumes between hosts or mount a volume from a shared filesystem, such as NFS.
If we have this in place for our MySQL container, should there be a problem with the host machine, there will be no problem for us as the data volume can be mounted on another host.
Once we have the volume mounted, it can carry on where it left off, as we have our data on a volume that is being replicated to the new host or is accessible via a filesystem share from some redundant storage, such as a SAN.
The same can also be said for networking. As mentioned in the summary of the IBM research report, Docker NAT-based networking could be a bottleneck when it comes to performance, as well as designing your container infrastructure. If it is a problem, then you can add a networking extension and offload your containers network to a software-defined network (SDN) rather than have the core of Docker manage the networking using NAT and bridged interfaces within iptables on the host machine.
Once you introduce this level of functionality to the core of Docker, it can get difficult to manage your containers. In an ideal world, you shouldn't have to worry about which host your container is running on or if your container/host machine stops responding for any reason, then your containers will not automatically pop up on another host somewhere within your container network and carry on where it left off.
In the following chapters of this book, we will be looking at how to achieve some of the concepts that we have discussed in this chapter, and we will look at tools written by Docker, designed to run alongside the core Docker engine. While these tools may not be as functional as some of the tools we will be looking at in the later chapters, they serve as a good introduction to some of the core concepts that we will be covering when it comes to creating clusters of Docker hosts and then orchestrating your containers.
Once we have looked at these tools, we will look at volume and networking plugins. We will cover a few of the more well-known plugins that add functionality to the Docker core that allows us to have a more redundant platform.
Once we have been hands-on with pre-written plugins, we will look at the best way to approach writing your own plugin.
In the final chapters of the book, we will start to look at third-party tools that allow you to configure, deploy, and manage the whole life cycle of your containers.
In this chapter, we have looked at Docker and some of the problems it solves. We have also discussed some of the ways in which the core Docker engine can be extended and the problems that you can solve with the additional functionality that you gain by extending Docker.
In the next chapter, we will look at four different tools provided by Docker to make deploying, managing, and configuring Docker host instances and containers as simple and seamless as possible.