Initially, Internet services ran on hardware and life was okay. To scale services to handle peak capacity, one needed to buy enough hardware to handle the load. When the load was no longer needed, the hardware sat unused or underused but ready to serve. Unused hardware is a waste of resources. Also, there was always the threat of configuration drift because of the subtle changes we made with each new install.
Then came VMs and life was good. VMs could be scaled to the size that was needed and no more. Multiple VMs could be run on the same hardware. If there was an increase in demand, new VMs could be started on any physical server that had room. More work could be done on less hardware. Even better, new VMs could be started in minutes when needed, and destroyed when the load slackened. It was even possible to outsource the hardware to companies such as Amazon, Google, and Microsoft. Thus elastic computing was born.
VMs, too, had their problems. Each VM required that additional memory and storage space be allocated to support the operating system. In addition, each virtualization platform had its own way of doing things. Automation that worked with one system had to be completely retooled to work with another. Vendor lock-in became a problem.
Then came Docker. What VMs did for hardware, Docker does for the VM. Services can be started across multiple servers and even multiple providers. Once deployed, containers can be started in seconds without the resource overhead of a full VM. Even better, applications developed in Docker can be deployed exactly as they were built, minimizing the problems of configuration drift and package maintenance.
The question is: how does one do it? That process is called orchestration, and like an orchestra, there are a number of pieces needed to build a working cluster. In the following chapters, I will show a few ways of putting those pieces together to build scalable, reliable services with faster, more consistent deployments.
Let's go through a quick review of the basics so that we are all on the same page. The following topics will be covered:
How to install Docker Engine on Amazon Web Services (AWS), Google Compute Engine (GCE), Microsoft Azure, and a generic Linux host with docker-machine
An introduction to Docker-specific distributions including CoreOS, RancherOS, and Project Atomic
Starting, stopping, and inspecting containers with Docker
Managing Docker images
Docker Engine is the process that actually runs and controls containers on each Docker host. It is the engine that makes your cluster work. It provides the daemon that runs and manages the containers, an API that the various tools use to interact with Docker, and a command-line interface.
Docker Engine is easy to install with a script provided by Docker. The Docker project recommends that you pipe the download through sh:
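In current Docker documentation, that one-liner looks like the following (the get.docker.com URL hosts the install script; this is shown for illustration, not as a recommendation):

```shell
curl -fsSL https://get.docker.com/ | sh
```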
I cannot state strongly enough how dangerous that practice is. If https://www.docker.com/ is compromised, the script that you download could compromise your systems. Instead, download the file locally and review it to ensure that you are comfortable with what the script is doing. After you have reviewed it, you could host it on a local web server for easy access or push it out with a configuration management tool such as Puppet, Chef, or Ansible:
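A safer sketch of that workflow, assuming curl is available and using install-docker.sh as a placeholder filename:

```shell
# Download the install script instead of piping it straight to sh
curl -fsSL https://get.docker.com/ -o install-docker.sh
# Read it before you run it
less install-docker.sh
```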
After you have reviewed the script, run it:
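For example, if you saved the script as install-docker.sh (a placeholder name):

```shell
sh install-docker.sh
```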
If you are running a supported Linux distribution, the script will prepare your system and install Docker. Once installed, Docker will be updated by the local package system, such as apt on Debian and Ubuntu or yum on CentOS and Red Hat Enterprise Linux (RHEL). The install script starts Docker and configures it to start on system boot.
Note
A list of supported operating systems, distributions, and cloud providers is located at https://docs.docker.com/engine/installation/ .
By default, anyone using Docker locally will need root privileges. You can change that by adding users to the docker group, which is created by the install packages. They will be able to use Docker without root starting with their next login.
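For example, to add a user named alice (a placeholder username) to the docker group on most Linux distributions:

```shell
# -aG appends the docker group to the user's supplementary groups
sudo usermod -aG docker alice
```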
Docker provides a very nice tool to facilitate deployment and management of Docker hosts on various cloud services and Linux hosts called Docker Machine. Docker Machine is installed as part of the Docker Toolbox but can be installed separately. Full instructions can be found at https://github.com/docker/machine/releases/ .
Docker Machine supports many different cloud services, including AWS, Microsoft Azure, and GCE. It can also be configured to connect to any existing supported Linux server. The driver docker-machine uses is defined by the --driver flag. Each driver has its own specific flags that control how docker-machine works with the service.
AWS is a great way to run Docker hosts, and docker-machine makes it easy to start and manage them. You can use an Elastic Load Balancer (ELB) to send traffic to containers running on a specific host or to load balance among multiple hosts.
First of all, you will need to get your access credentials from AWS. You can use them in a couple of ways. First, you can include them on the command line when you run docker-machine:
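A sketch of that invocation, using the amazonec2 driver's credential flags with placeholder key values:

```shell
docker-machine create --driver amazonec2 \
    --amazonec2-access-key YOUR_ACCESS_KEY_ID \
    --amazonec2-secret-key YOUR_SECRET_ACCESS_KEY \
    dm-aws-test
```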
Second, you can add them to ~/.aws/credentials. Putting your credentials in a credentials file means that you will not have to include them on the command line every time you use docker-machine to work with AWS. It also keeps your credentials off the command line and out of the process list. The following examples assume that you have created a credentials file to keep the command line uncluttered:
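A minimal credentials file can be created like this; the key values are placeholders and must be replaced with your own:

```shell
# Create ~/.aws/credentials with placeholder keys.
# Note: this overwrites any existing credentials file.
mkdir -p ~/.aws
cat > ~/.aws/credentials <<'EOF'
[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
EOF
# Keep the file readable only by you
chmod 600 ~/.aws/credentials
```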
A new Docker host is created with the create subcommand. You can specify the region using the --amazonec2-region flag. By default, the host will be started in the us-east-1 region. The last item on the command line is the name of the instance, in this case dm-aws-test:
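The create call might look like this (the us-west-2 region is chosen arbitrarily for the example):

```shell
docker-machine create --driver amazonec2 \
    --amazonec2-region us-west-2 \
    dm-aws-test
```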

The command takes a couple of minutes to run, but when it is complete, you will have a fully functional Docker host ready to run containers. The ls subcommand will show you all the machines that docker-machine knows about:
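For example:

```shell
docker-machine ls
```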
The machine's IP address is listed in the output of docker-machine ls, but you can also get it by running docker-machine ip. To start working with your new machine, set up your environment by running eval $(docker-machine env dm-aws-test). Now when you run Docker, it will talk to the instance running on AWS. It is even possible to ssh into the server using docker-machine:
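Putting those steps together (dm-aws-test is the machine created earlier):

```shell
docker-machine ip dm-aws-test           # print the machine's IP address
eval $(docker-machine env dm-aws-test)  # point the docker client at it
docker info                             # now runs against the AWS host
docker-machine ssh dm-aws-test          # log in to the instance
```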
Once you are done with the instance, you can stop it with docker-machine stop and remove it with docker-machine rm:
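For example:

```shell
docker-machine stop dm-aws-test
docker-machine rm dm-aws-test
```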
Note
There are a number of options that can be passed to docker-machine create, including options to use a custom AMI, instance type, or volume size. Complete documentation is available at https://docs.docker.com/machine/drivers/aws/.
GCE is another big player in cloud computing. Its APIs make it very easy to start up new hosts running on Google's high-powered infrastructure. Google is an excellent choice for your Docker hosts, especially if you are already using other Google Cloud services.
You will need to create a project in GCE for your containers. Authentication happens through Google Application Default Credentials (ADC), which means that authentication will happen automatically if you run docker-machine from a host on GCE. If you are running docker-machine from your own computer, you will need to authenticate using the gcloud tool, which requires Python 2.7 and can be downloaded from https://cloud.google.com/sdk/.
The gcloud tool will open a web browser to authenticate using OAuth 2. Select your account, then click Allow on the next page. You will be redirected to a page that shows that you have been authenticated. Now, on to the fun stuff:
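A sketch of the create command with the google driver; the project ID and machine name are placeholders:

```shell
docker-machine create --driver google \
    --google-project your-project-id \
    dm-gce-test
```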

It will take a few minutes to complete, depending on the size of the image you choose. When it is done, you will have a Docker host running on GCE. You can now use the ls, ssh, and ip subcommands just as you did with AWS. When you are done, run docker-machine stop and docker-machine rm to stop and remove the host.
Note
There are a number of options that can be passed to docker-machine, including options to set the zone, image, and machine type. Complete documentation is available at https://docs.docker.com/machine/drivers/gce/.
Microsoft is a relative newcomer to the cloud services game but they have built an impressive service. Azure underpins several large systems including Xbox Live.
Azure uses the subscription ID for authentication. You will be given an access code and directed to enter it at https://aka.ms/devicelogin . Select Continue, choose your account, then click on Accept. You can close the browser window when you are done:
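A sketch of the create command with the azure driver; the subscription ID and machine name are placeholders:

```shell
docker-machine create --driver azure \
    --azure-subscription-id YOUR_SUBSCRIPTION_ID \
    dm-azure-test
```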

Again, it will take some time to finish. Once done, you will be able to run containers on your new host. As always, you can manage your new host with docker-machine. There is an important notice in the output when you remove a machine on Azure; it is worth making sure that everything does get cleaned up.
Note
There are many options for the Azure driver including options to choose the image, VM size, location, and even which ports need to be open on the host. For full documentation refer to https://docs.docker.com/machine/drivers/azure/ .
You can also use the generic driver of docker-machine to install and manage Docker on an existing host running a supported Linux distribution. There are a few things to keep in mind. First, the host must already be running, and Docker can be pre-installed; this can be useful if you are installing Docker as part of your host build process. Second, if Docker is running, it will be restarted, which means that any running containers will be stopped. Third, you need to have an existing SSH key pair.
The following command will use SSH to connect to the server specified by the --generic-ip-address flag, using the key identified by --generic-ssh-key and the user set with --generic-ssh-user. There are two important things to keep in mind for the SSH user. First, the user must be able to use sudo without a password prompt. Second, the public key must be in the authorized_keys file in the user's $HOME/.ssh/ directory:
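A sketch of that command; the IP address, username, key path, and machine name are all placeholders:

```shell
docker-machine create --driver generic \
    --generic-ip-address 203.0.113.10 \
    --generic-ssh-user dockeradmin \
    --generic-ssh-key ~/.ssh/id_rsa \
    dm-generic-test
```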
This process will take a couple of minutes, though it will be faster than creating hosts on cloud services, where the VM must also be provisioned. Once it is complete, you can manage the host with docker-machine and start running containers.
The only difference between the generic driver and the cloud drivers is that the stop subcommand does not work. This means that stopping a generic Docker host has to be done from the host itself.
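On a systemd-based host, for instance, you would stop the daemon directly (this assumes the service is named docker, which is typical):

```shell
sudo systemctl stop docker
```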
Note
Full documentation can be found at https://docs.docker.com/machine/drivers/generic/ .