Chapter 1: Docker Overview
Welcome to Mastering Docker, Fourth Edition! This first chapter will cover the Docker basics that you should already have a pretty good handle on. But if you don't already have the required knowledge at this point, this chapter will help you get up to speed, so that subsequent chapters don't feel as heavy.
By the end of the book, you will be a Docker master able to implement Docker in your environments, building and supporting applications on top of them.
In this chapter, we're going to review the following:
- Understanding Docker
- The differences between dedicated hosts, virtual machines, and Docker installers/installation
- The Docker command
- The Docker and container ecosystem
In this chapter, we are going to discuss how to install Docker locally. To do this, you will need a host running one of the three following operating systems:
- macOS High Sierra and above
- Windows 10 Professional
- Ubuntu 18.04 and above
Check out the following video to see the Code in Action: https://bit.ly/35fytE3
The company behind Docker, also called Docker, has always described the program as fixing the 'it works on my machine' problem. This problem is best summed up by an image, based on the Disaster Girl meme, with the tagline 'Worked fine in dev, ops problem now', which started popping up in presentations, forums, and Slack channels a few years ago. While it is funny, it is, unfortunately, an all-too-real problem and one I have personally been on the receiving end of. Let's take a look at an example of what is meant by this.
For example, a developer using the macOS version of, say, PHP will probably not be running the same version as the Linux server that hosts the production code. Even if the versions match, you then have to deal with differences in the configuration and overall environment on which the version of PHP is running, such as differences in the way file permissions are handled between different operating system versions, to name just one potential problem.
All of this comes to a head when it is time for a developer to deploy their code to the host, and it doesn't work. So, should the production environment be configured to match the developer's machine, or should developers only do their work in environments that match those used in production?
In an ideal world, everything should be consistent, from the developer's laptop all the way through to your production servers; however, this utopia has traditionally been challenging to achieve. Everyone has their way of working and their own personal preferences—enforcing consistency across multiple platforms is difficult enough when a single engineer is working on the systems, let alone a team of engineers working with a team of potentially hundreds of developers.
The Docker solution
Using Docker for Mac or Docker for Windows, a developer can quickly wrap their code in a container that they have either defined themselves or created as a Dockerfile while working alongside a sysadmin or operations team. We will be covering this in Chapter 2, Building Container Images, as well as Docker Compose files, which we will go into more detail about in Chapter 5, Docker Compose.
Programmers can continue to use their chosen integrated development environment (IDE) and maintain their workflows when working with the code. As we will see in the upcoming sections of this chapter, installing and using Docker is not difficult; considering how much of a chore it was to maintain consistent environments in the past, even with automation, Docker feels a little too easy – almost like cheating.
Let's say you are looking after five servers: three load-balanced web servers and two database servers that are in a master or slave configuration dedicated to running Application 1. You are using a tool, such as Puppet or Chef, to automatically manage the software stack and configuration across your five servers.
Everything is going great until you are told that you need to deploy Application 2 on the same servers that are running Application 1. On the face of it, this is not a problem: you can tweak your Puppet or Chef configuration to add new users, add virtual hosts, pull the latest code down, and so on. However, you notice that Application 2 requires a newer version of the software stack than the one you are running for Application 1.
To make matters worse, you already know that Application 1 flat out refuses to work with the new software stack and that Application 2 is not backward compatible.
Traditionally, this leaves you with a few choices, all of which just add to the problem in one way or another:
- Ask for more servers? While this is probably the safest technical solution, it does not automatically mean that there will be the budget for additional resources.
- Re-architect the solution? Taking one of the web and database servers out of the load balancer or replication and redeploying them with the software stack for Application 2 may seem like the next easiest option from a technical point of view. However, you are introducing single points of failure for Application 2 and reducing the redundancy for Application 1 as well: there was probably a reason why you were running three web and two database servers in the first place.
- Attempt to install the new software stack side-by-side on your servers? Well, this certainly is possible and may seem like a good short-term plan to get the project out of the door, but it could leave you with a house of cards that could come tumbling down when the first critical security patch is needed for either software stack.
The Docker solution
This is where Docker starts to come into its own. If you have Application 1 running across your three web servers in containers, you may be running more than three containers; in fact, you could already be running six, doubling up on the containers, allowing you to run rolling deployments of your application without reducing the availability of Application 1.
Deploying Application 2 in this environment is as easy as merely launching more containers across your three hosts and then routing to the newly deployed application using your load balancer. As you are just deploying containers, you do not need to worry about the logistics of deploying, configuring, and managing two versions of the same software stack on the same server.
We will work through an example of this exact scenario in Chapter 5, Docker Compose.
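As a rough illustration of the idea (not the worked example from Chapter 5), the following sketch launches the two applications side by side on one host, each with its own software stack. The image tags php:7.2-apache and php:7.4-apache, the container names, and the host ports are assumptions chosen purely for illustration. The helper only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print (or, with DRY_RUN=0, execute) a "docker container run" command that
# starts one application container in the background.
# Arguments: container name, image, host port (all hypothetical values).
launch_app() {
  name=$1 image=$2 host_port=$3
  set -- docker container run -d --name "$name" -p "$host_port:80" "$image"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Application 1 stays on its older stack; Application 2 gets the newer one.
launch_app app1-web php:7.2-apache 8081
launch_app app2-web php:7.4-apache 8082
```

Because each container carries its own stack, neither application needs to know which version the other one uses. The load balancer would then route traffic to the two host ports.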
Enterprises suffer from the same problems faced by developers and operators, as they employ both types of profession; however, they have both of these entities on a much larger scale, and there is also a lot more risk involved.
Because of the risk as well as the fact that any downtime could cost sales or impact reputation, enterprises need to test every deployment before it is released. This means that new features and fixes are stuck in a holding pattern while the following takes place:
- Test environments are spun up and configured.
- Applications are deployed across the newly launched environments.
- Test plans are executed, and the application and configuration are tweaked until the tests pass.
- Requests for change are written, submitted, and discussed to get the updated application deployed to production.
This process can take anywhere from a few days to a few weeks, or even months, depending on the complexity of the application and the risk the change introduces. While the process is required to ensure continuity and availability for the enterprise at a technological level, it does potentially add risk at the business level. What if you have a new feature stuck in this holding pattern and a competitor releases similar or, worse still, the same functionality ahead of you?
This scenario could be just as damaging to sales and reputation as the downtime that the process was put in place to protect you against in the first place.
The Docker solution
Docker does not remove the need for a process, such as the one just described, to exist or be followed. However, as we have already touched upon, it does make things a lot easier as you are already working consistently. It means that your developers have been working with the same container configuration that is running in production. This means that it is not much of a step for the methodology to be applied to your testing.
For example, when a developer checks in code that they know works in their local development environment (as that is where they have been doing all of their work), your testing tool can launch the same containers to run your automated tests against. Once the containers have been used, they can be removed to free up resources for the next lot of tests. This means that suddenly, your testing process and procedures are a lot more flexible, and you can continue to reuse the same environment, rather than redeploying or re-imaging servers for the next set of testing.
This streamlining of the process can be taken as far as having your new application containers push through to production.
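A minimal sketch of the throwaway test container described above might look like the following. The image name myapp:latest and the test script ./run-tests.sh are hypothetical placeholders, and the function only builds the command rather than running it. The --rm flag asks Docker to remove the container as soon as it exits.

```shell
#!/bin/sh
# Build the command a CI job might use to run a test suite inside a
# disposable container. --rm removes the container once the tests exit,
# freeing resources for the next run.
# "myapp:latest" and "./run-tests.sh" are hypothetical placeholders.
test_command() {
  image=$1
  shift
  echo "docker container run --rm $image $*"
}

cmd=$(test_command myapp:latest ./run-tests.sh)
echo "Would run: $cmd"
```

A testing tool could run this command for every commit, using the same image the developer built locally.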
So, we know what problems Docker was developed to solve. We now need to discuss what exactly Docker is and what it does.
The differences between dedicated hosts, virtual machines, and Docker
Docker is a container management system that helps us efficiently manage Linux Containers (LXC) more easily and universally. This lets you create images in virtual environments on your laptop and run commands against them. The actions you perform to the containers, running in these environments locally on your machine, will be the same commands or operations that you run against them when they are running in your production environment.
This helps us in that you don't have to do things differently when you go from a development environment, such as the one on your local machine, to a production environment on your server. Now, let's take a look at the differences between Docker containers and typical virtual machine environments:
As you can see, for a dedicated machine, we have three applications, all sharing the same orange software stack. Running virtual machines allows us to run three applications, running two completely different software stacks. The following diagram shows the same three applications running in containers using Docker:
This diagram gives us a lot of insight into the most significant benefit of Docker: there is no need for a complete operating system every time we need to bring up a new container, which cuts down on the overall size of containers. Since almost all versions of Linux use the standard kernel models, Docker relies on using the Linux kernel of the host operating system, whatever distribution that host is built upon, such as Red Hat, CentOS, or Ubuntu.
For this reason, you can have almost any Linux operating system as your host operating system and be able to layer other Linux-based operating systems on top of the host. Well, that is, your applications are led to believe that a full operating system is actually installed—but in reality, we only install the binaries, such as a package manager and, for example, Apache/PHP and the libraries required to get just enough of an operating system for your applications to run.
For example, in the earlier diagram, we could have Red Hat running for the orange application, and Debian running for the green application, but there would never be a need actually to install Red Hat or Debian on the host. Thus, another benefit of Docker is the size of images when they are created. They are built without the most significant piece: the kernel or the operating system. This makes them incredibly small, compact, and easy to ship.
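One way to see the shared kernel for yourself is to compare the kernel release on the host with the one reported inside a container. The sketch below assumes the public alpine image and only runs the container half when a Docker daemon is reachable.

```shell
#!/bin/sh
# Compare the kernel release reported by the host with the one reported
# from inside a container. On a Linux host both should match, because the
# container has no kernel of its own.
host_kernel=$(uname -r)
echo "Host kernel:      $host_kernel"

# Only attempt the container half when a Docker daemon is reachable.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
  container_kernel=$(docker container run --rm alpine uname -r)
  echo "Container kernel: $container_kernel"
else
  echo "Docker not available; skipping the container check."
fi
```

On macOS or Windows, the container would be expected to report the kernel of the Linux virtual machine that Docker runs in the background, not the kernel of the host.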
Installers are one of the first pieces of software you need to get up and running with Docker on both your local machine and your server environments. Let's first take a look at which environments you can install Docker in:
- Linux (various Linux flavors)
- macOS
- Windows 10 Professional
In addition, you can run Docker on public clouds, such as Amazon Web Services, Microsoft Azure, and DigitalOcean, to name a few. With each of the installers listed previously, Docker actually operates in a different way on the operating system. For example, Docker runs natively on Linux. However, if you are using macOS or Windows 10, then it operates a little differently, since it relies on Linux.
Let's look at quickly installing Docker on a Linux desktop running Ubuntu 18.04, and then on macOS and Windows 10.
Installing Docker on Linux
As already mentioned, this is the most straightforward installation out of the three systems we will be looking at. We'll be installing Docker on Ubuntu 18.04; however, there are various flavors of Linux with their own package managers, which will handle this slightly differently. See the Further reading section for details on install on other Linux distributions. To install Docker, simply run the following command from a Terminal session:
$ curl -sSL https://get.docker.com/ | sh
$ sudo systemctl start docker
You will also be asked to add your current user to the Docker group. To do this, run the following command, making sure you replace the username with your own:
$ sudo usermod -aG docker username
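A quick way to confirm the change is to look for docker in the current session's group list. The sketch below uses a small helper function, has_group, which is an illustrative name and not part of Docker. New group membership normally only shows up after logging out and back in.

```shell
#!/bin/sh
# Succeed if a space-separated list of group names contains the given group.
has_group() {
  for g in $1; do
    [ "$g" = "$2" ] && return 0
  done
  return 1
}

# id -nG prints the groups of the current login session.
if has_group "$(id -nG)" docker; then
  echo "Current session is in the docker group."
else
  echo "Not in the docker group yet; log out and back in after usermod."
fi
```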
These commands will download, install, and configure the latest version of Docker from Docker themselves. At the time of writing, the version of Docker installed on Linux by the official install script is 19.03.
Running the following command should confirm that Docker is installed and running:
$ docker version
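If you want to script this check, a sketch along the following lines distinguishes a missing client from a client that cannot reach the daemon. The function name docker_status and its three result strings are illustrative choices. The --format flag of docker version takes a Go template, and querying the server version fails when no daemon answers.

```shell
#!/bin/sh
# Report whether the Docker client is installed and whether it can reach
# a Docker daemon. Prints one of: missing, client-only, ok.
docker_status() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "missing"
  elif ! docker version --format '{{.Server.Version}}' >/dev/null 2>&1; then
    echo "client-only"
  else
    echo "ok"
  fi
}

docker_status
```

A result of client-only usually means the daemon has not been started or the current user lacks permission to talk to it.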
There is a supporting tool that we are going to use in future chapters, which is installed as part of the Docker for macOS and Windows 10 installers.
To ensure that we are ready to use the tool in later chapters, we should install it now. The tool is called Docker Compose, and to install it we first need to get the latest version number. You can find this by visiting the releases section of the project's GitHub page at https://github.com/docker/compose/releases/. At the time of writing, the version was 1.25.4; update the version number in the commands in the following code block with whatever the latest version is when you install it:
$ COMPOSEVERSION=1.25.4
$ curl -L https://github.com/docker/compose/releases/download/$COMPOSEVERSION/docker-compose-`uname -s`-`uname -m` >/tmp/docker-compose
$ chmod +x /tmp/docker-compose
$ sudo mv /tmp/docker-compose /usr/local/bin/docker-compose
$ docker-compose version
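To make the download URL easier to inspect before running curl, the pieces can be assembled in a small function. This is only a sketch: compose_url is a hypothetical helper name, and the URL layout is taken from the curl command used during the installation.

```shell
#!/bin/sh
# Build the download URL for a Docker Compose release binary.
# Arguments: version, kernel name (uname -s), machine type (uname -m).
compose_url() {
  echo "https://github.com/docker/compose/releases/download/$1/docker-compose-$2-$3"
}

# Default to the version used in this chapter; override COMPOSEVERSION as needed.
COMPOSEVERSION=${COMPOSEVERSION:-1.25.4}
url=$(compose_url "$COMPOSEVERSION" "$(uname -s)" "$(uname -m)")
echo "Would download: $url"
```

Printing the URL first makes it easy to spot a mistyped version number before anything is written to /usr/local/bin.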
Now that we know how to install it on Linux, let's look at how we can install it on macOS.
Installing Docker on macOS
Before downloading, you should make sure that you are running at least Apple macOS X Yosemite 10.10.3, as this is the minimum OS requirement to run the version of Docker we will be discussing in this title. If you are running an older version, all is not lost; you can still run Docker. Refer to the Older operating systems section of this chapter.
Let's install Docker on macOS:
- Go to the Docker store at https://hub.docker.com/editions/community/docker-ce-desktop-mac.
- Click on the Get Docker link.
- Once it's downloaded, you should have a DMG file. Double-clicking on it will mount the image, and opening the image mounted on your desktop should present you with something like this:
- Once you have dragged the Docker icon to your Applications folder, double-click on it and you will be asked whether you want to open the application you have downloaded.
- Clicking Yes will open the Docker installer, showing the following prompt:
- Clicking on OK will bring up a dialogue that asks for your password. Once the password is entered, you should see a Docker icon in the menu bar at the top right of your screen.
- Clicking on the icon and selecting About Docker should show you something similar to the following:
- You can also run the following command to check the version of Docker Compose that was installed alongside Docker Engine on the command line:
$ docker-compose version
Now that we know how to install Docker on macOS, let's move on to our final operating system, Windows 10 Professional.
Installing Docker on Windows 10 Professional
Before downloading, you should make sure that you are running Microsoft Windows 10 Professional or Enterprise 64-bit. If you are running an older version or an unsupported edition of Windows 10, you can still run Docker; refer to the Older operating systems section of this chapter for more information. Docker for Windows has this requirement due to its reliance on Hyper-V. Hyper-V is Windows' native hypervisor and allows you to run x86-64 guests on your Windows machine, be it Windows 10 Professional or Windows Server. It even forms part of the Xbox One operating system.
Let's install Docker for Windows:
- Download the Docker for Windows installer from the Docker store at https://hub.docker.com/editions/community/docker-ce-desktop-windows.
- Click on the Get Docker button to download the installer.
- Once it's downloaded, run the installer package, and you will be greeted with the following:
- Leave the configuration at the default values and then click on OK. This will trigger an installation of all of the components needed to run Docker on Windows:
- Once it's installed, you will be prompted to restart. To do this, simply click on the Close and restart button:
- Once your machine has restarted, you should see a Docker icon in the icon tray in the bottom right of your screen. Clicking on it and selecting About Docker from the menu will show the following:
- Open a PowerShell window and type the following command:
$ docker version
This should show you similar output to the macOS and Linux versions. You can also run the following command to check the version of Docker Compose:
$ docker-compose version
You should see a similar output to the macOS and Linux versions. As you may have started to gather, once the packages are installed, their usage is going to be pretty similar. You will be able to see this when we get to the Using Docker commands section of this chapter.
Older operating systems
Consider the output of the following command:
$ docker version
On all three of the installations we have performed so far, it shows two different versions, a client and a server. Predictably, the Linux version shows that the architecture for the client and server are both Linux; however, you may notice that the Mac version shows the client is running on Darwin, which is Apple's Unix-like kernel, and the Windows version shows Windows. Yet both of the servers show the architecture as being Linux, so what gives?
That is because both the Mac and Windows versions of Docker download and run a virtual machine in the background, and this virtual machine runs a small, lightweight operating system based on Alpine Linux. The virtual machine runs using Docker's libraries, which connect to the built-in hypervisor for your chosen environment.
For macOS, this is the built-in Hypervisor.framework, and for Windows, as we have already mentioned, it is Hyper-V.
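The client and server operating systems can also be printed directly using the --format flag of docker version, which accepts a Go template. The sketch below wraps that in a small helper, describe_pair, which is an illustrative name. The real values are queried only when a Docker daemon is reachable.

```shell
#!/bin/sh
# Explain a client/server operating system pair as reported by
# "docker version". Arguments: client OS, server OS.
describe_pair() {
  if [ "$1" = "$2" ]; then
    echo "client and server both run on $1"
  else
    echo "client on $1 talks to a $2 server, typically inside a virtual machine"
  fi
}

# Query the real values only when a Docker daemon is reachable.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
  describe_pair "$(docker version --format '{{.Client.Os}}')" \
                "$(docker version --format '{{.Server.Os}}')"
fi
```

On a Mac, you would expect the first value to be darwin and the second to be linux, matching the output discussed above.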
To ensure that no one misses out on the Docker experience, a version of Docker that does not use these built-in hypervisors, called Docker Toolbox, is available for older versions of macOS and unsupported Windows versions. It utilizes VirtualBox as the hypervisor to run the Linux server for your local client to connect to.
VirtualBox is an open source x86 and AMD64/Intel64 virtualization product developed by Oracle. It runs on Windows, Linux, Macintosh, and Solaris hosts, with support for many Linux, Unix, and Windows guest operating systems.
For more information on Docker Toolbox, see the project's website at https://github.com/docker/toolbox/, where you can also download the macOS and Windows installers from the releases page.
This book assumes that you have installed the latest Docker version on Linux or have used Docker for Mac or Docker for Windows. While Docker installations using Docker Toolbox should be able to support the commands in this book, you may run into issues around file permissions and ownership when mounting data from your local machine to your containers.
Using Docker commands
You should already be familiar with these Docker commands. However, it's worth going through them to ensure you know them all. We will start with some common commands and then take a peek at the commands that are used for Docker images. We will then take a dive into the commands that are used for containers.
A while ago, Docker restructured their command-line client into more logical groupings of commands, as the number of features provided by the client had multiplied and commands had started to overlap. Throughout this book, we will be using this structure rather than some of the shorthand that still exists within the client.
The first command we will be taking a look at is one of the most useful commands, not only in Docker but in any command-line utility you use: the help command. It is run simply like this:
$ docker help
This command will give you a full list of all of the Docker commands at your disposal, along with a brief description of what each command does. We will be looking at this in more detail in Chapter 4, Managing Containers. For further help with a particular command, you can run the following:
$ docker <COMMAND> --help
Next, let's run the hello-world container. To do this, simply run the following command:
$ docker container run hello-world
It doesn't matter what host you are running Docker on; the same thing will happen on Linux, macOS, and Windows. Docker will download the hello-world container image and then execute it, and once it has executed, the container will be stopped.
Let's try something a little more adventurous – let's download and run an NGINX container by running the following two commands:
$ docker image pull nginx
$ docker container run -d --name nginx-test -p 8080:80 nginx
NGINX is an open source web server that can be used as a load balancer, mail proxy, reverse proxy, and even an HTTP cache.
The first of the two commands downloads the NGINX container image, and the second command launches a container in the background called nginx-test, using the nginx image we pulled. It also maps port 8080 on our host machine to port 80 on the container, making it accessible to our local browser at http://localhost:8080/.
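The value passed to -p reads as host port first, container port second. As a small sketch of how the two halves relate to the URL, the following splits such a value using shell parameter expansion. The function names host_port and container_port are illustrative only.

```shell
#!/bin/sh
# Split a "-p HOST:CONTAINER" publish value into its two ports.
host_port() { echo "${1%%:*}"; }
container_port() { echo "${1##*:}"; }

publish=8080:80
echo "Browser URL:    http://localhost:$(host_port "$publish")/"
echo "Container port: $(container_port "$publish")"
```

On a running container, docker container port nginx-test lists the mappings Docker has actually applied.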
You may notice that the Linux and macOS screens at first glance look similar. That is because I am using a remote Linux server, and we will look more at how to do this in a later chapter.
This is the result on macOS:
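Once you have finished with the container, you can stop and remove it by running the following two commands:
$ docker container stop nginx-test
$ docker container rm nginx-test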
As you can see, the experience of running a simple NGINX container on all three of the hosts on which we have installed Docker is exactly the same. As I am sure you can imagine, trying to achieve this without something like Docker across all three platforms is a challenge, and a very different experience on each platform too. Traditionally, this has been one of the reasons for the difference in local development environments, as people would need to download a platform-specific installer and configure the service for the platform they are running. Also, in some cases, there could be feature differences between the platforms.
Docker and the container ecosystem
If you have been following the rise of Docker and containers, you will have noticed that, throughout the last few years, the messaging on the Docker website has been slowly changing from headlines about what containers are to more of a focus on the services provided by Docker as a company.
One of the core drivers for this is that everything has traditionally been lumped into being known just as 'Docker,' which can get confusing. Once people no longer needed as much educating on what a container is or the problems they can solve with Docker, the company needed to start differentiating itself from the competitors that had sprung up to support all sorts of container technologies.
So, let's try and unpack everything that is Docker, which involves the following:
- Open source projects: There are several open source projects started by Docker, which are now maintained by a large community of developers.
- Docker, Inc.: This is the company founded to support and develop the core Docker tools.
- Docker CE and Docker EE: This is the core collection of Docker tools built on top of the open source components.
We will also be looking at some third-party services in later chapters. In the meantime, let's go into more detail on each of these, starting with the open source projects.
Open source projects
- Moby Project is the upstream project upon which the Docker Engine is based. It provides all of the components needed to assemble a fully functional container system.
- Runc is a command-line interface for creating and configuring containers and has been built to the OCI specification.
- Containerd is an easily embeddable container runtime. It is also a core component of the Moby Project.
- LibNetwork is a Go library that provides networking for containers.
- Notary is a client and server that aims to provide a trust system for signed container images.
- HyperKit is a toolkit that allows you to embed hypervisor capabilities into your own applications; presently, it only supports macOS and the Hypervisor framework.
- VPNKit provides VPN functionality to HyperKit.
- DataKit allows you to orchestrate application data using a Git-like workflow.
- SwarmKit is a toolkit that enables you to build distributed systems using the same raft consensus algorithm as Docker Swarm.
- LinuxKit is a framework that allows you to develop and compile a small portable Linux operating system for running containers.
- InfraKit is a collection of tools that you can use to define the infrastructure to run your LinuxKit generated distributions on.
On their own, you will probably never use the individual components; however, each of the projects mentioned is a component of the tools that are maintained by Docker, Inc. We will go a little more into these projects in our final chapter.
Docker, Inc. is the company formed to initially develop Docker Community Edition (Docker CE) and Docker Enterprise Edition (Docker EE). It also used to provide an SLA-based support service for Docker EE as well as offering consulting services to companies who wish to take their existing applications and containerize them as part of Docker's Modernise Traditional Apps (MTA) program.
You will notice that I referred to a lot of the things in the previous sentence in the past tense. This is because in November 2019 Docker, Inc. restructured and sold its platform business to a company called Mirantis Inc. They acquired the following assets from Docker, Inc.:
- Docker Enterprise, including Docker EE
- Docker Trusted Registry
- Docker Unified Control Plane
- Docker CLI
Mirantis Inc. is a California-based company that focuses on the development and support of OpenStack- and Kubernetes-based solutions. It was one of the founders of the non-profit corporate entity OpenStack Foundation and has a vast amount of experience in providing enterprise-level support.
Former Docker, Inc. CEO Rob Bearden, who stepped down shortly after the announcement, was quoted as saying:
With the Enterprise business now with Mirantis Inc., Docker, Inc. is focusing on providing better developer workflows with Docker Desktop and Docker Hub, which allows users to avoid the threat of vendor lock-in.
Docker CE and Docker EE
There are a lot of tools supplied and supported by Docker, Inc. Some we have already mentioned, and others we will cover in later chapters. Before we finish this, our first chapter, we should get an idea of the tools we are going to be using. The most important of them is the core Docker Engine.
This is the core of Docker, and all of the other tools that we will be covering use it. We have already been using it as we installed it in the Docker installation and Docker commands sections of this chapter. There are currently two versions of Docker Engine; there is Docker EE, which is now maintained by Mirantis Inc., and Docker CE. We will be using Docker CE throughout this book.
As well as the stable version of Docker CE, Docker will be providing nightly builds of the Docker Engine via a nightly repository (formerly Docker CE Edge), and monthly builds of Docker for Mac and Docker for Windows via the Edge channel.
There are also the following tools:
- Docker Compose: A tool that allows you to define and share multi-container definitions; it is detailed in Chapter 5, Docker Compose.
- Docker Machine: A tool to launch Docker hosts on multiple platforms; we will cover this in Chapter 6, Managing Containers.
- Docker Hub: A repository for your Docker images, covered in the next three chapters.
- Docker Desktop (Mac): We have covered Docker for Mac in this chapter.
- Docker Desktop (Windows): We have covered Docker for Windows in this chapter.
- Docker Swarm: A multi-host-aware orchestration tool, covered in detail in Chapter 8, Docker Swarm. Mirantis Inc. now maintains this.
In this chapter, we covered some basic information that you should already know (or now know) for the chapters ahead. We went over the basics of what Docker is, and how it fares compared to other host types. We went over the installers, how they operate on different operating systems, and how to control them through the command line. Be sure to remember to look at the requirements for the installers to ensure you use the correct one for your operating system.
Then, we took a small dive into using Docker and issued a few basic commands to get you started. We will be looking at all of the management commands in future chapters to get a more in-depth understanding of what they are, as well as how and when to use them. Finally, we discussed the Docker ecosystem and the responsibilities of each of the different tools.
In the next chapter, we will be taking a look at how to build base containers, and we will also look in depth at Dockerfiles and places to store your images, as well as using environment variables and Docker volumes.
- Where can you download Docker Desktop (Mac) and Docker Desktop (Windows) from?
- What command did we use to download the NGINX image?
- Which open source project is upstream for the core Docker Engine?
- Which company now maintains Docker Enterprise?
- Which command would you run to find out more information on the Docker container subset of commands?
These are the companies involved in maintaining Docker:
- Docker, Inc.: http://docker.com
- Mirantis Inc.: https://www.mirantis.com
- Docker restructure: https://www.computerweekly.com/news/252473956/Docker-restructure-sees-enterprise-platform-business-sold-to-open-source-cloud-firm-Mirantis
In this chapter, we have mentioned the following hypervisors:
- macOS Hypervisor framework: https://developer.apple.com/documentation/hypervisor
- Hyper-V: https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/quick-start/enable-hyper-v
For details on how to install on other Linux distributions, take a look at the Install Docker Engine page of the Docker docs: https://docs.docker.com/engine/install/.
We referenced the following blog posts from Docker:
- Docker CLI restructure blog post: https://www.docker.com/blog/whats-new-in-docker-1-13/
- Docker Extended Support Announcement: https://www.docker.com/blog/extending-support-cycle-docker-community-edition/
Next up, we discussed the following open source projects:
- Moby Project: https://mobyproject.org
- Runc: https://github.com/opencontainers/runc
- Containerd: https://containerd.io
- LibNetwork: https://github.com/moby/libnetwork
- Notary: https://github.com/theupdateframework/notary
- HyperKit: https://github.com/moby/hyperkit
- VPNKit: https://github.com/moby/vpnkit
- DataKit: https://github.com/moby/datakit