Home Cloud & Networking Docker Certified Associate (DCA): Exam Guide

Docker Certified Associate (DCA): Exam Guide

By Francisco Javier Ramírez Urea
books-svg-icon Book
eBook $29.99 $20.98
Print $43.99
Subscription $15.99 $10 p/m for three months
$10 p/m for first 3 months. $15.99 p/m after that. Cancel Anytime!
What do you get with a Packt Subscription?
This book & 7000+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook + Subscription?
Download this book in EPUB and PDF formats, plus a monthly download credit
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook?
Download this book in EPUB and PDF formats
Access this title in our online reader
DRM FREE - Read whenever, wherever and however you want
Online reader with customised display settings for better reading experience
What do you get with video?
Download this video in MP4 format
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with video?
Stream this video
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with Audiobook?
Download a zip folder consisting of audio files (in MP3 Format) along with supplementary PDF
What do you get with Exam Trainer?
Flashcards, Mock exams, Exam Tips, Practice Questions
Access these resources with our interactive certification platform
Mobile compatible-Practice whenever, wherever, however you want
BUY NOW $10 p/m for first 3 months. $15.99 p/m after that. Cancel Anytime!
eBook $29.99 $20.98
Print $43.99
Subscription $15.99 $10 p/m for three months
What do you get with a Packt Subscription?
This book & 7000+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook + Subscription?
Download this book in EPUB and PDF formats, plus a monthly download credit
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook?
Download this book in EPUB and PDF formats
Access this title in our online reader
DRM FREE - Read whenever, wherever and however you want
Online reader with customised display settings for better reading experience
What do you get with video?
Download this video in MP4 format
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with video?
Stream this video
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with Audiobook?
Download a zip folder consisting of audio files (in MP3 Format) along with supplementary PDF
What do you get with Exam Trainer?
Flashcards, Mock exams, Exam Tips, Practice Questions
Access these resources with our interactive certification platform
Mobile compatible-Practice whenever, wherever, however you want
  1. Free Chapter
    Modern Infrastructures and Applications with Docker
About this book
Developers have changed their deployment artifacts from application binaries to container images, and they now need to build container-based applications as containers are part of their new development workflow. This Docker book is designed to help you learn about the management and administrative tasks of the Containers as a Service (CaaS) platform. The book starts by getting you up and running with the key concepts of containers and microservices. You'll then cover different orchestration strategies and environments, along with exploring the Docker Enterprise platform. As you advance, the book will show you how to deploy secure, production-ready, container-based applications in Docker Enterprise environments. Later, you'll delve into each Docker Enterprise component and learn all about CaaS management. Throughout the book, you'll encounter important exam-specific topics, along with sample questions and detailed answers that will help you prepare effectively for the exam. By the end of this Docker containers book, you'll have learned how to efficiently deploy and manage container-based environments in production, and you will have the skills and knowledge you need to pass the DCA exam.
Publication date:
September 2020
Publisher
Packt
Pages
612
ISBN
9781839211898

 
Modern Infrastructures and Applications with Docker

Microservices and containers have probably been the most frequently mentioned buzzwords in recent years. These days, we can still hear about them at conferences across the globe. Although both terms are definitely related when talking about modern applications, they are not the same. In fact, we can execute microservices without containers and run big monolithic applications in containers. In the middle of the container world, there is a well-known word that comes to mind when we find ourselves talking about them – Docker.

This book is a guide to passing the Docker Certified Associate exam, which is a certification of knowledge pertaining to this technology. We will cover each topic needed to pass this exam. In this chapter, we will start with what microservices are and why they are important in modern applications. We will also cover how Docker manages the requirements of this application's logical components.

This chapter will guide you through Docker's main concepts and will give you a basic idea of the tools and resources provided to manage containers.

In this chapter, we will cover the following topics:

  • Understanding the evolution of applications
  • Infrastructures
  • Processes
  • Microservices and processes
  • What are containers?
  • Learning about the main concepts of containers
  • Docker components
  • Building, shipping, and running workflows
  • Windows containers
  • Customizing Docker
  • Docker security

Let's get started!

 

Technical requirements

In this chapter, we will learn about various Docker Engine concepts. We'll provide some labs at the end of this chapter that will help you understand and learn about the concepts shown. These labs can be run on your laptop or PC using the provided Vagrant standalone environment or any already deployed Docker host that you own. You can find additional information in this book's GitHub repository: https://github.com/PacktPublishing/Docker-Certified-Associate-DCA-Exam-Guide.git

Check out the following video to see the Code in Action:

"https://bit.ly/3jikiSl"

 

Understanding the evolution of applications

As we will probably read about on every IT medium, the concept of microservices is key in the development of new modern applications. Let's go back in time a little to see how applications have been developed over the years.

Monolithic applications are applications in which all components are combined into a single program that usually runs on a single platform. These applications were not designed with reusability in mind, nor any kind of modularity, for that matter. This means that every time a part of their code required an update, all the applications had to be involved in the process; for example, having to recompile all the application code in order for it to work. Of course, things were not so strict then.

Applications grew in number in terms of tasks and functionalities, with some of these tasks being distributed to other systems or even other smaller applications. However, the core components were kept immutable. We used this model of programming because running all application components together, on the same host, was better than trying to find some required information from other hosts. Network speed was insufficient in this regard, however. These applications were difficult to scale and difficult to upgrade. In fact, certain applications were locked to specific hardware and operating systems, which meant that developers needed to have the same hardware architectures at development stages to evolve applications.

We will discuss the infrastructure associated with these monolithic applications in the next section. The following diagram represents how the decoupling of tasks or functionalities has evolved from monolithic applications to Simple Object Access Protocol (SOAP) applications and the new paradigm of microservices:

In trying to achieve better application performance and decoupling components, we moved to three-tier architectures, based on a presentation tier, an application tier, and a data tier. This allowed different types of administrators and developers to be involved in application updates and upgrades. Each layer could be running on different hosts, but components only talked to one another inside the same application.

This model is still present in our data centers right now, separating frontends from application backends before reaching the database, where all the requisite data is stored. These components evolved to provide scalability, high availability, and management. On occasion, we had to include new middleware components to achieve these functionalities (thus adding to the final equation; for example, application servers, applications for distributed transactions, queueing, and load balancers). Updates and upgrades were easier, and we isolated components to focus our developers on those different application functionalities.

This model was extended and it got even better with the emergence of virtual machines in our data centers. We will cover how virtual machines have improved the application of this model in more detail in the next section.

As Linux systems have grown in popularity, the interaction between different components, and eventually different applications, has become a requirement. SOAP and other queueing message integration have helped applications and components exchange their information, and networking improvements in our data centers have allowed us to start distributing these elements in different nodes, or even locations.

Microservices are a step further to decoupling application components into smaller units. We usually define a microservice as a small unit of business functionality that we can develop and deploy standalone. With this definition, an application will be a compound of many microservices. Microservices are very light in terms of host resource usage, and this allows them to start and stop very quickly. Also, it allows us to move application health from a high availability concept to resilience, assuming that the process dies (this can be caused by problems or just a component code update) and we need to start a new one as quickly as possible to keep our main functionality healthy.

Microservices architecture comes with stateless in mind. This means that the microservice state should be managed outside of its own logic because we need to be able to run many replicas for our microservice (scale up or down) and run its content on all nodes of our environment, as required by our global load, for example. We decoupled the functionality from the infrastructure (we will see how far this concept of "run everywhere" can go in the next chapter).

Microservices provide the following features:

  • Managing an application in pieces allows us to substitute a component for a newer version or even a completely new functionality without losing application functionality.
  • Developers can focus on one particular application feature or functionality, and will just need to know how to interact with other, similar pieces.
  • Microservices interaction will usually be effected using standard HTTP/HTTPS API Representational State Transfer (REST) calls. The objective of RESTful systems is to increase the speed of performance, reliability, and the ability to scale.
  • Microservices are components that are prepared to have isolated life cycles. This means that one unhealthy component will not wholly affect application usage. We will provide resilience to each component, and an application will not have full outages.
  • Each microservice can be written in different programming languages, allowing us to choose the best one for maximum performance and portability.

Now that we have briefly reviewed the well-known application architectures that have developed over the years, let's take a look at the concept of modern applications.

A modern application has the following features:

  • The components will be based on microservices.
  • The application component's health will be based on resilience.
  • The component's states will be managed externally.
  • It will run everywhere.
  • It will be prepared for easy component updates.
  • Each application component will be able to run on its own but will provide a way to be consumed by other components.

Let's take a look.

 

Infrastructures

For every described application model that developers are using for their applications, we need to provide some aligned infrastructure architecture.

On monolithic applications, as we have seen, all application functionalities run together. In some cases, applications were built for a specific architecture, operating system, libraries, binary versions, and so on. This means that we need at least one hardware node for production and the same node architecture, and eventually resources, for development. If we add the previous environments to this equation, such as certification or preproduction for performance testing, for example, the number of nodes for each application would be very important in terms of physical space, resources, and money spent on an application.

For each application release, developers usually need to have a full production-like environment, meaning that only configurations will be different between environments. This is hard because when any operating system component or feature gets updated, changes must be replicated on all application environments. There are many tools to help us with these tasks, but it is not easy, and the cost of having almost-replicated environments is something to look at. And, on the other hand, node provision could take months because, in many cases, a new application release would mean having to buy new hardware.

Third-tier applications would usually be deployed on old infrastructures using application servers to allow application administrators to scale up components whenever possible and prioritize some components over others.

With virtual machines in our data centers, we were able to distribute host hardware resources between virtual nodes. This was a revolution in terms of node provision time and the costs of maintenance and licensing. Virtual machines worked very well on monolithic and third-tier applications, but application performance depends on the host shared resources that are applied to the virtual node. Deploying application components on different virtual nodes was a common use case because it allowed us to run these virtually everywhere. On the other hand, we were still dependent on operating system resources and releases, so building a new release was dependent on the operating system.

From a developer's perspective, having different environments for building components, testing them side by side, and certificating applications became very easy. However, these new infrastructure components needed new administrators and efforts to provide nodes for development and deployment. In fast-growing enterprises with many changes in their applications, this model helps significantly in providing tools and environments to developers. However, agility problems persist when new applications have to be created weekly or if we need to accomplish many releases/fixes per day. New provisioning tools such as Ansible or Puppet allowed virtualization administrators to provide these nodes faster than ever, but as infrastructures grew, management became complicated.

Local data centers were rendered obsolete and although it took time, infrastructure teams started to use computer cloud providers. They started with a couple of services, such as Infrastructure as a Service (IaaS), that allowed us to deploy virtual nodes on the cloud as if they were on our data center. With new networking speeds and reliability, it was easy to start deploying our applications everywhere, data centers started to get smaller, and applications began to run on distributed environments on different cloud providers. For easy automation, cloud providers prepared their infrastructure's API for us, allowing users to deploy virtual machines in minutes.

However, as many virtualization options appeared, other options based on Linux kernel features and its isolation models came into being, reclaiming some old projects from the past, such as chroot and jail environments (quite common on Berkeley Software Distribution (BSD) operating systems) or Solaris zones.

The concept of process containers is not new; in fact, it is more than 10 years old. Process containers were designed to isolate certain resources, such as CPU, memory, disk I/O, or the network, to a group of processes. This concept is what is now known as control groups (also known as cgroups).

This following diagram shows a rough timeline regarding the introduction of containers to enterprise environments:

A few years later, a container manager implementation was released to provide an easy way to control the usage of cgroups, while also integrating Linux namespaces. This project was named Linux Containers (LXC), is still available today, and was crucial for others in finding an easy way to improve process isolation usage.

In 2013, a new vision of how containers should run on our environments was introduced, providing an easy-to-use interface for containers. It started with an open source solution, and Solomon Hykes, among others, started what became known as Docker, Inc. They quickly provided a set of tools for running, creating, and sharing containers with the community. Docker, Inc. started to grow very rapidly as containers became increasingly popular.

Containers have been a great revolution for our applications and infrastructures and we are going to explore this area further as we progress.

 

Processes

A process is a way in which we can interact with an underlying operating system. We can describe a program as a set of coded instructions to execute on our system; a process will be that code in action. During process execution, it will use system resources, such as CPU and memory, and although it will run on its own environment, it can share information with another process that runs in parallel on the same system. Operating systems provide tools that allow us to manipulate the behavior of this process during execution.

Each process in a system is identified uniquely by what is called the process identifier. Parent-child relations between processes are created when a process calls a new one during its execution. The second process becomes a subprocess of the first one (this is its child process) and we will have information regarding this relationship with what is called the parent PID.

Processes run because a user or other process launched it. This allows the system to know who launched that action, and the owner of that process will be known by their user ID. Effective ownership of child processes is implicit when the main process uses impersonation to create them. New processes will use the main process designated user.

For interaction with the underlying system, each process runs with its own environment variables and we can also manipulate this environment with the built-in features of the operating system.

Processes can open, write, and close files as needed and use pointers to descriptors during execution for easy access to this filesystem's resources.

All processes running on a system are managed by operating system kernels and have also been scheduled on CPU by the kernel. The operating system kernel will be responsible for providing system resources to process and interact with system devices.

To summarize, we can say that the kernel is the part of the operating system that interfaces with host hardware, using different forms of isolation for operating system processes under the definition of kernel space. Other processes will run under the definition of user space. Kernel space has a higher priority for resources and manages user space.

These definitions are common to all modern operating systems and will be crucial in understanding containers. Now that we know how processes are identified and that there is isolation between the system and its users, we can move on to the next section and understand how containers match microservices programming.

 

Microservices and processes

So far, we have briefly reviewed a number of different application models (monolith, SOAP, and the new microservices architecture) and we have defined microservices as the minimum piece of software with functionality that we can build as a component for an application.

With this definition, we will associate a microservice with a process. This is the most common way of running microservices. A process with full functionality can be described as a microservice.

An application is composed of microservices, and hence processes, as expected. The interaction between them will usually be made using HTTP/HTTPS/API REST.

This is, of course, a definition, but we recommend this approach to ensure proper microservice health management.

 

What are containers?

So far, we have defined microservices and how processes fit in this model. As we saw previously, containers are related to process isolation. We will define a container as a process with all its requirements isolated with kernel features. This package-like object will contain all the code and its dependencies, libraries, binaries, and settings that are required to run our process. With this definition, it is easy to understand why containers are so popular in microservices environments, but, of course, we can execute microservices without containers. On the contrary, we can run containers with a full application, with many processes that don't need to be isolated from each other inside this package-like object.

In terms of multi-process containers, what is the difference between a virtual machine and containers? Let's review container features against virtual machines.

Containers are mainly based on cgroups and kernel namespaces.

Virtual machines, on the other hand, are based on hypervisor software. This software, which can run as part of the operating system in many cases, will provide sandboxed resources to the guest virtualized hardware that runs a virtual machine operating system. This means that each virtual machine will run its own operating system and allow us to execute different operating systems on the same hardware host. When virtual machines arrived, people started to use them as sandboxed environments for testing, but as hypervisors gained in maturity, data centers started to have virtual machines in production, and now this is common and standard practice in cloud providers (cloud providers currently offer hardware as a service, too).

In this schema, we're showing the different logic layers, beginning with the machine hardware. We will have many layers for executing a process inside virtual machines. Each virtual machine will have its own operating system and services, even if we are just running a single process:

Each virtual machine will get a portion of resources and guest operating systems, and the kernel will manage how they are shared among different running processes. Each virtual machine will execute its own kernel and the operating system running on top of those of the host. There is complete isolation between the guest operating systems because hypervisor software will keep them separated. On the other hand, there is an overhead associated with running multiple operating systems side by side and when microservices come to mind, this solution wastes numerous host resources. Just running the operating system will consume a lot of resources. Even the fastest hardware nodes with fast SSD disks require resources and time to start and stop virtual machines. As we have seen, microservices are just a process with complete functionality inside an application, so running the entire operating system for just a couple of processes doesn't seem like a good idea.

On each guest host, we need to configure everything needed for our microservice. This means access, users, configurations, networking, and more. In fact, we need administrators for these systems as if they were bare-metal nodes. This requires a significant amount of effort and is the reason why configuration management tools are so popular these days. Ansible, Puppet, Chef, and SaltStack, among others, help us to homogenize our environments. However, remember that developers need their own environments, too, so multiply these resources by all the required environments in the development pipeline.

How can we scale up on service peaks? Well, we have virtual machine templates and, currently, almost all hypervisors allow us to interact with them using the command line or their own administrative API implementations, so it is easy to copy or clone a node for scaling application components. But this will require double the resources remember that we will run another complete operating system with its own resources, filesystems, network, and so on. Virtual machines are not the perfect solution for elastic services (which can scale up and down, run everywhere, and are created on-demand in many cases).

Containers will share the same kernel because they are just isolated processes. We will just add a templated filesystem and resources (CPU, memory, disk I/O, network, and so on, and, in some cases, host devices) to a process. It will run sandboxed inside and will only use its defined environment. As a result, containers are lightweight and start and stop as fast as their main processes. In fact, containers are as lightweight as the processes they run, since we don't have anything else running inside a container. All the resources that are consumed by a container are process-related. This is great in terms of hardware resource allocation. We can find out the real consumption of our application by observing the load of all of its microservices.

Containers are a perfect solution for microservices as they will run only one process inside. This process should have all the required functionality for a specific task, as we described in terms of microservices.

Similar to virtual machines, there is the concept of a template for container creation called Image. Docker images are standard for many container runtimes. They ensure that all containers that are created from a container image will run with the same properties and features. In other words, this eliminates the it works on my computer! problem.

Docker containers improve security in our environments because they are secure by default. Kernel isolation and the kind of resources managed inside containers provide a secure environment during execution. There are many ways to improve this security further, as we will see in the following chapters. By default, containers will run with a limited set of system calls allowed.

This schema describes the main differences between running processes on different virtual machines and using containers:

Containers are faster to deploy and manage, lightweight, and secure by default. Because of their speed upon execution, containers are aligned with the concept of resilience. And because of the package-like environment, we can run containers everywhere. We only need a container runtime to execute deployments on any cloud provider, as we do on our data centers. The same concept will be applied to all development stages, so integration and performance tests can be run with confidence. If the previous tests were passed, since we are using the same artifact across all stages, we can ensure its execution in production.

In the following chapters, we will dive deep into Docker container components. For now, however, just think of a Docker container as a sandboxed process that runs in our system, isolated from all other running processes on the same host, based on a template named Docker Image.

 

Learning about the main concepts of containers

When talking about containers, we need to understand the main concepts behind the scenes. Let's decouple the container concept into different pieces and try to understand each one in turn.

Container runtime

The runtime for running containers will be the software and operating system features that make process execution and isolation possible.

Docker, Inc. provides a container runtime named Docker, based on open source projects sponsored by them and other well-known enterprises that empower container movement (Red Hat/IBM and Google, among many others). This container runtime comes packaged with other components and tools. We will analyze each one in detail in the Docker components section.

Images

We use images as templates for creating containers. Images will contain everything required by our process or processes to run correctly. These components can be binaries, libraries, configuration files, and so on that can be a part of operating system files or just components built by yourself for this application.

Images, like templates, are immutable. This means that they don't change between executions. Every time we use an image, we will get the same results. We will only change configuration and environment to manage the behavior of different processes between environments. Developers will create their application component template and they can be sure that if the application passed all the tests, it will work in production as expected. These features ensure faster workflows and less time to market.

Docker images are built up from a series of layers, and all these layers packaged together contain everything required for running our application process. All these layers are read-only and the changes are stored in the next upper layer during image creation. This way, each layer only has a set of differences from the layer before it.

Layers are packaged to allow ease of transport between different systems or environments, and they include meta-information about the required architecture to run (will it run on Linux or Windows, or does it require an ARM processor, for example?). Images include information about how the process should be run, which user will execute the main process, where persistent data will be stored, what ports your process will expose in order to communicate with other components or users, and more.

Images can be built with reproducible methods using Dockerfiles or store changes made on running containers to obtain a new image:

This was a quick review of images. Now, let's take a look at containers.

Containers

As we described earlier, a container is a process with all its requirements that runs separately from all the other processes running on the same host. Now that we know what templates are, we can say that containers are created using images as templates. In fact, a container adds a new read-write layer on top of image layers in order to store filesystem differences from these layers. The following diagram represents the different layers involved in container execution. As we can observe, the top layer is what we really call the container because it is read-write and allows changes to be stored on the host disk:

All image layers are read-only layers, which means all the changes are stored in the container's read-write layer. This means that all these changes will be lost when we remove a container from a host, but the image will remain until we remove it. Images are immutable and always remain unchanged.

This container behavior lets us run many containers using the same underlying image, and each one will store changes on its own read-write layer. The following diagram represents how different images will use the same image layers. All three containers are based on the same image:

There are different approaches to managing image layers when building and container layers on execution. Docker uses storage drivers to manage this content, on read-only layers and read-write ones. These drivers are operating system-dependent, but they all implement what is known as copy-on-write filesystems.

A storage driver (known as graph-driver) will manage how Docker will store and manage the interactions between layers. As we mentioned previously, there are different drivers integrations available, and Docker will choose the best one for your system, depending on your host's kernel and operating system. Overlay2 is the most common and preferred driver for Linux operating systems. Others, such as aufs, overlay, and btfs, among others, are also available, but keep in mind that overlay2 is recommended for production environments on modern operating systems.

Devicemapper is also a supported graph driver and it was very common on Red Hat environments before overlay2 was supported on modern operating system releases (Red Hat 7.6 and above). Devicemapper uses block devices for storing layers and can be deployed in observance of two different strategies: loopback-lvm (by default and only for testing purposes) and direct-lvm (requires additional block device pool configurations and is intended for production environments). This link provides the required steps for deploying: direct-lvm: https://docs.docker.com/storage/storagedriver/device-mapper-driver/

As you may have noticed, using copy-on-write filesystems will make containers very small in terms of disk space usage. All common files are shared between the same image-based containers. They just store differences from immutable files that are part of image layers. Consequently, container layers will be very small (of course, this depends on what you are storing on containers, but keep in mind that good containers are small). When an existing file in a container has to be modified (remember a file that comes from underlying layers), the storage driver will perform a copy operation to the container layer. This process is fast, but keep in mind that everything that is going to be changed on containers will follow this process. As a reference, don't use copy-on-write with heavy I/O operations, nor process logs.

Copy-on-write is a strategy for creating maximum efficiency and small layer-based filesystems. This storage strategy works by copying files between layers. When a layer needs to change a file from another underlaying layer, it will be copied to this top one. If it just needs read access, it will use it from underlying layers. This way, I/O access is minimized and the size of the layers is very small.

A common question that many people ask is whether containers are ephemeral. The short answer is no. In fact, containers are not ephemeral for a host. This means that when we create or run a container on that host, it will remain there until someone removes it. We can start a stopped container on the same host if it is not deleted yet. Everything that was inside this container before will be there, but it is not a good place to store process state because it is only local to that host. If we want to be able to run containers everywhere and use orchestration tools to manage their states, processes must use external resources to store their status.

As we'll see in later chapters, Swarm or Kubernetes will manage service or application component status and, if a required container fails, it will create a new container. Orchestration will create a new container instead of reusing the old one because, in many cases, this new process will be executed elsewhere in the clustered pool of hosts. So, it is important to understand that your application components that will run as containers must be logically ephemeral and that their status should be managed outside containers (database, external filesystem, inform other services, and so on).

The same concept will be applied in terms of networking. Usually, you will let a container runtime or orchestrator manage container IP addresses for simplicity and dynamism. Unless strictly necessary, don't use fixed IP addresses, and let internal IPAMs configure them for you.

Networking in containers is based on host bridge interfaces and firewall-level NAT rules. A Docker container runtime will manage the creation of virtual interfaces for containers and process isolation between different logical networks creating mentioned rules. We will see all the network options provided and their use cases in Chapter 4, Container Persistency and Networking. In addition, publishing an application is managed by the runtime and orchestration will add different properties and many other options.

Using volumes will let us manage the interaction between the process and the container filesystem. Volumes will bypass the copy-on-write filesystem and hence writing will be much faster. In addition to this, data stored in a volume will not follow the container life cycle. This means that even if we delete the container that was using that volume, all the data that was stored there will remain until someone deletes it. We can define a volume as the mechanism we will use to persist data between containers. We will learn that volumes are an easy way to share data between containers and deploy applications that need to persist their data during the life of the application (for example, databases or static content). Using volumes will not increase container layer size, but using them locally will require additional host disk resources under the Docker filesystem/directory tree.

Process isolation

As we mentioned previously, a kernel provides namespaces for process isolation. Let's review what each namespace provides. Each container runs with its own kernel namespaces for the following:

  • Processes: The main process will be the parent of all other ones within the container.
  • Network: Each container will get its own network stack with its own interfaces and IP addresses and will use host interfaces.
  • Users: We will be able to map container user IDs with different host user IDs.
  • IPC: Each container will have its own shared memory, semaphores, and message queues without conflicting other processes on the host.
  • Mounts: Each container will have its own root filesystem and we can provide external mounts, which we will learn about in upcoming chapters.
  • UTS: Each container will get its own hostname and time will be synced with the host.

The following diagram represents a process tree from the host perspective and inside a container. Processes inside a container are namespaced and, as a result, their parent PID will be the main process, with its own PID of 1:

Namespaces have been available in Linux since version 2.6.26 (July 2008), and they provide the first level of isolation for a process running within a container so that it won't see others. This means they cannot affect other processes running on the host or in any other container. The maturity level of these kernel features allows us to trust in Docker namespace isolation implementation.

Networking is isolated too, as each container gets its own network stack, but communications will pass through host bridge interfaces. Every time we create a Docker network for containers, we will create a new network bridge, which we will learn more about in Chapter 4, Container Persistency and Networking. This means that containers sharing a network, which is a host bridge interface, will see one another, but all other containers running on a different interface will not have access to them. Orchestration will add different approaches to container runtime networking but, at the host level, described rules are applied.

Host resources available to a container are managed by control groups. This isolation will not allow a container to bring down a host by exhausting its resources. You should not allow containers with non-limited resources in production. This must be mandatory in multi-tenant environments.

Orchestration

This book contains a general chapter about orchestration, Chapter 7, Introduction to Orchestration, and two specific chapters devoted to Swarm and Kubernetes, respectively, Chapter 8, Orchestration Using Docker Swarm, and Chapter 9, Orchestration Using Kubernetes. Orchestration is the mechanism that will manage container interactions, publishing, and health in clustered pools of hosts. It will allow us to deploy an application based on many components or containers and keep it healthy during its entire life cycle. With orchestration, component updates are easy because it will take care of the required changes in the platform to accomplish a new, appropriate state.

Deploying an application using orchestration will require a number of instances for our process or processes, the expected state, and instructions for managing its life during execution. Orchestration will provide new objects, communication between containers running on different hosts, features for running containers on specific nodes within the cluster, and the mechanisms to keep the required number of process replicas alive with the desired release version.

Swarm is included inside Docker binaries and comes as standard. It is easy to deploy and manage. Its unit of deployment is known as a service. In a Swarm environment, we don't deploy containers because containers are not managed by orchestration. Instead, we deploy services and those services will be represented by tasks, which will run containers to maintain its state.

Currently, Kubernetes is the most widely used form of orchestration. It requires extra deployment effort using a Docker community container runtime. It adds many features, multi-container objects known as pods that share a networking layer, and flat networking for all orchestrated pods, among other things. Kubernetes is community-driven and evolves very fast. One of the features that makes this platform so popular is the availability to create your own kind of resources, allowing us to develop new extensions when they are not available.

We will analyze the features of pods and Kubernetes in detail in Chapter 9, Orchestration Using Kubernetes.

Docker Enterprise provides orchestrators deployed under Universal Control Plane with high availability on all components.

Registry

We have already learned that containers execute processes within an isolated environment, created from a template image. So, the only requirements for deploying that container on a new node will be the container runtime and the template used to create that container. This template can be shared between nodes using simple Docker command options. But this procedure can become more difficult as the number of nodes grows. To improve image distribution, we will use image registries, which are storage points for these kinds of objects. Each image will be stored in its own repository. This concept is similar to code repositories, allowing us to use tags to describe these images, aligning code releases with image versioning.

An application deployment pipeline has different environments, and having a common point of truth between them will help us to manage these objects through the different workflow stages.

Docker provides two different approaches for registry: the community version and Docker Trusted Registry. The community version does not provide any security at all, nor role-based access to image repositories. On the other hand, Docker Trusted Registry comes with the Docker Enterprise solution and is an enterprise-grade registry, with included security, image vulnerability scanning, integrated workflows, and role-based access. We will learn about Docker Enterprise's registry in Chapter 13, Implementing an Enterprise-Grade Registry with DTR.

 

Docker components

In this section, we are going to describe the main Docker components and binaries used for building, distributing, and deploying containers in all execution stages.

Docker Engine is the core component of container platforms. Docker is a client-server application and Docker Engine will provide the server side. This means that we have the main process that runs as a daemon on the host, and a client-side application that communicates with the server using REST API calls.

Docker Engine's latest version provides separate packages for the client and the server. On Ubuntu, for example, if we take a look at the available packages, we will have something like this:
- docker-ce-cli Docker CLI: The open source application container engine
- docker-ce Docker: The open source application container engine

The following diagram represents Docker daemon and its different levels of management:

Docker daemon listens for Docker API requests and will be responsible for all Docker object actions, such as creating an image, list volumes, and running a container.

Docker API is available using a Unix socket by default. Docker API can be used from within code-using interfaces that are available for many programming languages. Querying for running containers can be managed using a Docker client or its API directly; for example, with curl --no-buffer -XGET --unix-socket /var/run/docker.sock http://localhost/v1.24/containers/json.

When deploying cluster-wide environments with Swarm orchestration, daemons will share information between them to allow the execution of distributed services within the pool of nodes.

On the other hand, the Docker client will provide users with the command line required to interact with the daemon. It will construct the required API calls with their payloads to tell the daemon which actions it should execute.

Now, let's deep dive into a Docker daemon component to find out more about its behavior and usage.

Docker daemon

Docker daemon will usually run as a systemd-managed service, although it can run as a standalone process (it is very useful when debugging daemon errors, for example). As we have seen previously, dockerd provides an API interface that allows clients to send commands and interact with this daemon. containerd, in fact, manages containers. It was introduced as a separate daemon in Docker 1.11 and is responsible for managing storage, networking, and interaction between namespaces. Also, it will manage image shipping and then, finally, it will run containers using another external component. This external component, RunC, will be the real executor of containers. Its function just receives an order to run a container. These components are part of the community, so the only one that Docker provides is dockerd. All other daemon components are community-driven and use standard image specifications (Open Containers Initiative OCI). In 2017, Docker donated containerd as part of their contribution to the open source community and is now part of the Cloud Native Computing Foundation (CNCF). OCI was founded as an open governance structure for the express purpose of creating open industry standards around container formats and runtimes in 2015. The CNCF hosts and manages most of the currently most-used components of the newest technology infrastructures. It is a part of the nonprofit Linux Foundation and is involved in projects such as Kubernetes, Containerd, and The Update Framework.

By way of a summary, dockerd will manage interaction with the Docker client. To run a container, first, the configuration needs to be created so that daemon triggers containerd (using gRPC) to create it. This piece will create an OCI definition that will use RunC to run this new container. Docker implements these components with different names (changed between releases), but the concept is still valid.

Docker daemon can listen for Docker Engine API requests on different types of sockets: unix, tcp, and fd. By default, Daemon on Linux will use a Unix domain socket (or IPC socket) that's created at /var/run/docker.sock when starting the daemon. Only root and Docker groups can access this socket, so only root and members of the Docker group will be able to create containers, build images, and so on. In fact, access to a socket is required for any Docker action.

Docker client

Docker client is used to interact with a server. It needs to be connected to a Docker daemon to perform any action, such as building an image or running a container.

A Docker daemon and client can run on the same host system, or we can manage a connected remote daemon. The Docker client and daemon communicate using a server-side REST API. This communication can be executed over UNIX sockets (by default) or a network interface, as we learned earlier.

Docker objects

The Docker daemon will manage all kinds of Docker objects using the Docker client command line.

The following are the most common objects at the time of writing this book:

  • IMAGE
  • CONTAINER
  • VOLUME
  • NETWORK
  • PLUGIN

There are other objects that are only available when we deploy Docker Swarm orchestration:

  • NODE
  • SERVICE
  • SECRET
  • CONFIG
  • STACK
  • SWARM

The Docker command line provides the actions that Docker daemon is allowed to execute via REST API calls. There are common actions such as list (or ls), create, rm (for remove), and inspect, and other actions that are restricted to specific objects, such as cp (for coping).

For example, we can get a list of running containers on a host by running the following command:

$ docker container ls
There are many commonly used aliases, such as docker ps for docker container ls or docker run for docker container run. I recommend using a long command-line format because it is easier to remember if we understand which actions are allowed for each object.

There are other tools available on the Docker ecosystem, such as docker-machine and docker-compose.

Docker Machine is a community tool created by Docker that allows users and administrators to easily deploy Docker Engine on hosts. It was developed in order to fast provision Docker Engine on cloud providers such as Azure and AWS, but it evolved to offer other implementations, and nowadays, it is possible to use many different drivers for many different environments. We can use docker-machine to deploy docker-engine on VMWare (over Cloud Air, Fusion, Workstation, or vSphere), Microsoft Hyper-V, and OpenStack, among others. It is also very useful for quick labs, or demonstration and test environments on VirtualBox or KVM, and it even allows us to provision docker-engine software using SSH. docker-machine runs on Windows and Linux, and provides an integration between client and provisioned Docker host daemons. This way, we can interact with its Docker daemon remotely, without being connected using SSH, for example.

On the other hand, Docker Compose is a tool that will allow us to run multi-container applications on a single host. We will just introduce this concept here in relation to multi-service applications that will run on Swarm or Kubernetes clusters. We will learn about docker-compose in Chapter 5, Deploying Multi-Container Applications.

 

Building, shipping, and running workflows

Docker provides the tools for creating images (templates for containers, remember), distributing those images to systems other than the one used for building the image, and finally, running containers based on these images:

Docker Engine will participate in all workflow steps, and we can use just one host or many during these processes, including our developers' laptops.

Let's provide a quick review of the usual workflow processes.

Building

Building applications using containers is easy. Here are the standard steps:

  1. The developer usually codes an application on their own computer.
  2. When the code is ready, or there is a new release, new functionalities, or a bug has simply been fixed, a commit is deployed.
  3. If our code has to be compiled, we can do it at this stage. If we are using an interpreted language for our code, we will just add it to the next stage.
  4. Either manually or using continuous integration orchestration, we can create a Docker image integrating compiled binary or interpreted code with the required runtime and all its dependencies. Images are our new component artifacts.

We have passed the building stage and the built image, with everything included, must be deployed to production. But first, we need to ensure its functionality and health (Will it work? How about performance?). We can do all these tests on different environments using the image artifact we created.

Shipping

Sharing created artifacts is easier with containers. Here are some of the new steps:

  1. The created image is on our build host system (or even on our laptop). We will push this artifact to an image registry to ensure that it is available for the next workflow processes.
  2. Docker Enterprise provides integrations on Docker Trusted Registry to follow separate steps from the first push, image scanning to look for vulnerabilities, and different image pulls from different environments during continuous integration stages.
  3. All pushes and pulls are managed by Docker Engine and triggered by Docker clients.

Now that the image has been shipped on different environments, during integration and performance tests, we need to launch containers using environment variables or configurations for each stage.

Running

So, we have new artifacts that are easy to share between different environments, but we need to execute them in production. Here are some of the benefits of containers for our applications:

  • All environments will use Docker Engine to execute our containers (processes), but that's all. We don't really need any portion of software other than Docker Engine to execute the image correctly (naturally, we have simplified this idea because we will need volumes and external resources in many cases).
  • If our image passed all the tests defined in the workflow, it is ready for production, and this step will be as simple as deploying the image built originally on the previous environment, using all the required arguments and environment variables or configurations for production.
  • If our environments were orchestration-managed using Swarm or Kubernetes, all these steps would have been run securely, with resilience, using internal load balancers, and with required replicas, among other properties, that this kind of platform provides.

As a summary, keep in mind that Docker Engine provides all the actions required for building, shipping, and running container-based applications.

 

Windows containers

Containers started with Linux, but nowadays, we can run and orchestrate containers on Windows. Microsoft integrated containers on Windows in Windows 2016. With this release, they consolidated a partnership with Docker to create a container engine that runs containers natively on Windows.

After a few releases, Microsoft decided to have two different approaches to containers on Windows, these being the following:

  • Windows Server Containers (WSC), or process containers
  • Hyper-V Containers

Because of the nature of Windows operating system implementation, we can share kernels but we can't isolate processes from the system services and DLLs. In this situation, process containers need a copy of the required system services and many DLLs to be able to make API calls to the underlying host operating system. This means that containers that use process container isolation will run with many system processes and DLLs inside. In this case, images are very big and will have a different kind of portability; we will only be able to run Windows containers based on the same underlying operating system version.

As we have seen, process containers need to copy a portion of the underlying operating system inside in order to run. This means that we can only run the same operating system containers. For example, running containers on top of Windows Server 2016 will require a Windows Server 2016 base image.

On the other hand, Hyper-V containers will not have these limitations because they will run on top of a virtualized kernel. This adds overhead, but the isolation is substantially better. In this case, we won't be able to run these kinds of containers on older Microsoft Windows versions. These containers will use optimized virtualization to isolate the new kernel for our process.

The following diagram represents both types of MS Windows container isolation:

Process isolation is a default container isolation on Windows Server, but Windows 10 Pro and Enterprise will run Hyper-V isolation. Since the Windows 10 October 2018 update, we can choose to use old-style process isolation with the --isolation=process flag on Windows 10 Pro and Enterprise.
Please check the Windows operating system's portability because this is a very common problem on Windows containers.

Networking in Windows containers is different from Linux. The Docker host uses a Hyper-V virtual switch to provide connectivity to containers and connects them to virtual switches using either a host virtual interface (Windows Server containers) or a synthetic VM interface (Hyper-V containers).

 

Customizing Docker

Docker behavior can be managed at daemon and client levels. These configurations can be executed using command-line arguments, environment variables, or definitions on configuration files.

Customizing the Docker daemon

Docker daemon behavior is managed by various configuration files and variables:

  • key.json: This file contains a unique identifier for this daemon; in fact, it is the daemon's public key that uses the JSON web key format.
  • daemon.json: This is the Docker daemon configuration file. It contains all its parameters in JSON format. It has a key-value (or list of values) format in which all the daemon's flags will be available to modify its behavior. Be careful with configurations implemented on the systemd service file because they must not conflict with options set via the JSON file; otherwise, the daemon will fail to start.
  • Environment variables: HTTPS_PROXY, HTTP_PROXY, and NO_PROXY (or using lowercase) will manage the utilization of Docker daemon and the client behind the proxies. The configuration can be implemented in the Docker daemon systemd unit config files using, for example, /etc/systemd/system/docker.service.d/http-proxy.conf, and following the content for HTTPS_PROXY (the same configuration might be applied to HTTP_PROXY):
[Service]
Environment="HTTPS_PROXY=https://proxy.example.com:443/" "NO_PROXY=localhost,127.0.0.1,docker-registry.example.com,.corp"

Be careful with the key.json file while cloning virtual machines because using the same keys on different daemons will result in strange behaviors. This file is owned by system administrators, so you will need to use a privileged user to review its content. This JSON file contains Docker Daemon's certificate in JSON Web Key format. We can just review the key.json file's content using the cat and jq commands (jq is not required, but I used it to format output. This command will help with JSON files or JSON output):

$ sudo cat /etc/docker/key.json |jq
{
"crv": "P-256",
"d": "f_RvzIUEPu3oo7GLohd9cxqDlT9gQyXSfeWoOnM0ZLU",
"kid": "QP6X:5YVF:FZAC:ETDZ:HOHI:KJV2:JIZW:
IG47:3GU6:YQJ4:YRGF:VKMP",

"kty": "EC",
"x": "y4HbXr4BKRi5zECbJdGYvFE2KtMp9DZfPL81r_qe52I",
"y": "ami9cOOKSA8joCMwW-y96G2mBGwcXthYz3FuK-mZe14"
}

The daemon configuration file, daemon.json, will be located by default at the following locations:

  • /etc/docker/daemon.json on Linux systems
  • %programdata%\docker\config\daemon.json on Windows systems

In both cases, the configuration file's location can be changed using --config-file to specify a custom non-default file.

Let's provide a quick review of the most common and important flags or keys we will configure for Docker daemon. Some of these options are so important that they are usually referenced in the Docker Certified Associate exam. Don't worry; we will learn about the most important ones, along with their corresponding JSON keys, here:

Daemon argument JSON key Argument description
-b, --bridge string bridge Attach containers to a network bridge. This option allows us to change the default bridge behavior. In some cases, it's useful to create your own bridge interfaces and use the Docker daemon attached to one of them.
--cgroup-parent string cgroup-parent Set the parent cgroup for all containers.
-D, --debug debug This option enables debug mode, which is fundamental to resolving issues. Usually, it's better to stop Docker service and run the Docker daemon by hand using the -D option to review all dockerd debugging events.
--data-root string data-root This is the root directory of the persistent Docker state (default /var/lib/docker). With this option, we can change the path to store all Docker data (Swarm KeyValue, images, internal volumes, and so on).
--dns list dns This is the DNS server to use (default []). These three options allow us to change the container DNS behavior, for example, to use a specific DNS for the container environment.
--dns-opt list dns-opt These are the DNS options to use (default []).
--dns-search list dns-search These are the DNS search domains to use (default []).
--experimental experimental This enables experimental features; don't use it in production.
-G, --group string group This is the group for the Unix socket (default docker).
-H, --host list host This is the option that allows us to specify the socket(s) to use.
--icc icc This enables inter-container communication (default true). With this option, we can disable any container's internal communications.
--ip IP ip This is the default IP when binding container ports (default 0.0.0.0). With this option, we can ensure that only specific subnets will have access to container-exposed ports.
--label list label Set key=value labels to the daemon (default []). With labels, we can configure environment properties for container location when using a cluster of hosts. There is a better tagging method you can use when using Swarm, as we will learn in Chapter 8, Orchestration Using Docker Swarm.
--live-restore live-restore This enables the live restoration of Docker when containers are still running.
--log-driver string log-driver This is the default driver for container logs (default json-file) if we need to use an external log manager (ELK framework or just a Syslog Server, for example).
-l, --log-level string log-level This sets the logging level (debug, info, warn, error, fatal) (default info).
--seccomp-profile string seccomp-profile This is the path to the seccomp profile if we want to use anything other than the default option.
--selinux-enabled selinux-enabled Enables SELinux support. This option is crucial for production environments using Red Hat Linux/CentOS. It is disabled by default.
-s, --storage-driver string storage-driver This is the storage driver to use. This argument allows us to change the default driver selected by Docker. In the latest versions, we will use overlay2 because of its stability and performance. Other options include aufs, btrfs, and devicemapper.
--storage-opt list storage-opts Storage driver options (default []). Depending on the storage driver used, we will need to add options as arguments, for example, using devicemapper or for specifying a maximum container size on overlay2 or Windows filter (MS Windows copy-on-write implementation).
--tls tls This option enables TLS encryption between client and server (implied by --tlsverify).
--tlscacert string tlscacert Trust certs signed only by this CA (default ~/.docker/ca.pem).
--tlscert string tlscert This is the path to the TLS certificate file (default ~/.docker/cert.pem).
--tlskey string tlskey This is the path to the TLS key file (default ~/.docker/key.pem).
--tlsverify tlsverify Use TLS and verify the remote.

Logging information in container environments can be deployed using different layers of knowledge. As shown in the previous table, Docker daemon has its own logging configuration using --log-driver. This configuration will be applied to all containers by default if we do not specify any configuration during container execution. Therefore, we can redirect all container logs to some remote logging system using the ELK framework, for example (https://www.elastic.co/es/what-is/elk-stack), while some specific containers can be redirected to another logging backend. This can also be applied locally using different logging drivers.

Docker client customization

The client will store its configuration under the users' home directory on .docker. There is a config file where the Docker client will look for its configurations ($HOME/.docker/config.json on Linux or %USERPROFILE%/.docker/config.json on Windows). In this file, we will set a proxy for our containers if it's needed to connect to the internet or other external services, for example.

If we need to pass proxy settings to containers upon startup, we will configure the proxies key in .docker/config.json for our user, for example, using my-company-proxy:

"proxies":
{
"default":
{
"httpProxy": "http://my-company-proxy:3001",
"httpsProxy": "http://my-company-proxy:3001",
"noProxy": "*.test.example.com,.example2.com"
}
}

These configurations can be added as arguments when starting up the Docker container, as follows:

--env HTTP_PROXY="http://my-company-proxy:3001"
--env HTTPS_PROXY="https://my-company-proxy:3001"
--env NO_PROXY="*.test.example.com,.example2.com"

We will see what "environment option" means in Chapter 3, Running Docker Containers. Just keep in mind that, sometimes, our corporate environment will need applications to use proxies and that there are methods to configure these settings, either as user variables or using client configurations.

Other client features, such as experimental flags or output formatting, will be configured in the config.json file. Here is an example of some configurations:

{
"psFormat": "table {{.ID}}\\t{{.Image}}\\t{{.Command}}\\t{{.Labels}}",
"imagesFormat": "table {{.ID}}\\t{{.Repository}}\\t{{.Tag}}\\t{{.CreatedAt}}",
"statsFormat": "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}"
}
 

Docker security

There are many topics related to container security. In this chapter, we will review the ones related to the container runtime.

As we have seen, Docker provides a client-server environment. From the client side, there are a few things that will improve the way in which we will be able to access the environment.

Configuration files and certificates for different clusters on hosts must be secured using filesystem security at the operating system level. However, as you should have noticed, a Docker client always needs a server in order to do anything with containers. Docker client is just the tool to connect to servers. With this picture in mind, client-server security is a must. Now, let's take a look at different kinds of access to the Docker daemon.

Docker client-server security

The Docker daemon will listen on system sockets (unix, tcp, and fd). We have seen that we can change this behavior and that, by default, the daemon will listen on the /var/run/docker.sock local Unix socket.

Giving users RW access to /var/run/docker.sock will add access to the local Docker daemon. This allows them to create images, run containers (even privileged, root user containers, and mount local filesystems inside them), create images, and more. It is very important to know who can use your Docker engine. If you deployed a Docker Swarm cluster, this is even worse because if the accessed host has a master role, the user will be able to create a service that will run containers across the entirety of the cluster. So keep your Docker daemon socket safe from non-trusted users and only allow authorized ones (in fact, we will look at other advanced mechanisms to provide secure user access to the container platform).

Docker daemon is secure by default because it does not export its service. We can enable remote TCP accesses by adding -H tcp://<HOST_IP> to the Docker daemon start process. By default, port 2375 will be used. If we use 0.0.0.0 as the host IP address, Docker daemon will listen on all interfaces.

We can enable remote access to Docker daemon using a TCP socket. By default, communication will not be secure and the daemon will listen on port 2375. To ensure that the client-to-daemon connection is encrypted, you will need to use either a reverse proxy or built-in TLS-based HTTPS encrypted socket. We can allow the daemon to listen on all host interface IP addresses or just one using this IP when starting the daemon. To use TLS-based communications, we need to follow this procedure (assuming your server hostname is in the $HOST variable):

  1. Create a certificate authority (CA). The following commands will create its private and public keys:
$ openssl genrsa -aes256 -out ca-key.pem 4096
Generating RSA private key, 4096 bit long modulus
............................................................................................................................................................................................++
........++
e is 65537 (0x10001)
Enter pass phrase for ca-key.pem:
Verifying - Enter pass phrase for ca-key.pem:
$ openssl req -new -x509 -days 365 -key ca-key.pem -sha256 -out ca.pem
Enter pass phrase for ca-key.pem:
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:
State or Province Name (full name) [Some-State]:Queensland
Locality Name (eg, city) []:Brisbane
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Docker Inc
Organizational Unit Name (eg, section) []:Sales
Common Name (e.g. server FQDN or YOUR name) []:$HOST
Email Address []:Sven@home.org.au
  1. Create a server CA-signed key, ensuring that the common name matches the hostname you use to connect to Docker daemon from the client:
$ openssl genrsa -out server-key.pem 4096
Generating RSA private key, 4096 bit long modulus
.....................................................................++
.................................................................................................++
e is 65537 (0x10001)

$ openssl req -subj "/CN=$HOST" -sha256 -new -key server-key.pem -out server.csr
$ echo subjectAltName = DNS:$HOST,IP:10.10.10.20,IP:127.0.0.1 >> extfile.cnf
$ echo extendedKeyUsage = serverAuth >> extfile.cnf
$ openssl x509 -req -days 365 -sha256 -in server.csr -CA ca.pem -CAkey ca-key.pem \
-CAcreateserial -out server-cert.pem -extfile extfile.cnf

Signature ok
subject=/CN=your.host.com
Getting CA Private Key
Enter pass phrase for ca-key.pem:
  1. Start Docker daemon with TLS enabled and use arguments for the CA, server certificate, and CA-signed key. This time, Docker daemon using TLS will run on port 2376 (which is standard for the daemon TLS):
$ chmod -v 0400 ca-key.pem key.pem server-key.pem
$ chmod -v 0444 ca.pem server-cert.pem cert.pem
$ dockerd --tlsverify --tlscacert=ca.pem --tlscert=server-cert.pem --tlskey=server-key.pem \
-H=0.0.0.0:2376
  1. Using the same CA, create a client CA-signed key, specifying that this key will be used for client authentication:
$ openssl genrsa -out key.pem 4096
Generating RSA private key, 4096 bit long modulus
.........................................................++
................++
e is 65537 (0x10001)
$ openssl req -subj '/CN=client' -new -key key.pem -out client.csr
$ echo extendedKeyUsage = clientAuth > extfile-client.cnf
$ openssl x509 -req -days 365 -sha256 -in client.csr -CA ca.pem -CAkey ca-key.pem \
-CAcreateserial -out cert.pem -extfile extfile-client.cnf
Signature ok
subject=/CN=client
Getting CA Private Key
Enter pass phrase for ca-key.pem:
  1. We will move generated client certificates to the client's host (the client's laptop, for example). We will also copy the public CA certificate file. With its own client certificates and the CA, we will be able to connect to a remote Docker daemon using TLS to secure the communications. We will use the Docker command line with --tlsverify and other arguments to specify the server's same CA, the client certificate, and its signed key (the daemon's default port for TLS communications is 2376). Let's review an example using docker version:
$ docker --tlsverify --tlscacert=ca.pem --tlscert=cert.pem --tlskey=key.pem -H=$HOST:2376 version

All these steps should be done to provide TLS communications, and steps 4 and 5 should be undertaken for all client connections if we want to identify their connections (if you don't want to use a unique client certificate/key pair, for example). On enterprise environments, with hundreds or even thousands of users, this is ungovernable and Docker Enterprise will provide a better solution with all these steps included automatically, thereby providing granulated accesses.

Since Docker version 18.09, we can interact with Docker daemon using the $ docker -H ssh://me@example.com:22 ps command, for example. To use the SSH connection, you need to set up an ssh public key authentication.

Docker daemon security

Docker container runtime security is based on the following:

  • Security provided by the kernel to containers
  • The attack surface of the runtime itself
  • Operating system security applied to the runtime

Let's take a look at these in more detail.

Namespaces

We have been talking about kernel namespaces and how they implement the required isolation for containers. Every container runs with the following namespaces:

  • pid: Process isolation (Process ID PID)
  • net: Manages network interfaces (Networking NET)
  • ipc: Manages access to IPC resources (InterProcess Communication IPC)
  • mnt: Manages filesystem mount points (Mount MNT)
  • uts: Isolates kernel and version identifiers (Unix Timesharing System UTS)

As each container runs with its own pid namespace, it will only have access to the listed process on this namespace. The net namespace will provide its own interfaces, which will allow us to start many processes using the same port on different containers. Container visibility is enabled by default. All containers will have access to external networks using host bridge interfaces.

A complete root filesystem will be inside each container, and it will use this as a standard Unix filesystem (with its own /tmp, and network files such as /etc/hosts and /etc/resolv.conf). This dedicated filesystem is based on copy-on-write, using different layers from images.

Namespaces provide layers of isolation for the container, and control groups will manage how many resources will be available for the container. This will ensure that the host will not get exhausted. In multi-tenant environments, or just for production, it is very important to manage the resources of containers and to not allow non-limited containers.

The attack surface of the daemon is based on user access. By default, Docker daemon does not provide any role-based access solution, but we have seen that we can ensure an encrypted communication for external clients.

As Docker daemon runs as root (the experimental mode will allow us to run rootless), all containers will be able to, for example, mount any directory on your host. This can be a real problem and that is why it's so important to ensure that only required users have access to the Docker socket (local or remote).

As we will see in Chapter 3, Running Docker Containers, containers will run as root if we don't specify a user on image building or container startup. We will review this topic later and improve this default user usage.

It is recommended to run just Docker daemon on server-dedicated hosts because Docker can be dangerous in the wrong hands when it comes to other services running on the same host.

User namespace

As we've already seen, Linux namespaces provide isolation for processes. These processes just see what cgroups and these namespaces offer, and for these processes, they are running along on their own.

We always recommend running processes inside containers as non-root users (nginx, for example, does not require root to be running if we use upper ports), but there are some cases where they must be run as root. To prevent privilege escalation from within these root containers, we can apply user remapping. This mechanism will map a root user (UID 0) inside the container, with the user's non-root (UID 30000).

User remapping is managed by two files:

  • /etc/subid: This sets the user ID range for subordinates.
  • /etc/subgid: This sets the group ID range for subordinates.

With these files, we set the first sequence ID for users and groups, respectively. This is an example format for the subordinate ID, nonroot:30000:65536. This means that UID 0 inside the container will be mapped as UID 30000 on the Docker host and so forth.

We will configure Docker daemon to use this user remapping with the --userns-remap flag or the userns-remap key in JSON format. In special cases, we can change the user namespace behavior when running the container.

Kernel capabilities (seccomp)

By default, Docker starts containers with a restricted set of capabilities. This means that containers will run unprivileged by default. So, running processes inside containers improves application security by default.

These are the 14 capabilities available by default to any container running in your system: SETPCAP, MKNOD, AUDIT_WRITE, CHOWN, NET_RAW, DAC_OVERRIDE, FOWNER, FSETID, KILL, SETGID, SETUID, NET_BIND_SERVICE, SYS_CHROOT, and SETFCAP.

The most important thing to understand at this point is that we can run processes inside a container listening on ports under 1024 because we have NET_BIND_SERVICE capability, for example, or that we can use ICMP inside containers because we have NET_RAW capability enabled.

On the other hand, there are many capabilities not enabled by default. For example, there are many system operations that will need SYS_ADMIN capability, or we will need NET_ADMIN capability to create new interfaces (running openvpn inside Docker containers will require it).

Processes will not have real root privileges inside containers. Using seccomp capabilities, it is possible to do the following:

  • Deny mount operations
  • Deny access to raw sockets (to prevent packet spoofing)
  • Deny access to some filesystem operations, such as file ownership
  • Deny module loading, and many others

The permitted capabilities are defined using a default seccomp profile. Docker uses seccomp in filter mode, disabling all non-whitelisted calls defined on its own JSON format in profile files. There is a default profile that will be used when running containers. We can use our own seccomp profile using the --security-opt flag on launch. So, manipulating allowed capabilities is easy during container execution. We will learn more about how to manipulate the behavior of any container at the start of Chapter 3, Running Docker Containers:

$ docker container run --cap-add=NET_ADMIN--rm -it --security-opt seccomp=custom-profile.json alpine sh

This line will run our container, adding the NET_ADMIN capability. Using a custom seccomp profile, we will be adding even more, as defined on custom-profile.json. For security reasons, we can even use --cap-drop to drop some of the default capabilities if we are sure that we don't need them.

Avoid using the --privileged flag as your container will run unconfined, which means that it will run nearly with the same access to the host as processes running outside containers on the host. In this case, resources will be unlimited for this container (the SYS_RESOURCE capability will be enabled and limit flags will not be used). Best practice for users would be to remove all capabilities except those required by the process to work.

Linux security modules

Linux operating systems provide tools to ensure security. In some cases, they come installed and configured by default in out-of-the-box installations, while in other cases, they will require additional administrator interaction.

AppArmor and SELinux are probably the most common. Both provide finer-grained control over file operations and other security features. For example, we can ensure that only the allowed process can modify some special files or directories (for example, /etc/passwd).

Docker provides templates and policies that are installed with the product that ensures complete integration with these tools to harden Docker hosts. Never disable SELinux or AppArmor on production and use policies to add features or accesses for your processes.

We can review which security modules are enabled in our Docker runtime by looking at the SecurityOptions section of the Docker system info output.

We can easily review Docker runtime features using docker system info. It is good to know that the output can be displayed in JSON format using docker system info --format '{{json .}}' and that we can filter by using the --filter option. Filtering allows us, for example, to retrieve only security options applied to the docker system info --format '{{json .SecurityOptions}}' daemon.

By default, Red Hat flavor hosts will not have SELinux enabled, but, on the other hand, Ubuntu will run by default with AppArmor.

There is a very common issue when we move the default Docker data root path to another location in Red Hat Linux. If SELinux is enabled (by default on these systems), you will need to add a new path to the allowed context by using # semanage fcontext -a -e /var/lib/docker _MY_NEW_DATA-ROOT_PATH and then # restorecon -R -v _MY_NEW_DATA-ROOT_PATH.

Docker Content Trust

Docker Content Trust is the mechanism provided by Docker to improve content security. It will provide image ownership and verification of immutability. This option, which is applied at Docker runtime, will help to harden content execution. We can ensure that only certain images can run on Docker hosts. This will provide two different levels of security:

  • Only allow signed images
  • Only allow signed images by certain users or groups/teams (we will learn about the concepts that are integrated with Docker UCP in Chapter 11, Universal Control Plane)

We will learn about volumes, which are the objects used for container persistent storage, in Chapter 4, Container Persistency and Networking.

Enabling and disabling Docker Content Trust can be managed by setting the DOCKER_CONTENT_TRUST=1 environment variable in a client session, in the systemd Docker unit. Alternatively, we can use --disable-content-trust=false (true by default) on image and container operations.

With any of these flags enabling content trust, all Docker operations will be trusted, which means that we won't be able to download and execute any non-trusted flags (signed images).

 

Chapter labs

We will use CentOS 7 as the operating system for the node labs in this book, unless otherwise indicated. We will install Docker Community Edition now and Docker Enterprise for the specific chapters pertaining to this platform.

Deploy environments/standalone-environment from this book's GitHub repository (https://github.com/PacktPublishing/Docker-Certified-Associate-DCA-Exam-Guide.git) if you have not done so yet. You can use your own CentOS 7 server. Use vagrant up from the environments/standalone-environment folder to start your virtual environment.

If you are using a standalone environment, wait until it is running. We can check the statuses of the nodes using vagrant status. Connect to your lab node using vagrant ssh standalone. standalone is the name of your node. You will be using the vagrant user with root privileges using sudo. You should get the following output:

Docker-Certified-Associate-DCA-Exam-Guide/environments/standalone$ vagrant up
Bringing machine 'standalone' up with 'virtualbox' provider...
==> standalone: Cloning VM...
==> standalone: Matching MAC address for NAT networking...
==> standalone: Checking if box 'frjaraur/centos7' version '1.4' is up to date...
==> standalone: Setting the name of the VM: standalone
...
==> standalone: Running provisioner: shell...
standalone: Running: inline script
standalone: Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
Docker-Certified-Associate-DCA-Exam-Guide/environments/standalone$ vagrant status
Current machine states:
standalone running (virtualbox)
...
Docker-Certified-Associate-DCA-Exam-Guide/environments/standalone$

We can now connect to a standalone node using vagrant ssh standalone. This process may vary if you've already deployed a standalone virtual node before and you just started it using vagrant up:

Docker-Certified-Associate-DCA-Exam-Guide/environments/standalone$ vagrant ssh standalone
[vagrant@standalone ~]$

Now, you are ready to start the labs.

Installing the Docker runtime and executing a "hello world" container

This lab will guide you through the Docker runtime installation steps and running your first container. Let's get started:

  1. To ensure that no previous versions are installed, we will remove any docker* packages:
[vagrant@standalone ~]$ sudo yum remove docker*
  1. Add the required packages by running the following command:
[vagrant@standalone ~]$ sudo yum install -y yum-utils   device-mapper-persistent-data   lvm2
  1. We will be using a stable release, so we will add its package repository, as follows:
[vagrant@standalone ~]$ sudo yum-config-manager \
--add-repo https://download.docker.com/linux/centos/docker-ce.repo
  1. Now, install Docker packages and containerd. We are installing the server and client on this host (since version 18.06, Docker provides different packages for docker-cli and Docker daemon):
[vagrant@standalone ~]$ sudo yum install -y docker-ce docker-ce-cli containerd.io
  1. Docker is installed, but on Red Hat-like operating systems, it is not enabled on boot by default and will not start. Verify this situation and enable and start the Docker service:
[vagrant@standalone ~]$ sudo systemctl enable docker
[vagrant@standalone ~]$ sudo systemctl start docker
  1. Now that Docker is installed and running, we can run our first container:
[vagrant@standalone ~]$ sudo docker container run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
1b930d010525: Pull complete
Digest:
sha256:b8ba256769a0ac28dd126d584e0a2011cd2877f3f76e093a7ae560f2a5301c00
Status: Downloaded newer image for hello-world:latest

Hello from Docker!

This message shows that your installation appears to be working correctly. To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.

2. The Docker daemon pulled the "hello-world" image from the Docker Hub. (amd64)
3. The Docker daemon created a new container from that image that runs the executable, which produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/.


For more examples and ideas, visit:
https://docs.docker.com/get-started/.

This command will send a request to Docker daemon to run a container based on the hello-world image, located on Docker Hub (http://hub.docker.com). To use this image, Docker daemon downloads all the layers if we have not executed any container with this image before; in other words, if the image is not present on the local Docker host. Once all the image layers have been downloaded, Docker daemon will start a hello-world container.

This book is a guide for the DCA exam and is the simplest lab we can easily deploy. However, you should be able to understand and describe this simple process, as well as think about all the common issues that we may encounter. For example, what happens if the image is on your host and is different, but with the same name and tags? What happens if one layer cannot be downloaded? What happens if you are connected to a remote daemon? We will review some of these questions at the end of this chapter.
  1. As you should have noticed, we are always using sudo to root because our user has not got access to the Docker UNIX socket. This is the first security layer an attacker must bypass on your system. We usually enable a user to run containers in production environments because we want to isolate operating system responsibilities and management from Docker. Just add our user to the Docker group, or add a new group of users with access to the socket. In this case, we will just add our lab user to the Docker group:
[vagrant@standalone ~]$ docker container ls
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.40/containers/json
: dial unix /var/run/docker.sock: connect: permission denied

[vagrant@standalone ~]$ sudo usermod -a -G docker $USER

[vagrant@standalone ~]$ newgrp docker

[vagrant@standalone ~]$ docker container ls -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5f7abd49b3e7 hello-world "/hello" 19 minutes ago Exited (0) 19 minutes ago festive_feynman

Docker runtime processes and namespace isolation

In this lab, we are going to review what we learned about process isolation and Docker daemon components and execution workflow. Let's get started:

  1. Briefly review the Docker systemd daemon:
[vagrant@standalone ~]$ sudo systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: active (running) since sáb 2019-09-28 19:34:30 CEST; 25min ago
Docs: https://docs.docker.com
Main PID: 20407 (dockerd)
Tasks: 10
Memory: 58.9M
CGroup: /system.slice/docker.service
└─20407 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

sep 28 19:34:30 centos7-base dockerd[20407]: time="2019-09-28T19:34:30.222200934+02:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
sep 28 19:34:30 centos7-base dockerd[20407]: time="2019-09-28T19:34:30.234170886+02:00" level=info msg="Loading containers: start."
sep 28 19:34:30 centos7-base dockerd[20407]: time="2019-09-28T19:34:30.645048459+02:00" level=info msg="Default bridge (docker0) is assigned with an IP a... address"
sep 28 19:34:30 centos7-base dockerd[20407]: time="2019-09-28T19:34:30.806432227+02:00" level=info msg="Loading containers: done."
sep 28 19:34:30 centos7-base dockerd[20407]: time="2019-09-28T19:34:30.834047449+02:00" level=info msg="Docker daemon" commit=6a30dfc graphdriver(s)=over...n=19.03.2
sep 28 19:34:30 centos7-base dockerd[20407]: time="2019-09-28T19:34:30.834108635+02:00" level=info msg="Daemon has completed initialization"
sep 28 19:34:30 centos7-base dockerd[20407]: time="2019-09-28T19:34:30.850703030+02:00" level=info msg="API listen on /var/run/docker.sock"
sep 28 19:34:30 centos7-base systemd[1]: Started Docker Application Container Engine.
sep 28 19:34:43 centos7-base dockerd[20407]: time="2019-09-28T19:34:43.558580560+02:00" level=info msg="ignoring event" module=libcontainerd namespace=mo...skDelete"
sep 28 19:34:43 centos7-base dockerd[20407]: time="2019-09-28T19:34:43.586395281+02:00" level=warning msg="5f7abd49b3e75c58922c6e9d655d1f6279cf98d9c325ba2d3e53c36...

This output shows that the service is using a default systemd unit configuration and that dockerd is using the default parameters; that is, it's using the file descriptor socket on /var/run/docker.sock and the default docker0 bridge interface.

  1. Notice that dockerd uses a separate containerd process to execute containers. Let's run some containers in the background and review their processes. We will run a simple alpine with an nginx daemon:
[vagrant@standalone ~]$ docker run -d nginx:alpine
Unable to find image 'nginx:alpine' locally
alpine: Pulling from library/nginx
9d48c3bd43c5: Already exists
1ae95a11626f: Pull complete
Digest: sha256:77f340700d08fd45026823f44fc0010a5bd2237c2d049178b473cd2ad977d071
Status: Downloaded newer image for nginx:alpine
dcda734db454a6ca72a9b9eef98aae6aefaa6f9b768a7d53bf30665d8ff70fe7
  1. Now, we will look for the nginx and containerd processes (process IDs will be completely different on your system; you just need to understand the workflow):
[vagrant@standalone ~]$ ps -efa|grep -v grep|egrep -e containerd -e nginx  
root 15755 1 0 sep27 ? 00:00:42 /usr/bin/containerd
root 20407 1 0 19:34 ? 00:00:02 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
root 20848 15755 0 20:06 ? 00:00:00 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/dcda734db454a6ca72a9
b9eef98aae6aefaa6f9b768a7d53bf30665d8ff70fe7 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc
root 20863 20848 0 20:06 ? 00:00:00 nginx: master process nginx -g daemon off;
101 20901 20863 0 20:06 ? 00:00:00 nginx: worker process
  1. Notice that, at the end, the container started from 20848 PID. Following the runtime-runc location, we discover state.json, which is the container state file:
[vagrant@standalone ~]$ sudo ls -laRt /var/run/docker/runtime-runc/moby
/var/run/docker/runtime-runc/moby:
total 0
drwx--x--x. 2 root root 60 sep 28 20:06 dcda734db454a6ca72a9b9eef98aae6aefaa6f9b768a7d53bf30665d8ff70fe7
drwx------. 3 root root 60 sep 28 20:06 .
drwx------. 3 root root 60 sep 28 13:42 ..
/var/run/docker/runtime-runc/moby/dcda734db454a6ca72a9b9eef98aae6aefaa6f9b768a7d53bf30665d8ff70fe7:
total 28
drwx--x--x. 2 root root 60 sep 28 20:06 .
-rw-r--r--. 1 root root 24966 sep 28 20:06 state.json
drwx------. 3 root root 60 sep 28 20:06 ..

This file contains container runtime information: PID, mounts, devices, capabilities applied, resources, and more.

  1. Our NGINX server runs under PID 20863 and the nginx child process with PID 20901 on the Docker host, but let's take a look inside:
[vagrant@standalone ~]$ docker container exec dcda734db454 ps -ef
PID USER TIME COMMAND
1 root 0:00 nginx: master process nginx -g daemon off;
6 nginx 0:00 nginx: worker process
7 root 0:00 ps -ef

Using docker container exec, we can run a new process using a container namespace. This is like running a new process inside the container.

As you can observe, inside the container, nginx has PID 1 and it is the worker process parent. And, of course, we see our command, ps -ef, because it was launched using its namespaces.

We can run other containers using the same image and we will obtain the same results. Processes inside each container are isolated from other containers and host processes, but users on the Docker host will see all the processes, along with their real PIDs.

  1. Let's take a look at nginx process namespaces. We will use the lsns command to review all the host-running process's namespaces. We will obtain a list of all running processes and their namespaces. We will look for nginx processes (we will not use grep to filter the output because we want to read the headers):
[vagrant@standalone ~]$ sudo lsns
NS TYPE NPROCS PID USER COMMAND
..............
..............
4026532197 mnt 2 20863 root nginx: master process nginx -g daemon off
4026532198 uts 2 20863 root nginx: master process nginx -g daemon off
4026532199 ipc 2 20863 root nginx: master process nginx -g daemon off
4026532200 pid 2 20863 root nginx: master process nginx -g daemon off
4026532202 net 2 20863 root nginx: master process nginx -g daemon off

This lab demonstrated process isolation within a process running inside containers.

Docker capabilities

This lab will cover seccomp capability management. We will launch containers using dropped capabilities to ensure that, by using seccomp to avoid some system calls, processes in containers will only execute allowed actions. Let's get started:

  1. First, run a container using the default allowed capabilities. During the execution of this alpine container, we will change the ownership of the /etc/passwd file:
[vagrant@standalone ~]$ docker container run --rm -it alpine sh -c "chown nobody /etc/passwd; ls -l /etc/passwd"
-rw-r--r-- 1 nobody root 1230 Jun 17 09:00 /etc/passwd

As we can see, there is nothing to stop us from changing whatever file ownership resides inside the container's filesystem because the main process (in this case, /bin/sh) runs as the root user.

  1. Drop all the capabilities. Let's see what happens:
[vagrant@standalone ~]$ docker container run --rm -it --cap-drop=ALL alpine sh -c "chown nobody /etc/passwd; ls -l /etc/passwd"
chown: /etc/passwd: Operation not permitted
-rw-r--r-- 1 root root 1230 Jun 17 09:00 /etc/passwd

You will observe that the operation was forbidden. Since containers run without any capabilities, the chown command is not allowed to change file ownership.

  1. Now, just add the CHOWN capability to allow a change of ownership for files inside the container:
[vagrant@standalone ~]$ docker container run --rm -it --cap-drop=ALL --cap-add CHOWN alpine sh -c "chown nobody /etc/passwd; ls -l /etc/passwd"
-rw-r--r-- 1 nobody root 1230 Jun 17 09:00 /etc/passwd
 

Summary

In this chapter, we have seen how modern applications are based on microservices. We learned what containers are and their benefits, and how microservices and containers match when we associate a process with specific functionality or a task (microservice) and we run it inside a container. We reviewed container concepts. Then, we talked about images, containers, and the mechanisms that isolate processes from the host. We introduced orchestration and registries as requirements for deploying applications with resilience on cluster environments and the ways in which we can manage images.

We then have learned about Docker's main components and how Docker Client interacts with Docker Engine securely. We introduced the most common Docker objects and the workflow we will use to create, share, and deploy new applications based on containers.

Nowadays, we can use containers on Microsoft Windows, but this all started with Linux. We compared both approaches to understand the similarities and differences between them and the advanced methods used to isolate processes on Windows using Hyper-V.

Finally, we reviewed how to configure Docker Engine using JSON files and environment variables, learned that containers are secure by default, and reviewed the different mechanisms used to accomplish this.

In the next chapter, we will build images using different methods and learn the processes and primitives necessary to create good images.

 

Questions

  1. Is it true that we can only run one process per container? (select which sentences are true)

a) We cannot execute more than one process per container. This is a limitation.
b) We can run more than one process per container, but it is not recommended.
c) We will only run one process per container to follow microservices logic.
d) All of the above sentences are false.

  1. What kernel facilities provide host CPU resource isolation on containers?

a) Kernel namespaces.
b) Cgroups (control groups).
c) Kernel domains.
d) None of them. It is not possible to isolate host resources.

  1. Which of the following sentences are true?

a) All containers will run as root by default.
b) The user namespace will allow us to map UID 0 to another one on our host system, controlled and without any non-required privileges.
c) As the Docker daemon runs as root, only root users can run containers on Docker hosts.
d) All of the above sentences are false.

  1. What have we learned about Windows Docker hosts?

a) Linux containers can run on Windows hosts too.
b) Windows Hyper-V containers will run a small virtual machine, providing the required resources for containers and do not have any Windows operating system dependencies.
c) Windows Process Isolation requires system DLLs and services on containers to run properly, and do not provide complete portability.
d) Windows images are bigger than Linux ones because Windows operating system component integrations are required in many cases to run even small processes.

  1. Which of the following sentences are true regarding the Docker daemon configuration?

a) We will configure Docker daemon on Linux using JSON format keys and values on /etc/docker/daemon.json or systemd unit files.
b) On Windows hosts, we will use %programdata%\docker\config\daemon.json to configure Docker daemon.
c) By default, the Docker client connection to the remote Docker daemon is insecure.
d) None of the above sentences are true.

 

Further reading

About the Author
  • Francisco Javier Ramírez Urea

    Francisco Javier Ramírez Urea has been working in the IT industry since 1999 after receiving his bachelor's degree in physical sciences. He worked as a systems administrator for years before specializing in the monitoring of networking devices, operating systems, and applications. In 2015, he started working with software container technologies as an application architecture consultant and has developed his career and evolved with these technologies ever since. He became a Docker Captain in 2018 alongside being a Docker and Kubernetes teacher with various certifications (CKA, CKS, Docker DCA and MTA, and RedHat RHCE/RHCSA). He currently works at the European Union Satellite Centre as a SecDevOps officer.

    Browse publications by this author
Latest Reviews (1 reviews total)
Excellent guide material and content
Docker Certified Associate (DCA): Exam Guide
Unlock this book and the full library FREE for 7 days
Start now