The containerization paradigm has been around in the IT industry for quite a long time in different forms. However, the streamlined packaging of mission-critical software applications inside highly optimized and organized containers, to be efficiently shipped, deployed, and delivered from any environment (virtual machines or bare-metal servers, local or remote, and so on), has got a new lease of life with the surging popularity of the Docker platform. The open source Docker project has resulted in an advanced engine for automating most of the activities associated with building Docker images and creating self-defined and self-sufficient, yet collaborative, application-hosting containers. Further on, the Docker platform has come out with additional modules to simplify container discovery, networking, orchestration, security, registries, and so on. In short, the end-to-end life cycle management of Docker containers is being streamlined, signaling a sharp reduction in the workloads of application developers and system administrators. In this article, we want to highlight the Docker paradigm and its capabilities for bringing forth newer possibilities and fresh opportunities.
In this article by Pethuru Raj, Jeeva S. Chelladhurai, Vinod Singh, authors of the book Learning Docker, we will cover the reason behind the Docker platform's popularity.
Docker is an open platform for developers and system administrators of distributed applications. Building and managing distributed applications is beset with a variety of challenges, especially in the context of pervasive cloud environments. Docker brings forth a bevy of automation through the widely leveraged abstraction technique to successfully enable distributed applications. The prime point is that the much-talked-about simplicity is easily achieved through the Docker platform. With just a single command, you can install and run many popular applications. For example, applications and platforms such as WordPress, MySQL, Nginx, and Memcached can all be installed and configured with one command each. Software and platform vendors are systematically containerizing their software packages so that they can be readily found, subscribed to, and used by many. Making changes to applications and publishing the updated and upgraded images to a repository is also made easier. That is the real and tectonic shift brought in by the Docker-sponsored containerization movement. There are other advantages to containerization as well, and they are explained in the subsequent sections.
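For instance, assuming the official nginx and mysql images on Docker Hub, the following one-liners pull the images (on first use) and start the services; the container names, the host port, and the password shown here are purely illustrative values:

    # Start an Nginx web server, publishing container port 80 on host port 8080
    docker run -d --name webserver -p 8080:80 nginx

    # Start a MySQL server; the root password value is just an example
    docker run -d --name database -e MYSQL_ROOT_PASSWORD=s3cr3t mysql

Each command fetches the image from Docker Hub if it is not already present locally and then launches a ready-to-use container in a matter of seconds.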
We have been fiddling with virtualization techniques and tools for quite a long time now in order to establish the much-demanded software portability. The inhibiting dependency between software and hardware is eliminated by leveraging virtualization, a kind of beneficial abstraction, through an additional layer of indirection. The idea is to run any software on any hardware. This is done by creating multiple virtual machines (VMs) out of a single physical server, with each VM having its own operating system (OS). Through this isolation, which is enacted through automated tools and controlled resource sharing, heterogeneous applications are accommodated on a single physical machine.
With virtualization, IT infrastructures become open; programmable; and remotely monitorable, manageable, and maintainable. Business workloads can be hosted in appropriately sized virtual machines and delivered to the outside world, ensuring broader and higher utilization. On the other hand, for high-performance applications, virtual machines across multiple physical machines can be readily identified and rapidly combined to meet any kind of high-performance need.
However, the virtualization paradigm has its own drawbacks. Because of its verbosity and bloatedness (every VM carries its own operating system), VM provisioning typically takes several minutes, and performance degrades due to the excessive consumption of compute resources. Furthermore, the much-publicized need for portability is not fully met by virtualization. Hypervisor software from different vendors gets in the way of application portability, and differences in OS and application distributions, versions, editions, and patches hinder smooth portability. Compute virtualization has flourished, whereas the closely associated concepts of network and storage virtualization are just taking off. Building distributed applications through VM interactions also involves some practical difficulties.
Let's move on to containerization. All of these barriers have contributed to the unprecedented success of the containerization idea. A container generally holds an application, and all of the application's libraries, binaries, and other dependencies are stuffed together to be presented as a comprehensive, yet compact, entity to the outside world. Containers are exceptionally lightweight, highly portable, and easily and quickly provisionable, and Docker containers achieve near-native system performance. The much-articulated DevOps goal is well served by application containers. As a best practice, it is recommended that every container hosts exactly one application or service.
The popular Docker containerization platform has come out with an enabling engine to simplify and accelerate the life cycle management of containers. There are industry-strength, open, and automated tools made freely available to facilitate container networking and orchestration. Thereby, producing and sustaining business-critical distributed applications is becoming easier. Business workloads are methodically containerized to be easily taken to cloud environments, and they are exposed for container crafters and composers to bring forth cloud-based software solutions and services. Precisely speaking, containers are turning out to be the most featured, favored, and fine-tuned runtime environment for IT and business services.
The concept of containerization for building software containers that bundle and sandbox software applications has been around for years at different levels. Consolidating all constituent and contributing modules into a single software package goes a long way toward faster and error-free provisioning, deployment, and delivery of software applications. Web application development and operations teams have been extensively handling a variety of open source as well as commercial-grade containers (for instance, servlet containers). These, however, are typically deployment and execution containers or engines that suffer from the ignominious issue of poor portability.
However, due to its unfailing versatility and ingenuity, the idea of containerization has picked up with renewed verve at a different level altogether, and today it is being positioned as the game-changing OS-level virtualization paradigm for achieving the elusive goal of software portability. There are a few OS-level virtualization technologies on different operating environments (for example, OpenVZ and Linux-VServer on Linux, FreeBSD jails, AIX workload partitions, and Solaris Containers). Containers have the innate potential to transform the way we run and scale applications systematically. They isolate and encapsulate our applications and platforms from the host system. A container typically looks like an OS within your host OS, and you can install and run applications in it. For all practical purposes, a container behaves like a virtual machine (VM), and Linux-centric containers in particular are enabled by various kernel-level features (cgroups and namespaces).
Because applications are logically isolated, users can run multiple versions of PHP, Python, Ruby, Apache, and so on on the same host, each cleanly tucked away in its own container. Applications are methodically containerized using the sandbox aspect (a subtle and smart isolation technique) to eliminate all kinds of constricting dependencies. A typical present-day container holds not only an application but also its dependencies and binaries, making it self-contained enough to run everywhere (laptops, on-premise systems, as well as off-premise systems) without any code tweaking and twisting. Such comprehensive and compact sandboxed applications within containers are being prescribed as the most sought-after way forward to fully achieve the elusive needs of IT agility, portability, extensibility, scalability, maneuverability, and security.
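As a quick illustration of this coexistence, and assuming the official python images on Docker Hub, two different interpreter versions can run side by side on the same host without interfering with each other:

    # Each command runs in its own throwaway container with its own Python version
    docker run --rm python:2.7 python --version
    docker run --rm python:3.4 python --version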
In a cloud environment, various workloads encapsulated inside containers can easily be moved across systems, and even across cloud centers, and can be cloned, backed up, and deployed rapidly. Live modernization and migration of applications is possible with containerization-backed application portability. The era of adaptive, instant-on, and mission-critical application containers that elegantly embed software applications inside themselves is set to flourish in the days to come, with a stream of powerful technological advancements.
However, containers on their own are hugely complicated and not user-friendly. Following the realization that several complexities were getting in the way of massively producing and fluently using containers, an open source project was initiated with the goal of deriving a sophisticated and modular platform consisting of an enabling engine that simplifies and streamlines the container life cycle. In other words, the Docker platform was built to automate the crafting, packaging, shipping, deployment, and delivery of any software application embedded in a lightweight, extensible, and self-sufficient container that can run virtually anywhere. Docker is being considered the most flexible and futuristic containerization technology for realizing highly competent and enterprise-class distributed applications. This is meant to make a deft and decisive impact, as the brewing trend in the IT industry is that, instead of large, monolithic applications deployed on a single physical or virtual server, companies are building smaller, self-defined, sustainable, easily manageable, and discrete applications.
Docker is emerging as an open solution for developers and system administrators to quickly develop, ship, and run any application. It lets you quickly assemble applications from disparate and distributed components, and it eliminates any friction that can arise when shipping code. Docker, through a host of tools, simplifies the isolation of individual processes and makes them self-sustainable by running them in lightweight containers. It essentially isolates the process in question from the rest of your system, rather than adding a thick layer of virtualization overhead between the host and the guest. As a result, Docker containers run much faster, and their images can be shared easily to build bigger and better application containers. The Docker-inspired container technology takes the concept of declarative resource provisioning a step further to bring in the critical aspects of simplification, industrialization, standardization, and automation.
The Docker platform architecture comprises several modules, tools, and technologies in order to bring in the originally envisaged benefits of the blossoming journey of containerization. The Docker idea is growing by leaps and bounds, as the open source community is putting in enormous effort and contributing heavily in order to make Docker prominent and dominant in production environments.
Docker containers internally use a standardized execution environment called Libcontainer, which is an interface for various Linux kernel isolation features, such as namespaces and cgroups. This architecture allows multiple containers to be run in complete isolation from one another while optimally sharing the same Linux kernel.
A Docker client doesn't communicate directly with the running containers. Instead, it communicates with the Docker daemon over a Unix socket or through a RESTful API. The daemon, in turn, communicates directly with the containers running on the host. The Docker client can either be installed locally, alongside the daemon, or on a different host altogether. The Docker daemon runs as root and is capable of orchestrating all the containers running on the host.
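Both access paths can be exercised from the command line, as sketched below; the curl invocation assumes a curl build with Unix socket support, the daemon listening on its default socket path, and sufficient privileges (root or membership in the docker group):

    # The docker client reports both its own version and the daemon's version
    docker version

    # Talking to the daemon's RESTful API directly over the default Unix socket
    curl --unix-socket /var/run/docker.sock http://localhost/version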
A Docker image is the prime build component of any Docker container. It is a read-only template from which one or more container instances can be launched programmatically. Just as virtual machines are based on VM images, Docker containers are based on Docker images. These images are typically tiny compared to VM images, and they are conveniently stackable thanks to the layering property of AUFS, a prominent Docker filesystem (other filesystems are also supported by the Docker platform). Docker images can be exchanged with others and versioned like source code in private or public Docker registries.
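As a small sketch, using the official ubuntu image as an example, an image can be pulled from the registry and then listed in the local store:

    # Pull a read-only image template from the registry
    docker pull ubuntu:14.04

    # List the images available in the local store
    docker images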
Registries are used to store images and can be local or remote. When a container is launched, Docker first searches the local image store for the requested image. If it is not found locally, Docker searches the public remote registry (Docker Hub). If the image is there, Docker downloads it to the local store and uses it to launch the container. This makes it easy to distribute images efficiently and securely. One of Docker's killer features is the ability to quickly find, download, and start container images created by other developers.
In addition to the various base images, which you can use to create your own Docker containers, the public Docker registry features images of ready-to-run software, including databases, middleware solutions, content management systems, development environments, web servers, and so on. While the Docker command-line client searches the public registry by default, it is also possible to maintain private registries. This is a great option for distributing images with proprietary code or components internally within your company. Pushing images to a registry is just as easy as downloading images from one. Docker Inc.'s registry has a web-based interface for searching, reading, commenting on, and recommending (also known as starring) images.
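A rough sketch of this workflow follows, using the official registry image to stand up a private registry on the local host. The repository name mygroup/ubuntu is purely illustrative, and a registry on a remote host may additionally need TLS certificates or the daemon's insecure-registry setting, depending on your Docker configuration:

    # Run a private registry on the local host using the official registry image
    docker run -d --name registry -p 5000:5000 registry:2

    # Tag a local image against the private registry and push it
    docker tag ubuntu:14.04 localhost:5000/mygroup/ubuntu:14.04
    docker push localhost:5000/mygroup/ubuntu:14.04

    # Any host that can reach the registry can now pull the image back
    docker pull localhost:5000/mygroup/ubuntu:14.04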
A Docker container is a running instance of a Docker image. You can also create Docker images from a running container. For example, you can launch a container, install a bunch of software packages using a package manager such as APT or yum, and then commit those changes to a new Docker image. A more powerful and extensible way of creating Docker images, however, is through a Dockerfile, the widely used declarative build mechanism for Docker images. The Dockerfile mechanism was introduced to automate the build process. You can put a Dockerfile under version control and have a perfectly repeatable way of creating images.
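To make the Dockerfile route concrete, here is a minimal sketch. It assumes a Dockerfile with the following content sits in the current directory; the base image, the choice of Nginx, and the image tag myrepo/nginx:0.1 are illustrative:

    FROM ubuntu:14.04
    RUN apt-get update && apt-get install -y nginx
    EXPOSE 80
    CMD ["nginx", "-g", "daemon off;"]

Building an image from it and launching a container is then a two-step affair:

    # Build an image from the Dockerfile in the current directory and tag it
    docker build -t myrepo/nginx:0.1 .

    # Launch a container from the freshly built image
    docker run -d -p 8080:80 myrepo/nginx:0.1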
Linux has offered a surfeit of ways to containerize applications, from its own Linux Containers (LXC) to infrastructure-based technologies. Due to their tight integration with the underlying host system, a few critical drawbacks are frequently associated with such containers. Libcontainer is, therefore, the result of collaborative and concerted attempts by many established vendors to standardize the way applications are packaged, delivered, and run in isolation.
Libcontainer provides a standard interface for making sandboxes, or containers, inside an OS. With it, a container can interact in a predictable way with the host OS resources, which enables the application inside it to be controlled as expected. Libcontainer came into being as an important ingredient of the Docker platform, and it uses kernel-level namespaces to isolate the container from the host. The user namespace separates the container's and the host's user databases, ensuring that the container's root user does not have root privileges on the host. The process namespace is responsible for displaying and managing only the processes running in the container, not those on the host. And the network namespace provides the container with its own network device and IP address.
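These isolation properties are easy to observe from the command line. Assuming the stock ubuntu:14.04 image (whose minimal toolset includes ps and ip), the following throwaway containers see only their own process tree and their own network device:

    # Process namespace: the container's process tree starts afresh, so ps sees
    # little more than the command itself
    docker run --rm ubuntu:14.04 ps -ef

    # Network namespace: the container has its own interfaces and IP address,
    # distinct from those of the host
    docker run --rm ubuntu:14.04 ip addr show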
Control groups (cgroups) are another important contributor. While namespaces are responsible for the isolation between the host and the container, control groups implement resource monitoring, accounting, and auditing. Besides allowing Docker to limit the resources being consumed by a container, such as memory, disk space, and I/O, cgroups also output a lot of useful metrics about these resources. These metrics allow Docker to monitor the resource consumption of the various processes within the containers and make sure that each of them gets only its fair share of the available resources.
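A minimal sketch of these limits in action, assuming the official nginx image; the container name and the limit values are arbitrary examples:

    # Cap the container at 256 MB of memory and give it a reduced CPU share
    docker run -d --name throttled -m 256m --cpu-shares 512 nginx

    # Stream live resource-consumption metrics sourced from the cgroup counters
    docker stats throttled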
Let's move on to the Docker filesystem now. In a traditional Linux boot, the kernel first mounts the root filesystem (rootfs) as read-only, checks its integrity, and then switches the whole rootfs volume to read-write mode. When Docker mounts rootfs, it starts as read-only, but instead of switching the filesystem to read-write mode, Docker takes advantage of a union mount to add a read-write filesystem over the read-only filesystem. In fact, there may be multiple read-only filesystems stacked on top of each other, and each of these filesystems is considered a layer.
Docker has been using AUFS (short for Advanced multi-layered unification filesystem) as a filesystem for containers. AUFS is capable of merging multiple layers into a single representation of a filesystem, and when a process needs to modify a file, AUFS creates a copy of that file in the topmost writable layer. This process is called copy-on-write. The really cool thing is that AUFS allows Docker to use certain images as the basis for containers; only one copy of a shared base image needs to be stored, resulting in savings of storage and memory as well as faster deployment of containers. AUFS also handles the versioning of container images: each new version is simply a set of differences from the previous version, effectively keeping image sizes to a minimum.
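These layering mechanics can be inspected directly. The commands below assume the ubuntu:14.04 image pulled earlier, and the container name scratchpad is just an example:

    # Show which storage driver (AUFS, devicemapper, overlay, and so on) is in use
    docker info | grep -i 'storage driver'

    # List the read-only layers that make up an image
    docker history ubuntu:14.04

    # Create a file inside a container; docker diff then reports only the changes
    # captured in the container's top, writable layer (copy-on-write in action)
    docker run --name scratchpad ubuntu:14.04 touch /tmp/hello
    docker diff scratchpad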
Container orchestration is an important ingredient for sustaining the journey of containerization towards the days of smarter containers. Docker has added a few commands to facilitate safe and secure networking of local as well as remote containers. There are orchestration tools and tricks emerging and evolving in order to establish the much-needed dynamic and seamless linkage between containers, and thus produce composite containers that are more aligned and tuned for specific IT as well as business purposes. Nested containers are bound to flourish in the days to come, with continued focus on composition technologies. There are additional composition services being added by the Docker team in order to make container orchestration pervasive.
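As a small taste of such composition, and assuming the official mysql and wordpress images, two containers can be wired together with the classic link mechanism; the container names and the password are illustrative values:

    # A database container and a WordPress container composed into one application
    docker run -d --name db -e MYSQL_ROOT_PASSWORD=s3cr3t mysql
    docker run -d --name blog --link db:mysql -p 8080:80 wordpress

The --link flag makes the database container reachable from the WordPress container under the alias mysql, so the two pieces behave as a single composite application.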
The direct benefits of using Docker containers can be summarized as follows.
Docker harnesses some powerful kernel-level technologies smartly and provides a simple tool set and a unified API for managing namespaces, cgroups, and a copy-on-write filesystem. The team at Docker Inc. has created a compact tool that is greater than the sum of its parts, and the result is a potential game changer for DevOps professionals, system administrators, and developers. Docker provides tools that make creating and working with containers as easy as possible, and containers sandbox processes from each other. Docker does a nice job of harnessing the benefits of containerization for a focused purpose, namely the lightweight packaging and deployment of applications. Precisely speaking, virtualization and the other associated technologies coexist with containerization to ensure software-defined, self-servicing, and composite clouds.