How to Build 12 Factor Microservices on Docker - Part 2

Cody A. Ray

June 29th, 2015

Welcome back to our how-to on Building and Running 12 Factor Microservices on Docker. In Part 1, we introduced a very simple python flask application which displayed a list of users from a relational database. Then we walked through the first four of these factors, reworking the example application to follow these guidelines.

In Part 2, we’ll be introducing a multi-container Docker setup as the execution environment for our application. We’ll continue from where we left off with the next factor, number five.

  1. Build, Release, Run. A 12-factor app strictly separates the process for transforming a codebase into a deploy into distinct build, release, and run stages. The build stage creates an executable bundle from a code repo, including vendoring dependencies and compiling binaries and asset packages. The release stage combines the executable bundle created in the build with the deploy’s current config. Releases are immutable and form an append-only ledger; consequently, each release must have a unique release ID. The run stage runs the app in the execution environment by launching the app’s processes against the release.

    This is where your operations meet your development and where a PaaS can really shine. For now, we’re assuming that we’ll be using a Docker-based containerized deploy strategy. We’ll start by writing a simple Dockerfile.

    The Dockerfile starts with an ubuntu base image and then I add myself as the maintainer of this app.

    FROM ubuntu:14.04.2
    MAINTAINER codyaray

    Before installing anything, let’s make sure that apt has the latest versions of all the packages.

    RUN echo "deb http://archive.ubuntu.com/ubuntu/ $(lsb_release -sc) main universe" >> /etc/apt/sources.list
    RUN apt-get update
    

    Install some basic tools and the requirements for running a python webapp

    RUN apt-get install -y tar curl wget dialog net-tools build-essential
    RUN apt-get install -y python python-dev python-distribute python-pip
    RUN apt-get install -y libmysqlclient-dev

    Copy over the application to the container.

    ADD /. /src

    Install the dependencies.

    RUN pip install -r /src/requirements.txt

    Finally, set the current working directory, expose the port, and set the default command.

    EXPOSE 5000
    WORKDIR /src
    CMD python app.py
    

    Now, the build phase consists of building a docker image. You can build and store locally with

    docker build -t codyaray/12factor:0.1.0 .

    If you look at your local repository, you should see the new image present.

    $ docker images
    REPOSITORY          TAG     IMAGE ID         CREATED       VIRTUAL SIZE
    codyaray/12factor   0.1.0   bfb61d2bbb17     1 hour ago    454.8 MB
    

    The release phase really depends on details of the execution environment. You’ll notice that none of the configuration is stored in the image produced from the build stage; however, we need a way to build a versioned release with the full configuration as well.

    Ideally, the execution environment would be responsible for creating releases from the source code and configuration specific to that environment. However, if we’re working from first principles with Docker rather than a full-featured PaaS, one possibility is to build a new docker image using the one we just built as a base. Each environment would have its own set of configuration parameters and thus its own Dockerfile. It could be something as simple as

    FROM codyaray/12factor:0.1.0
    MAINTAINER codyaray
    
    ENV DATABASE_URL mysql://sa:mypwd@mydbinstance.abcdefghijkl.us-west-2.rds.amazonaws.com/mydb

    This is simple enough to be programmatically generated given the environment-specific configuration and the new container version to be deployed. For demonstration purposes, though, we’ll call the above file Dockerfile-release so it doesn’t conflict with the main application’s Dockerfile. Then we can build it with

    docker build -f Dockerfile-release -t codyaray/12factor-release:0.1.0.0 .

    The resulting built image could be stored in the environment’s registry as codyaray/12factor-release:0.1.0.0. The images in this registry would serve as the immutable ledger of releases. Notice that the version has been extended to include a fourth level which, in this instance, could represent configuration version “0” applied to source version “0.1.0”.
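
As a rough illustration of how such a release Dockerfile might be generated, here is a minimal Python sketch. The function names `render_release_dockerfile` and `release_tag` are hypothetical helpers invented for this example, and the scheme of appending a configuration version to the source version simply follows the convention described above; a real deployment system would likely do more (validation, secrets handling, and so on).

```python
def render_release_dockerfile(base_image, config, maintainer="codyaray"):
    """Render the text of a release Dockerfile layered on a build image.

    `config` is a dict of environment variable names to values for one
    environment. Keys are sorted so the output is deterministic.
    """
    lines = ["FROM %s" % base_image, "MAINTAINER %s" % maintainer, ""]
    for key in sorted(config):
        lines.append("ENV %s %s" % (key, config[key]))
    return "\n".join(lines) + "\n"


def release_tag(repository, source_version, config_version):
    """Build a release tag such as repo:0.1.0.0 (source 0.1.0, config 0)."""
    return "%s:%s.%d" % (repository, source_version, config_version)
```

The rendered text could be written to Dockerfile-release and passed to docker build with the tag returned by `release_tag`.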

    The key here is that these configuration parameters aren’t collated into named groups (sometimes called “environments”). For example, these aren’t static files named like Dockerfile.staging or Dockerfile.dev in a centralized repo. Rather, the set of parameters is distributed so that each environment maintains its own environment mapping in some fashion. The deployment system would be set up so that a new release to the environment automatically applies the environment variables it has stored to create a new Docker image.

    As always, the final deploy stage depends on whether you’re using a cluster manager, scheduler, etc. If you’re using standalone Docker, then it would boil down to

    docker run -P -t codyaray/12factor-release:0.1.0.0

  2. Processes. A 12-factor app is executed as one or more stateless processes which share nothing and are horizontally partitionable. All data which needs to be stored must use a stateful backing service, usually a database. This means no sticky sessions and no in-memory or local disk-based caches. These processes should never daemonize or write their own PID files; rather, they should rely on the execution environment’s process manager (such as Upstart).

    This factor must be considered up-front, in line with the discussions on antifragility, horizontal scaling, and overall application design. As the example app delegates all stateful persistence to a database, we’ve already succeeded on this point.

    However, it is good to note that a number of issues have been found using the standard ubuntu base image for Docker, one of which is its process management (or lack thereof). If you would like to use a process manager to automatically restart crashed daemons, or to notify a service registry or operations team, check out baseimage-docker. This image adds runit for process supervision and management, amongst other improvements to base ubuntu for use in Docker such as obsoleting the need for pid files.

    To use this new image, we have to update the Dockerfile to set the new base image and use its init system instead of running our application as the root process in the container.

    FROM phusion/baseimage:0.9.16
    MAINTAINER codyaray
    
    RUN echo "deb http://archive.ubuntu.com/ubuntu/ $(lsb_release -sc) main universe" >> /etc/apt/sources.list
    RUN apt-get update
    
    RUN apt-get install -y tar git curl nano wget dialog net-tools build-essential
    RUN apt-get install -y python python-dev python-distribute python-pip
    RUN apt-get install -y libmysqlclient-dev
    
    ADD /. /src
    
    RUN pip install -r /src/requirements.txt
    
    EXPOSE 5000
    
    WORKDIR /src
    
    RUN mkdir /etc/service/12factor
    ADD 12factor.sh /etc/service/12factor/run
    # runit only starts run scripts that are executable
    RUN chmod +x /etc/service/12factor/run
    
    # Use baseimage-docker's init system.
    CMD ["/sbin/my_init"]

    Notice the file 12factor.sh that we’re now adding under /etc/service/12factor. This is how we instruct runit to run our application as a service.

    Let’s add the new 12factor.sh file.

    #!/bin/sh
    # exec replaces the shell so runit supervises and signals the python process directly
    exec python /src/app.py

    Now the new containers we deploy will attempt to be a little more fault-tolerant by using an OS-level process manager.

  3. Port Binding. A 12-factor app must be self-contained and bind to a port specified as an environment variable. It can’t rely on the injection of a web container such as tomcat or unicorn; instead it must embed a server such as jetty or thin. The execution environment is responsible for routing requests from a public-facing hostname to the port-bound web process.

    This is trivial with most embedded web servers. If you’re currently using an external web server, this may require more effort to support an embedded server within your application. For the example python app (which uses the built-in flask web server), it boils down to

    import os

    # Fall back to port 5000 when the environment doesn't set PORT
    port = int(os.environ.get("PORT", 5000))
    app.run(host='0.0.0.0', port=port)
    

    Now the execution environment is free to instruct the application to listen on whatever port is available. This obviates the need for the application to tell the environment what ports must be exposed, as we’ve been required to do with Docker.

  4. Concurrency. Because a 12-factor app exclusively uses stateless processes, it can scale out by adding processes. A 12-factor app can have multiple process types, such as web processes, background worker processes, or clock processes (for cron-like scheduled jobs).

    As each process type is scaled independently, each logical process would become its own Docker container as well. We’ve already seen building a web process; other processes are very similar. In most cases, scaling out simply means launching more instances of the container. (It’s usually not desirable to scale out the clock processes, though, as they often generate events that you want to be scheduled singletons within your infrastructure.)
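
To make the idea of a process formation concrete, here is a small Python sketch that turns a mapping of process types to instance counts into docker run command lines. The function name `scale_commands`, the container naming scheme (`web.1`, `web.2`, ...), and the per-type commands are all assumptions made for illustration; a cluster manager or scheduler would normally handle this for you.

```python
def scale_commands(image, formation, commands):
    """Return one `docker run` argument list per container instance.

    `formation` maps a process type to how many instances to launch;
    `commands` maps a process type to the command run inside the container.
    """
    argv_list = []
    for process_type, count in sorted(formation.items()):
        for index in range(1, count + 1):
            name = "%s.%d" % (process_type, index)  # e.g. web.1, web.2
            argv_list.append(
                ["docker", "run", "-d", "--name", name, image]
                + list(commands[process_type])
            )
    return argv_list
```

Each returned list could be handed to `subprocess.check_call`. Keeping the clock count at one reflects the singleton caveat mentioned above.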

  5. Disposability. A 12-factor app’s processes can be started or stopped (with a SIGTERM) anytime. Thus, minimizing startup time and shutting down gracefully are very important. For example, when a web service receives a SIGTERM, it should stop listening on the HTTP port, allow in-flight requests to finish, and then exit. Similarly, processes should be robust against sudden death; for example, worker processes should use a robust queuing backend.

    You want to ensure the web server you select can shut down gracefully. This is one of the trickier parts of selecting a web server, at least for many of the common python http servers that I’ve tried.

    In theory, shutting down based on receiving a SIGTERM should be as simple as follows.

    import signal
    # server is the HTTP server object created elsewhere; this assumes it exposes stop(timeout)
    signal.signal(signal.SIGTERM, lambda *args: server.stop(timeout=60))
    

    But oftentimes, you’ll find that this immediately kills the in-flight requests as well as closing the listening socket. You’ll want to test this thoroughly if dependable graceful shutdown is critical to your application.
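
For comparison, here is one way the stop-accepting, drain, then exit sequence could be sketched using only the Python 3 standard library. This swaps in `http.server` rather than the flask development server used by the example app, and the names `HelloHandler`, `GracefulServer`, `serve_until`, and `main` are hypothetical. Treat it as a sketch under those assumptions, not a production server.

```python
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # silence per-request logging for this sketch


class GracefulServer(ThreadingHTTPServer):
    # Non-daemon handler threads are tracked, so server_close() joins them
    # and in-flight requests get a chance to finish.
    daemon_threads = False
    block_on_close = True


def serve_until(server, stop_event):
    """Serve requests until stop_event is set, then drain and close."""
    worker = threading.Thread(target=server.serve_forever)
    worker.start()
    stop_event.wait()      # main thread sleeps here until told to stop
    server.shutdown()      # stop the accept loop
    worker.join()
    server.server_close()  # close the socket, then wait for handler threads


def main(port=5000):
    stop = threading.Event()
    # The handler only sets a flag; the real work happens in serve_until.
    signal.signal(signal.SIGTERM, lambda signum, frame: stop.set())
    serve_until(GracefulServer(("0.0.0.0", port), HelloHandler), stop)
```

Calling `main()` starts the server; sending the process a SIGTERM should then let pending requests complete before it exits. The accept loop runs in a background thread because calling `shutdown()` from the same thread that runs `serve_forever()` would deadlock.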

  6. Dev/Prod Parity. A 12-factor app is designed to keep the gap between development and production small. Continuous deployment shrinks the amount of time that code lives in development but not production. A self-serve platform allows developers to deploy their own code in production, just like they do in their local development environments. Using the same backing services (databases, caches, queues, etc) in development as production reduces the number of subtle bugs that arise from inconsistencies between technologies or integrations.

    As we’re deploying this solution using fully Dockerized containers and third-party backing services, we’ve effectively achieved dev/prod parity. For local development, I use boot2docker on my Mac which provides a Docker-compatible VM to host my containers. Using boot2docker, you can start the VM and set up all the env variables automatically with

    boot2docker up
    $(boot2docker shellinit)
    

    Once you’ve initialized this VM and set the DOCKER_HOST variable to its IP address with shellinit, the docker commands given above work exactly the same for development as they do for production.

  7. Logs. Consider logs as a stream of time-ordered events collected from all running processes and backing services. A 12-factor app doesn’t concern itself with how its output is handled. Instead, it just writes its output to its `stdout` stream. The execution environment is responsible for collecting, collating, and routing this output to its final destination(s).

    Most logging frameworks either log to stderr/stdout by default or make it easy to switch from file-based logging to one of these streams. In a 12-factor app, the execution environment is expected to capture these streams and handle them however the platform dictates.

    Because our app doesn’t have specific logging yet, and the only logs are from flask and already go to stderr, we don’t have any application changes to make.
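
If the app did grow its own logging, pointing Python’s standard `logging` module at stdout might look roughly like the sketch below. The helper name `configure_logging` and the log format are assumptions for illustration; the example app does not include this code.

```python
import logging
import sys


def configure_logging(stream=sys.stdout, level=logging.INFO):
    """Send all log records to a single stream instead of to files."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    root = logging.getLogger()
    root.handlers[:] = [handler]  # replace any previously installed handlers
    root.setLevel(level)
    return root
```

After calling `configure_logging()` once at startup, a line such as `logging.getLogger("app").info("listing %d users", 3)` is written to stdout, where the execution environment can pick it up.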

    However, we can show how an execution environment could handle the logs. We’ll set up a Docker container which collects the logs from all the other docker containers on the same host. Ideally, this would then forward the logs to a centralized service such as Elasticsearch. Here we’ll demo using Fluentd to capture and collect the logs inside the log collection container; a simple configuration change would allow us to switch from writing these logs to disk, as we demo here, to sending them from Fluentd to a local Elasticsearch cluster.

    We’ll create a Dockerfile for our new logcollector container type. For more detail, you can find a Docker fluent tutorial here. We can call this file Dockerfile-logcollector.

    FROM kiyoto/fluentd:0.10.56-2.1.1
    MAINTAINER kiyoto@treasure-data.com
    RUN mkdir /etc/fluent
    ADD fluent.conf /etc/fluent/
    CMD ["/usr/local/bin/fluentd", "-c", "/etc/fluent/fluent.conf"]
    

    We use an existing fluentd base image with a specific fluentd configuration. Notably this tails all the log files in /var/lib/docker/containers/<container-id>/<container-id>-json.log, adds the container ID to the log message, and then writes to JSON-formatted files inside /var/log/docker.

    <source>
     type tail
     path /var/lib/docker/containers/*/*-json.log
     pos_file /var/log/fluentd-docker.pos
     time_format %Y-%m-%dT%H:%M:%S
     tag docker.*
     format json
    </source>
    <match docker.var.lib.docker.containers.*.*.log>
     type record_reformer
     container_id ${tag_parts[5]}
     tag docker.all
    </match>
    <match docker.all>
     type file
     path /var/log/docker/*.log
     format json
     include_time_key true
    </match>
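
The `tag_parts[5]` lookup in the configuration above can look a little magical. The Python sketch below mimics it under one assumption: that the tail input expands the `*` in `tag docker.*` to the log file’s path with each `/` replaced by a `.`. The helper names `tail_tag` and `container_id_from_tag` are hypothetical and exist only to show the indexing.

```python
def tail_tag(path, prefix="docker"):
    """Approximate the tag the tail input builds for `tag docker.*`."""
    return prefix + "." + path.lstrip("/").replace("/", ".")


def container_id_from_tag(tag):
    """Mirror ${tag_parts[5]}: split the tag on dots and take index 5."""
    return tag.split(".")[5]


# Parts: docker(0) var(1) lib(2) docker(3) containers(4) <container-id>(5) ...
example_tag = tail_tag("/var/lib/docker/containers/abc123/abc123-json.log")
```

Here `abc123` stands in for a real container ID, which Docker uses as the name of both the directory and the log file.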
    

    As usual, we create a Docker image. Don’t forget to specify the logcollector Dockerfile.

    docker build -f Dockerfile-logcollector -t codyaray/docker-fluentd .

    We’ll need to mount two directories from the Docker host into this container when we launch it. Specifically, we’ll mount the directory containing the logs from all the other containers as well as the directory to which we’ll be writing the consolidated JSON logs.

    docker run -d -v /var/lib/docker/containers:/var/lib/docker/containers -v /var/log/docker:/var/log/docker codyaray/docker-fluentd

    Now if you check in the /var/log/docker directory, you’ll see the collated JSON log files. Note that this is on the docker host rather than in any container; if you’re using boot2docker, you can ssh into the docker host with boot2docker ssh and then check /var/log/docker.

  8. Admin Processes. Any admin or management tasks for a 12-factor app should be run as one-off processes within a deploy’s execution environment. This process runs against a release using the same codebase and configs as any process in that release and uses the same dependency isolation techniques as the long-running processes.

    This is really a feature of your app's execution environment. If you’re running a Docker-like containerized solution, this may be pretty trivial.

    docker run -i -t --entrypoint /bin/bash codyaray/12factor-release:0.1.0.0

    The -i and -t flags instruct docker to provide an interactive session: -i keeps stdin open and -t allocates a pseudo-tty. The --entrypoint flag then tells docker to run the /bin/bash command instead of another 12factor app instance. This creates a new container based on the same docker image, which means we have access to all the code and configs for this release.

    This will drop us into a bash terminal to do whatever we want. But let’s say we want to add a new “friends” table to our database, so we wrote a migration script add_friends_table.py. We could run it as follows:

    docker run -i -t --entrypoint python codyaray/12factor-release:0.1.0.0 /src/add_friends_table.py
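
The article doesn’t show add_friends_table.py itself, so the following is only a guess at its shape. It assumes a DB-API 2.0 connection (such as one opened from the DATABASE_URL setting) and a hypothetical two-column schema for the friends table; adjust both to match the real application.

```python
# Hypothetical schema: each row links a user to one of their friends.
CREATE_FRIENDS = """
CREATE TABLE IF NOT EXISTS friends (
    user_id INTEGER NOT NULL,
    friend_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, friend_id)
)
"""


def add_friends_table(conn):
    """Create the friends table if it does not already exist.

    `conn` is any DB-API 2.0 connection. IF NOT EXISTS makes the
    migration safe to run more than once.
    """
    cursor = conn.cursor()
    cursor.execute(CREATE_FRIENDS)
    conn.commit()
```

Because the one-off container is started from the release image, the script would see the same DATABASE_URL as the long-running web processes.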

    As you can see, following the few simple rules specified in the 12 Factor manifesto really allows your execution environment to manage and scale your application. While this may not be the most feature-rich integration within a PaaS, it is certainly very portable with a clean separation of responsibilities between your app and its environment. Many of the tools and integrations demonstrated here take a do-it-yourself container approach to the environment; a vertically integrated PaaS such as Deis would subsume them.

    If you’re not familiar with Deis, it’s one of several competitors in the open source platform-as-a-service space which allows you to run your own PaaS on a public or private cloud. Like many, Deis is inspired by Heroku. So instead of Dockerfiles, Deis uses a buildpack to transform a code repository into an executable image and a Procfile to specify an app’s processes. Finally, by default you can use a specialized git receiver to complete a deploy. Instead of having to manage separate build, release, and deploy stages yourself like we described above, deploying an app to Deis could be as simple as

    git push deis-prod
    

    While it can’t get much easier than this, you’re certainly trading control for simplicity. It's up to you to determine which works best for your business.



About the Author

Cody A. Ray is an inquisitive, tech-savvy, entrepreneurially-spirited dude. Currently, he is a software engineer at Signal, an amazing startup in downtown Chicago, where he gets to work with a dream team that’s changing the service model underlying the Internet.
