How to Build 12 Factor Microservices on Docker - Part 1

Cody A. Ray

June 26th, 2015

As companies continue to reap benefits of the cloud beyond cost savings, devops teams are gradually transforming their infrastructure into a self-serve platform. Critical to this effort is designing applications to be cloud-native and antifragile. In this post series, we will examine the 12 factor methodology for application design, how this design approach interfaces with some of the more popular Platform-as-a-Service (PaaS) providers, and demonstrate how to run such microservices on the Deis PaaS.

What began as Service Oriented Architecture in the data center is realizing its full potential as microservices in the cloud, led by innovators such as Netflix and Heroku. Netflix was arguably the first to design its applications to be not only resilient but antifragile; that is, by intentionally introducing chaos into their systems, their applications become more stable, scalable, and graceful in the presence of errors. Similarly, by helping thousands of clients build cloud applications, Heroku recognized a set of common patterns emerging and set forth the 12 factor methodology.

ANTIFRAGILITY

You may have never heard of antifragility. This concept was introduced by Nassim Taleb, the author of Fooled by Randomness and The Black Swan. Essentially, something is antifragile if it gains from volatility and uncertainty (up to a point). Think of the MySQL server that everyone is afraid to touch lest it crash vs the Cassandra ring which can handle the loss of multiple servers without a problem. In terms more familiar to the tech crowd, a “pet” is fragile while “cattle” are antifragile (or at least robust, that is, they neither gain nor lose from volatility).

Adrian Cockcroft seems to have discovered this concept with his team at Netflix. During their transition from a data center to Amazon Web Services, they claimed that “the best way to avoid failure is to fail constantly.” (http://techblog.netflix.com/2010/12/5-lessons-weve-learned-using-aws.html) To facilitate this process, one of the first tools Netflix built was Chaos Monkey, the now-infamous tool which kills your Amazon instances to see if and how well your application responds. By constantly injecting failure, their engineers were forced to design their applications to be more fault tolerant, to degrade gracefully, and to be better distributed so as to avoid any Single Point Of Failure (SPOF). As a result, Netflix has a whole suite of tools which form the Netflix PaaS. Many of these have been released as part of the Netflix OSS ecosystem.
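
The idea of constantly injecting failure can be illustrated in miniature. The sketch below is not Chaos Monkey or any other Netflix tool; the names `chaotic`, `fetch_recommendations`, and `recommendations_or_fallback` are hypothetical and exist only for this illustration. It wraps a function so that a fraction of calls fail at random, which forces the caller to provide a graceful fallback.

```python
import random


class InjectedFailure(Exception):
    """Raised by the chaos wrapper to simulate a failed dependency."""


def chaotic(func, failure_rate, rng=None):
    """Wrap func so that a fraction of calls raise InjectedFailure.

    failure_rate is a probability between 0.0 and 1.0; rng is any object
    with a random() method (defaults to the random module).
    """
    rng = rng or random

    def wrapper(*args, **kwargs):
        # random() returns a value in [0.0, 1.0), so 0.0 never fails
        # and 1.0 always fails.
        if rng.random() < failure_rate:
            raise InjectedFailure("simulated outage in %s" % func.__name__)
        return func(*args, **kwargs)

    return wrapper


def fetch_recommendations(user_id):
    # Hypothetical stand-in for a call to a remote service.
    return ["personalized pick for user %d" % user_id]


def recommendations_or_fallback(fetch, user_id):
    """Degrade gracefully: serve a generic list if the dependency fails."""
    try:
        return fetch(user_id)
    except InjectedFailure:
        return ["popular title"]


if __name__ == "__main__":
    flaky_fetch = chaotic(fetch_recommendations, failure_rate=0.5)
    for _ in range(5):
        print(recommendations_or_fallback(flaky_fetch, 42))
```

Running the wrapper in everyday environments, rather than only in a special test, is what pushes the fallback path to be exercised regularly instead of being discovered during an outage.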

12 FACTOR APPS

Because many companies want to avoid relying too heavily on tools from any single third party, it may be more beneficial to look at the concepts underlying such a cloud-native design. This will also help you evaluate and compare multiple options for solving the core issues at hand. Heroku, being a platform on which thousands or millions of applications are deployed, has had to isolate the core design patterns for applications which operate in the cloud and provide an environment which makes such applications easy to build and maintain. These are described in a manifesto entitled the 12-Factor App.

The first part of this post walks through the first four factors and reworks a simple Python web app with them in mind. Part 2 continues with the remaining eight factors, demonstrating how this design allows easier integration with cloud-native containerization technologies like Docker and Deis.

Let’s say we’re starting with a minimal Python application which simply provides a way to view some content from a relational database. We’ll start with a single-file application, app.py.

from flask import Flask
import mysql.connector as db
import json

app = Flask(__name__)

def execute(query):
   con = None
   try:
       con = db.connect(host='localhost', user='testdb', password='t123', database='testdb')
       cur = con.cursor()
       cur.execute(query)
       return cur.fetchall()
   except db.Error as e:
       # Report the driver error; the caller treats None as "no rows".
       print("Error: %s" % e)
       return None
   finally:
       if con:
           con.close()

def list_users():
   users = execute("SELECT id, username, email FROM users") or []
   return [{"id": user_id, "username": username, "email": email} for (user_id, username, email) in users]

@app.route("/users")
def users_index():
   return json.dumps(list_users())

if __name__ == "__main__":
   app.run(host='0.0.0.0', port=5000, debug=True)

We’ll assume you already have a simple MySQL database set up.

CREATE DATABASE testdb;
USE testdb;
CREATE TABLE users (
           id INT NOT NULL AUTO_INCREMENT,
           username VARCHAR(80) NOT NULL,
           email VARCHAR(120) NOT NULL,
           PRIMARY KEY (id),
           UNIQUE INDEX (username),
           UNIQUE INDEX (email)
);
INSERT INTO users VALUES (1, "admin", "admin@example.com");
INSERT INTO users VALUES (2, "guest", "guest@example.com");

As you can see, the application currently takes about the most naive approach possible and is contained entirely within this single file.

We’ll now walk step-by-step through the 12 Factors and apply them to this simple application.

THE 12 FACTORS: STEP BY STEP

  1. Codebase. A 12-factor app is always tracked in a version control system, such as Git, Mercurial, or Subversion. If there are multiple codebases, it’s a distributed system in which each component may be a 12-factor app. There are many deploys, or running instances, of each application, including production, staging, and developers' local environments.

    Since many people are familiar with git today, let’s choose that as our version control system. We can initialize a git repo for our new project.

    First ensure we’re in the app directory which, at this point, only contains the single app.py file.

    cd 12factor
    git init .

    After adding the single app.py file, we can commit to the repo.

    git add app.py
    git commit -m "Initial commit"

  2. Dependencies. All dependencies must be explicitly declared and isolated. A 12-factor app never depends on packages being installed system-wide and uses a dependency isolation tool during execution to stop any system-wide packages from “leaking in.” Good examples are Bundler for Ruby (Gemfile provides declaration and `bundle exec` provides isolation) and Pip/requirements.txt and Virtualenv for Python (where pip/requirements.txt provides declaration and virtualenv provides isolation; the behavior of `--no-site-packages` has been the default since virtualenv 1.7).

    We can create and use (source) a virtualenv environment which explicitly isolates the local app’s environment from the global “site-packages” installations.

    virtualenv env
    source env/bin/activate

    A quick glance at the code shows that we currently use only two dependencies, flask and mysql-connector-python (the package that provides the mysql.connector module), so we’ll add them to the requirements file. The connector is left unpinned here; pin it to the version you have tested.

    echo flask==0.10.1 >> requirements.txt
    echo mysql-connector-python >> requirements.txt

    Let’s use the requirements file to install all the dependencies into our isolated virtualenv.

    pip install -r requirements.txt

  3. Config. An app’s config must be stored in environment variables. Config is everything that may vary between deploys (developer environments, staging, production). The most common example is database credentials or another resource handle.

    We currently have the host, user, password, and database name hardcoded. Hopefully you’ve at least already extracted this to a configuration file; either way, we’ll be moving them to environment variables instead.

    import os
    
    DATABASE_CREDENTIALS = {
       'host': os.environ['DATABASE_HOST'],
       'user': os.environ['DATABASE_USER'],
       'password': os.environ['DATABASE_PASSWORD'],
       'database': os.environ['DATABASE_NAME']
    }

    Don’t forget to update the actual connection to use the new credentials object:

    con = db.connect(**DATABASE_CREDENTIALS)

  4. Backing Services. A 12-factor app must make no distinction between a service running locally and one run by a third party. For example, a deploy should be able to swap out a local MySQL database for a third-party replacement such as Amazon RDS without any code changes, just by updating a URL or other handle/credentials inside the config.

    Using a database abstraction layer such as SQLAlchemy (or your own adapter) lets you treat many backing services similarly so that you can switch between them with a single configuration parameter. In this case, it has the added advantage of serving as an Object Relational Mapper to better encapsulate our database access logic.

    We can replace the hand-rolled execute function and SELECT query with a model object:

    import json
    import os
    
    from flask import Flask
    from flask_sqlalchemy import SQLAlchemy
    
    app = Flask(__name__)
    app.config['SQLALCHEMY_DATABASE_URI'] = os.environ['DATABASE_URL']
    db = SQLAlchemy(app)
    
    class User(db.Model):
       __tablename__ = 'users'
       id = db.Column(db.Integer, primary_key=True)
       username = db.Column(db.String(80), unique=True)
       email = db.Column(db.String(120), unique=True)
    
       def __init__(self, username, email):
           self.username = username
           self.email = email
    
       def __repr__(self):
           return '<User %r>' % self.username
    
    @app.route("/users")
    def users_index():
       # Keep the same JSON keys as the original list_users() response.
       to_json = lambda user: {"id": user.id, "username": user.username, "email": user.email}
       return json.dumps([to_json(user) for user in User.query.all()])

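    Flask-SQLAlchemy also needs to be listed in requirements.txt. Now we set the DATABASE_URL environment variable to something like the following. (SQLAlchemy’s plain mysql:// scheme loads the MySQLdb driver by default; to stay on MySQL Connector/Python, use mysql+mysqlconnector:// instead.)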

    export DATABASE_URL=mysql://testdb:t123@localhost/testdb

    But it should be easy to switch to Amazon RDS (still backed by MySQL) or to Postgres:

    export DATABASE_URL=postgresql://testdb:t123@localhost/testdb

    We’ll continue this demo using a MySQL cluster provided by Amazon RDS.

    export DATABASE_URL=mysql://sa:mypwd@mydbinstance.abcdefghijkl.us-west-2.rds.amazonaws.com/mydb

    As you can see, this makes attaching and detaching from different backing services trivial from a code perspective, allowing you to focus on more challenging issues. This is important during the early stages of development because it allows you to performance-test multiple databases and third-party providers against one another, and in general it keeps with the notion of avoiding vendor lock-in.
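
To make the swap concrete, here is a minimal sketch that uses only the Python standard library (it relies on Python 3’s `urllib.parse`, unlike the Python 2-era code above). The helper names `load_database_url` and `describe_backing_service` are hypothetical and are not part of Flask or SQLAlchemy. It reads the handle from the environment, fails early with a clear message if the variable is missing, and shows that the application sees the same shape of information whichever provider the URL points at.

```python
import os
from urllib.parse import urlsplit


def load_database_url(environ=None):
    """Return DATABASE_URL from the environment or fail with a clear error."""
    environ = os.environ if environ is None else environ
    url = environ.get("DATABASE_URL")
    if not url:
        raise RuntimeError("DATABASE_URL must be set for this deploy")
    return url


def describe_backing_service(url):
    """Split a database URL into the parts a deploy may vary."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "database": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    # Example values copied from the article; a dict stands in for the real
    # environment so the sketch runs without exporting anything.
    for url in (
        "mysql://testdb:t123@localhost/testdb",
        "postgresql://testdb:t123@localhost/testdb",
    ):
        print(describe_backing_service(load_database_url({"DATABASE_URL": url})))
```

Only the value of the environment variable changes between the two iterations; the code path is identical, which is the property this factor asks for.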

In Part 2, we’ll continue reworking this application so that it fully conforms to the 12 Factors. The remaining eight factors concern the overall application design and how it interacts with the execution environment in which it’s operated. We’ll assume that we’re operating the app in a multi-container Docker environment. This container-up approach provides the most flexibility and control over your execution environment. We’ll then conclude the article by deploying our application to Deis, a vertically integrated Docker-based PaaS, to demonstrate the tradeoff of configuration vs convention in selecting your own PaaS.

About the Author

Cody A. Ray is an inquisitive, tech-savvy, entrepreneurially-spirited dude. Currently, he is a software engineer at Signal, an amazing startup in downtown Chicago, where he gets to work with a dream team that’s changing the service model underlying the Internet.
