Part 2: Migrating a WordPress Blog to Middleman and Deploying to Amazon S3

Mike Ball

November 28th, 2014

Part 2: Migrating WordPress blog content and deploying to production

In part 1 of this series, we created middleman-demo, a basic Middleman-based blog. Part 1 addressed the benefits of a static site, setting up a Middleman development environment, Middleman’s templating system, and how to configure a Middleman project to support basic blogging functionality.

Now that middleman-demo is configured for blogging, let’s export old content from an existing WordPress blog, compile the application for production, and deploy to a web server.

In this part, we’ll cover the following:

  • Using the wp2middleman gem to migrate content from an existing WordPress blog
  • Creating a Rake task to establish an Amazon Web Services S3 bucket
  • Deploying a Middleman blog to Amazon S3
  • Setting up a custom domain for an S3-hosted site

If you didn’t follow part 1, or you no longer have your original middleman-demo code, you can clone mine and check out the part2 branch:

$ git clone http://github.com/mdb/middleman-demo && cd middleman-demo && git checkout part2

Export your content from WordPress

First, let’s export the old content from an existing WordPress blog.

WordPress provides a tool through which blog content can be exported as an XML file, also called a WordPress “eXtended RSS” or “WXR” file. A WXR file can be generated and downloaded via the WordPress admin’s Tools > Export screen, as explained in WordPress’s WXR documentation.

In the absence of a real WordPress blog, download middleman_demo.wordpress.xml, a sample WXR file:

$ wget www.mikeball.info/downloads/middleman_demo.wordpress.xml

Migrating the WordPress posts to markdown

To migrate the posts contained in the WordPress WXR file, I created wp2middleman, a command line tool that generates Middleman-style markdown files from the posts in a WXR file.

Install wp2middleman via Rubygems:

$ gem install wp2middleman

wp2middleman provides a wp2mm command. Pass the middleman_demo.wordpress.xml file to the wp2mm command:

$ wp2mm middleman_demo.wordpress.xml

If all goes well, the following output is printed to the terminal:

Successfully migrated middleman_demo.wordpress.xml

wp2middleman also produced an export directory. The export directory houses the blog posts from the middleman_demo.wordpress.xml WXR file, now represented as Middleman-style markdown files:

$ ls export/
2007-02-14-Fusce-mauris-ligula-rutrum-at-tristique-at-pellentesque-quis-nisl.html.markdown
2007-07-21-Suspendisse-feugiat-enim-vel-lorem.html.markdown
2008-02-20-Suspendisse-rutrum-Suspendisse-nisi-turpis-congue-ac.html.markdown
2008-03-17-Duis-euismod-purus-ac-quam-Mauris-tortor.html.markdown
2008-04-02-Donec-cursus-tincidunt-libero-Nam-blandit.html.markdown
2008-04-28-Etiam-nulla-nisl-cursus-vel-auctor-at-mollis-a-quam.html.markdown
2008-06-08-Praesent-faucibus-ligula-luctus-dolor.html.markdown
2008-07-08-Proin-lobortis-sapien-non-venenatis-luctus.html.markdown
2008-08-08-Etiam-eu-urna-eget-dolor-imperdiet-vehicula-Phasellus-dictum-ipsum-vel-neque-mauris-interdum-iaculis-risus.html.markdown
2008-09-08-Lorem-ipsum-dolor-sit-amet-consectetuer-adipiscing-elit.html.markdown
2013-12-30-Hello-world.html.markdown

Note that wp2mm supports additional options, though these are beyond the scope of this tutorial. Read more on wp2middleman’s GitHub page.

Also note that the markdown posts in export are named *.html.markdown, and some contain the HTML embedded in the original WordPress post. Middleman supports chaining multiple template languages within a single post file, processing the file extensions from right to left. For example, Middleman evaluates a file whose name ends in .html.erb.markdown first as markdown and then as ERb. The final result is HTML.
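To make the naming convention concrete, here is a minimal Ruby sketch that builds a filename in the same shape as the ones in export. The middleman_post_filename helper is hypothetical and is not how wp2middleman itself is implemented; it only assumes that punctuation is dropped and whitespace becomes hyphens, which is what the listing above suggests.

```ruby
require 'date'

# Hypothetical helper, not part of wp2middleman: builds a Middleman-style
# post filename from a publication date and a post title.
def middleman_post_filename(date, title)
  slug = title
    .gsub(/[^A-Za-z0-9\s-]/, '') # drop punctuation
    .strip
    .gsub(/\s+/, '-')            # turn runs of whitespace into hyphens

  "#{date.strftime('%Y-%m-%d')}-#{slug}.html.markdown"
end

puts middleman_post_filename(Date.new(2013, 12, 30), 'Hello world!')
# prints "2013-12-30-Hello-world.html.markdown"
```

The date prefix matters: Middleman's blog extension reads the post date from the filename unless the front matter says otherwise.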

Move the contents of export to source/blog and remove the export directory:

$ mv export/* source/blog && rm -rf export

Now, assuming the Middleman server is running, visiting http://localhost:4567 lists all the blog posts migrated from WordPress. Each post links to its permalink. In the case of posts with tags, each tag links to a tag page.
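If a post is missing from the list, it can help to inspect the front matter of the migrated files directly. The following is a rough sketch rather than anything Middleman provides: front_matter is a hypothetical helper, and it assumes each migrated post starts with a YAML block between --- lines containing keys such as title and date.

```ruby
require 'yaml'
require 'date'

# Hypothetical helper: extracts the YAML front matter (the block between the
# leading '---' lines) from a post's text. Returns an empty hash if the text
# has no front matter.
def front_matter(text)
  match = text.match(/\A---\s*\n(.*?)\n---\s*$/m)
  return {} unless match

  # Dates in front matter are parsed into Date/Time objects, so permit them.
  YAML.safe_load(match[1], permitted_classes: [Date, Time]) || {}
end

# Print the title of every migrated post; prints nothing if none exist.
Dir.glob('source/blog/*.html.markdown').sort.each do |path|
  puts front_matter(File.read(path))['title']
end
```

Run it from the root of middleman-demo after moving the exported posts into source/blog.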

Compiling for production

Thus far, we’ve been viewing middleman-demo in local development, where the Middleman server dynamically generates the HTML, CSS, and JavaScript with each request. However, Middleman’s value lies in its ability to generate a static website -- simple HTML, CSS, JavaScript, and image files -- served directly by a web server such as Nginx or Apache and thus requiring no application server or internal backend.

Compile middleman-demo to a static build directory:

$ middleman build

The resulting build directory houses every HTML file that can be served by middleman-demo, as well as all necessary CSS, JavaScript, and images. Its directory layout maps to the URL patterns defined in config.rb. The build directory is typically excluded from source control.
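As a quick sanity check after a build, a few lines of Ruby can summarize what ended up in the build directory. This is only an illustrative sketch; summarize_build is a hypothetical helper, not a Middleman command.

```ruby
# Hypothetical helper: counts the files under a directory, grouped by
# file extension, e.g. { ".html" => 12, ".css" => 2 }.
def summarize_build(dir = 'build')
  Dir.glob(File.join(dir, '**', '*'))
     .select { |path| File.file?(path) }
     .group_by { |path| File.extname(path) }
     .map { |ext, paths| [ext, paths.size] }
     .to_h
end

# Only report when a build directory exists in the current directory.
if Dir.exist?('build')
  summarize_build.sort.each { |ext, count| puts "#{ext}\t#{count}" }
end
```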

Deploying the build to Amazon S3

Amazon Web Services is Amazon’s cloud computing platform. Amazon S3, or Simple Storage Service, is a simple data storage service. Because S3 “buckets” can be accessible over HTTP, S3 offers a great cloud-based hosting solution for static websites, such as middleman-demo.

While S3 is not free, it is generally extremely affordable. Amazon charges on a per-usage basis according to how much data you store and transfer and how many requests your bucket serves, including PUT requests (that is, uploads). Read more about S3 pricing on AWS’s pricing guide.

Let’s deploy the middleman-demo build to Amazon S3.

First, sign up for AWS. Through AWS’s web-based admin, create an IAM user and locate the corresponding “access key id” and “secret access key”:

1: Visit the AWS IAM console.

2: From the navigation menu, click Users.

3: Select your IAM user name.

4: Click User Actions; then click Manage Access Keys.

5: Click Create Access Key.

6: Click Download Credentials; store the keys in a secure location.

7: Store your access key id in an environment variable named AWS_ACCESS_KEY_ID:

$ export AWS_ACCESS_KEY_ID=your_access_key_id

8: Store your secret access key as an environment variable named AWS_SECRET_ACCESS_KEY:

$ export AWS_SECRET_ACCESS_KEY=your_secret_access_key

Note that these environment variables last only for the current shell session. To set them automatically in every session, add them to a file such as your ~/.bashrc:

export AWS_ACCESS_KEY_ID=your_access_key_id
export AWS_SECRET_ACCESS_KEY=your_secret_access_key
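Before running any of the deployment steps that follow, it can be worth confirming that both variables are actually visible to Ruby. Here is a small sketch of such a check; missing_aws_vars is a hypothetical helper name, and the only assumption is the two variable names used above.

```ruby
# Hypothetical helper: returns the names of the required AWS variables that
# are unset or blank in the given environment (defaults to the real ENV).
def missing_aws_vars(env = ENV)
  %w[AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY].select do |name|
    env[name].to_s.strip.empty?
  end
end

missing = missing_aws_vars
if missing.empty?
  puts 'AWS credentials found in the environment.'
else
  warn "Missing environment variables: #{missing.join(', ')}"
end
```

The check never prints the values themselves, so it is safe to run in a shared terminal.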

Creating an S3 bucket with Ruby

To deploy to S3, we’ll need to create a “bucket,” or an S3 endpoint to which the middleman-demo’s build directory can be deployed. This can be done via AWS’s management console, but we can also automate its creation with Ruby. We’ll use the aws-sdk Ruby gem and a Rake task to create an S3 bucket for middleman-demo.

Add the aws-sdk gem to middleman-demo’s Gemfile:

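# The Rake task below uses the AWS:: namespace from version 1 of the SDK;
# later major versions use Aws:: instead.
gem 'aws-sdk', '~> 1.0'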

Install the new gem:

$ bundle install

Create a Rakefile:

$ touch Rakefile

Add the following Ruby to the Rakefile; this code establishes a Rake task -- a quick command line utility -- to automate the creation of an S3 bucket:

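require 'aws-sdk'

desc "Create an AWS S3 bucket"
task :s3_bucket, [:bucket_name] do |task, args|
  # AWS::S3 is the aws-sdk v1 API; credentials are read from the
  # AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.
  s3 = AWS::S3.new(region: 'us-east-1')

  bucket = s3.buckets.create(args[:bucket_name])

  # Configure the bucket to serve its contents as a static website.
  bucket.configure_website do |config|
    config.index_document_suffix = 'index.html'
    config.error_document_key = 'error/index.html'
  end
end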

From the command line, use the newly established :s3_bucket Rake task to create a unique S3 bucket for your middleman-demo. Note that, if you have an existing domain you’d like to use, your bucket should be named www.yourdomain.com:

$ rake s3_bucket[some_unique_bucket_name]

For example, I named my S3 bucket www.middlemandemo.com by entering the following:

$ rake s3_bucket[www.middlemandemo.com]

After running rake s3_bucket[YOUR_BUCKET], you should see YOUR_BUCKET amongst the buckets listed in your AWS web console.
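If bucket creation fails, the name itself is a common culprit. The sketch below checks a candidate name against S3's general bucket naming rules as I understand them: 3 to 63 characters; lowercase letters, numbers, dots, and hyphens only; a letter or number at each end; no consecutive dots; and not formatted like an IP address. The valid_bucket_name? helper is hypothetical and is not part of aws-sdk; AWS's own documentation remains the authority on the rules.

```ruby
# Hypothetical helper: returns true if the name satisfies the general S3
# bucket naming rules described above.
def valid_bucket_name?(name)
  return false unless name.length.between?(3, 63)
  # Lowercase letters, digits, dots, hyphens; letter or digit at each end.
  return false unless name.match?(/\A[a-z0-9][a-z0-9.-]*[a-z0-9]\z/)
  return false if name.include?('..')
  # Names formatted like an IPv4 address are not allowed.
  return false if name.match?(/\A\d+\.\d+\.\d+\.\d+\z/)

  true
end

puts valid_bucket_name?('www.middlemandemo.com') # prints "true"
puts valid_bucket_name?('My-Bucket')             # prints "false"
```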

Creating an error template

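Our Rake task specifies a config.error_document_key whose value is error/index.html. This configures your S3 bucket to serve error/index.html for error responses, such as 404s. Assuming directory indexes are enabled in config.rb, Middleman builds a source/error.html.erb template to build/error/index.html.

Create a source/error.html.erb template: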

$ touch source/error.html.erb

And add the following:

---
title: Oops - something went wrong
---

<h2><%= current_page.data.title %></h2>

Deploying to your S3 bucket

With an S3 bucket established, the middleman-sync Ruby gem can be used to automate uploading middleman-demo builds to S3.

Add the middleman-sync gem to the Gemfile:

gem 'middleman-sync'

Install the middleman-sync gem:

$ bundle install

Add the necessary middleman-sync configuration to config.rb:

activate :sync do |sync|
 sync.fog_provider = 'AWS'
 sync.fog_region = 'us-east-1'
 sync.fog_directory = '<YOUR_BUCKET>'
 sync.aws_access_key_id = ENV['AWS_ACCESS_KEY_ID']
 sync.aws_secret_access_key = ENV['AWS_SECRET_ACCESS_KEY']
end

Build and deploy middleman-demo:

$ middleman build && middleman sync

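Note: if your deployment fails with an error such as 'post_connection_check': hostname "YOUR_BUCKET" does not match the server certificate (OpenSSL::SSL::SSLError) (Excon::Errors::SocketError), it’s likely due to an open issue with middleman-sync. Bucket names that contain dots, such as www.yourdomain.com, do not match Amazon’s wildcard SSL certificate when used as a hostname; path-style requests avoid the problem. To work around this issue, add the following to the top of config.rb: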

require 'fog'

Fog.credentials = { path_style: true }

Now, middleman-demo is browsable online at http://YOUR_BUCKET.s3-website-us-east-1.amazonaws.com/
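The website URL follows directly from the bucket name and region, so it can be derived in code, for example to print it at the end of a deploy. This is a minimal sketch: s3_website_url is a hypothetical helper, and it assumes the dash-style s3-website-REGION endpoint used by us-east-1 in this tutorial. Some other regions use a dot-style endpoint instead, so check AWS's endpoint list if you deploy elsewhere.

```ruby
# Hypothetical helper: builds the S3 static website URL for a bucket.
# Assumes the dash-style "s3-website-REGION" endpoint, as in us-east-1.
def s3_website_url(bucket, region = 'us-east-1')
  "http://#{bucket}.s3-website-#{region}.amazonaws.com/"
end

puts s3_website_url('www.middlemandemo.com')
# prints "http://www.middlemandemo.com.s3-website-us-east-1.amazonaws.com/"
```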

Using a custom domain

With middleman-demo deployed to an S3 bucket whose name matches a domain name, a custom domain can be configured easily.

To use a custom domain, log into your domain management provider and add a CNAME record mapping www.yourdomain.com to www.yourdomain.com.s3-website-us-east-1.amazonaws.com.

While the exact process for managing a CNAME varies between domain name providers, it is generally fairly simple. Note that your S3 bucket name must exactly match your domain name.
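DNS changes can take a while to propagate, so it is handy to check the record from Ruby rather than guessing. The following sketch uses the standard library's Resolv to look up a domain's CNAME records and compare them against the expected S3 website endpoint. The helper names are hypothetical, and the endpoint format again assumes us-east-1's dash-style hostname.

```ruby
require 'resolv'

# Hypothetical helper: the CNAME target an S3-hosted site should point at.
# Assumes the dash-style "s3-website-REGION" endpoint, as in us-east-1.
def expected_cname_target(domain, region = 'us-east-1')
  "#{domain}.s3-website-#{region}.amazonaws.com"
end

# Looks up the CNAME records currently published for a domain.
def cname_targets(domain)
  Resolv::DNS.open do |dns|
    dns.getresources(domain, Resolv::DNS::Resource::IN::CNAME)
       .map { |record| record.name.to_s }
  end
end

# True if one of the published CNAME records matches the expected target.
# The lookup is injectable so the comparison can be tried without a network.
def cname_configured?(domain, region = 'us-east-1', lookup = method(:cname_targets))
  expected = expected_cname_target(domain, region).downcase
  lookup.call(domain).map(&:downcase).include?(expected)
end
```

Once the record has been added, calling cname_configured?('www.yourdomain.com') performs a live DNS query and returns true when the CNAME points at the bucket's website endpoint.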

Recap

We’ve examined the benefits of static site generators and covered some basics regarding Middleman blogging. We’ve learned how to use the wp2middleman gem to migrate content from a WordPress blog, and we’ve learned how to deploy Middleman to Amazon’s cloud-based Simple Storage Service (S3).

About this author

Mike Ball is a Philadelphia-based software developer specializing in Ruby on Rails and JavaScript. He works for Comcast Interactive Media where he helps build web-based TV and video consumption applications.