Parsing dates and times
One difficult issue when normalizing and cleaning up data is how to deal with time. People enter dates and times in a bewildering variety of formats; some of them are ambiguous, and some of them are vague. However, we have to do our best to interpret them and normalize them into a standard format.
In this recipe, we'll define a function that attempts to parse a date into a standard string format. We'll use the clj-time Clojure library, which is a wrapper around the Joda-Time Java library (http://joda-time.sourceforge.net/).
Getting ready
First, we need to declare our dependencies in the Leiningen project.clj file:
(defproject cleaning-data "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.6.0"]
                 [clj-time "0.9.0-beta1"]])
Then, we need to load these dependencies into our script or REPL. We'll exclude extend and second from clj-time.core to keep them from clashing with clojure.core/extend and clojure.core/second:
(use '[clj-time.core :exclude (extend second)]
     '[clj-time.format])
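To confirm that the library loaded correctly, a quick round trip at the REPL can help. The following is only a minimal sketch: it assumes the built-in :date formatter from clj-time.format, and the tf alias and the iso-date name are chosen here purely for illustration.

```clojure
;; Load clj-time.format under an alias (tf is an illustrative name)
;; so the check does not depend on the use form.
(require '[clj-time.format :as tf])

;; formatters is clj-time's map of built-in formatters;
;; :date is the ISO yyyy-MM-dd formatter.
(def iso-date (tf/formatters :date))

;; Parse a string into a DateTime, then write it back out as a string.
(tf/unparse iso-date (tf/parse iso-date "2012-09-12"))
;; => "2012-09-12"
```

If this evaluates without a ClassNotFoundException or a namespace error, the dependency is on the classpath and ready to use.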