
You're reading from Mastering Clojure Data Analysis

Product type: Book
Published in: May 2014
Reading level: Beginner
ISBN-13: 9781783284139
Edition: 1st Edition
Author (1)
Eric Richard Rochester

Eric Richard Rochester studied medieval English literature and linguistics at UGA. Dissertated on lexicography. Now he programs in Haskell and writes. He's also a husband and parent.

Running the experiment


Remember, earlier we defined functions that break a sequence of tokens into features of various sorts: unigrams, bigrams, trigrams, and POS-tagged unigrams. We can use these to automatically test both classifiers against each of these feature types. Let's see how.

First, we'll define some top-level variables that associate label keywords with the functions that we want to test at that point in the process (that is, classifiers or feature-generators):

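;; Classifiers to compare, keyed by a label keyword.
(def classifiers
  {:naive-bayes a/k-fold-naive-bayes
   :maxent a/k-fold-logistic})

;; Feature generators, keyed by a label keyword. The :pos entry loads
;; the POS model once and closes over it.
(def feature-factories
  {:unigram t/unigrams
   :bigram t/bigrams
   :trigram t/trigrams
   :pos (let [pos-model
              (t/read-me-tagger "data/en-pos-maxent.bin")]
          (fn [ts] (t/with-pos pos-model ts)))})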
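
The :pos entry wraps its function in a let so that the model is read once, when the map is defined, rather than on every call. The following is a minimal sketch of that pattern only. The names load-model, load-count, and tagger are hypothetical stand-ins, not functions from the book's code, and the "model" here is just a map.

```clojure
;; Counts how many times the (hypothetical) loader runs.
(def load-count (atom 0))

(defn load-model
  "Hypothetical stand-in for an expensive model loader."
  [path]
  (swap! load-count inc)
  {:path path})

;; The let body runs once, at definition time; the returned fn
;; closes over the already-loaded model.
(def tagger
  (let [model (load-model "model.bin")]
    (fn [tokens]
      (mapv (fn [token] [token (:path model)]) tokens))))

(tagger ["a" "b"])
(tagger ["c"])
@load-count ; still 1: the model was loaded only once
```

Had the loader call been placed inside the fn body instead, it would run on every invocation, which would be wasteful for a large model file.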

We can now iterate over both of these hash maps and cross-validate these classifiers on these features. We'll average the error information (the precision and recall) for all of them and return the averages. Once we've executed...
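
As a rough illustration of iterating over both maps and averaging the results, here is a minimal sketch. It is not the book's implementation. It assumes, purely for illustration, that each classifier function takes a feature function and a collection of documents and returns one {:precision ... :recall ...} map per fold; the real signatures of a/k-fold-naive-bayes and a/k-fold-logistic may differ. The names mean, average-results, run-experiment, demo-classifiers, and demo-features are all hypothetical.

```clojure
(defn mean
  "Arithmetic mean of a non-empty collection of numbers."
  [xs]
  (/ (reduce + xs) (double (count xs))))

(defn average-results
  "Averages :precision and :recall over a seq of per-fold result maps."
  [fold-results]
  {:precision (mean (map :precision fold-results))
   :recall    (mean (map :recall fold-results))})

(defn run-experiment
  "Runs every classifier against every feature factory and returns a map
  from [classifier-key feature-key] to averaged precision and recall.
  ASSUMPTION: each classifier fn is called as (classifier feature-fn docs)
  and returns a seq of {:precision .. :recall ..} maps, one per fold."
  [classifiers feature-factories docs]
  (into {}
        (for [[c-key c-fn] classifiers
              [f-key f-fn] feature-factories]
          [[c-key f-key] (average-results (c-fn f-fn docs))])))

;; Hypothetical stand-ins so the sketch runs without the book's namespaces.
(def demo-classifiers
  {:fixed (fn [_feature-fn _docs]
            [{:precision 0.5 :recall 1.0}
             {:precision 1.0 :recall 0.5}])})

(def demo-features
  {:unigram identity
   :bigram  (fn [ts] (partition 2 1 ts))})

(run-experiment demo-classifiers demo-features [])
;; => {[:fixed :unigram] {:precision 0.75, :recall 0.75},
;;     [:fixed :bigram]  {:precision 0.75, :recall 0.75}}
```

Keying the result by a [classifier feature] pair keeps every combination addressable, so adding another entry to either map extends the experiment without changing run-experiment.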
