
You're reading from Mastering Clojure Data Analysis

Product type: Book
Published in: May 2014
Reading level: Beginner
ISBN-13: 9781783284139
Edition: 1st Edition
Author: Eric Richard Rochester

Eric Richard Rochester studied medieval English literature and linguistics at UGA and wrote his dissertation on lexicography. He now programs in Haskell and writes. He's also a husband and parent.

Improving the results


What could we do to improve these results?

First, we should improve the test and training sets. It would be good to have multiple raters: for example, have each review rated independently three times and use the rating that at least two of the three raters chose.
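The multiple-rater idea can be sketched in a few lines of Clojure. This is only an illustration, not code from the book: the function names `majority-rating` and `resolve-ratings`, the `:positive`/`:negative` labels, and the map of review IDs to rating vectors are all assumptions made for the example.

```clojure
;; Hypothetical sketch: resolve several independent ratings per review
;; by majority vote. Names and labels here are illustrative assumptions.

(defn majority-rating
  "Returns the rating chosen by at least `min-votes` raters, or nil
  when there are no ratings or no rating reaches that threshold."
  [ratings min-votes]
  (when (seq ratings)
    (let [[rating votes] (apply max-key val (frequencies ratings))]
      (when (>= votes min-votes)
        rating))))

(defn resolve-ratings
  "Maps each review ID to its agreed rating, requiring two matching
  votes. Reviews without a majority are dropped."
  [ratings-by-review]
  (into {}
        (keep (fn [[review-id ratings]]
                (when-let [rating (majority-rating ratings 2)]
                  [review-id rating])))
        ratings-by-review))

(resolve-ratings {1 [:positive :positive :negative]
                  2 [:negative :negative :negative]
                  3 [:positive :negative]})
;; => {1 :positive, 2 :negative}
```

Dropping reviews that have no majority is one possible design choice; such reviews could instead be sent back for another round of rating.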

Most importantly, we'd like larger and better test and training sets. For this type of problem, 500 observations is at the low end of what you can do anything useful with, and you can expect the results to improve with more observations. However, I do need to stress that more training data doesn't necessarily imply better results. It could help, but there are no guarantees.

We could also look at improving the features. We could select them more carefully, because too many useless or unneeded features can make the classifier perform poorly. We could also add different features, such as dates or information about the informants; if we had any such data, it might be useful.
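One simple way to select features more carefully is to filter tokens by document frequency: drop tokens that appear in too few reviews to generalize and tokens that appear in nearly every review. The sketch below assumes each review has already been tokenized into a vector of strings. The function names and the thresholds are hypothetical, and this is just one of many possible selection strategies, not necessarily the one the book goes on to use.

```clojure
;; Hypothetical sketch: feature selection by document frequency.
;; `docs` is assumed to be a collection of token vectors.

(defn document-frequencies
  "Counts, for each token, how many documents contain it at least once."
  [docs]
  (frequencies (mapcat distinct docs)))

(defn select-features
  "Returns the set of tokens that occur in at least `min-docs` documents
  and in no more than `max-ratio` of all documents."
  [docs min-docs max-ratio]
  (let [n (count docs)]
    (into #{}
          (comp (filter (fn [[_ df]]
                          (and (>= df min-docs)
                               (<= (/ df n) max-ratio))))
                (map key))
          (document-frequencies docs))))

(defn prune-doc
  "Keeps only the selected tokens in a document, preserving order."
  [selected doc]
  (filterv selected doc))

(def docs
  [["good" "food" "the"]
   ["bad" "food" "the"]
   ["the" "service" "good"]
   ["the" "the" "slow"]])

(def selected (select-features docs 2 0.9))
;; => #{"good" "food"}

(prune-doc selected ["the" "food" "was" "good"])
;; => ["food" "good"]
```

Here "the" is dropped because it occurs in every document, while "bad", "service", and "slow" are dropped because each occurs in only one.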

There...
