Mastering Clojure Data Analysis


Product type: Book
Published: May 2014
ISBN-13: 9781783284139
Pages: 340
Edition: 1st Edition
Author: Eric Richard Rochester

Table of Contents (17) Chapters

Mastering Clojure Data Analysis
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Network Analysis – The Six Degrees of Kevin Bacon
GIS Analysis – Mapping Climate Change
Topic Modeling – Changing Concerns in the State of the Union Addresses
Classifying UFO Sightings
Benford's Law – Detecting Natural Progressions of Numbers
Sentiment Analysis – Categorizing Hotel Reviews
Null Hypothesis Tests – Analyzing Crime Data
A/B Testing – Statistical Experiments for the Web
Analyzing Social Data Participation
Modeling Stock Data
Index

Cross-validating the results


As I've already mentioned, the dataset for this chapter is a manually coded group of 500 hotel reviews taken from the OpinRank dataset. For this experiment, we'll break these into 10 chunks of 50 reviews each.
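As a minimal sketch of the chunking step, the snippet below splits 500 items into 10 chunks of 50. The function name `chunk-reviews` is hypothetical, the integers are stand-ins for the coded reviews, and the shuffle is an assumption added here (the text does not say how reviews are assigned to chunks).

```clojure
(defn chunk-reviews
  "Shuffles the reviews (an assumption, not stated in the text) and splits
  them into consecutive chunks of chunk-size items."
  [reviews chunk-size]
  (partition-all chunk-size (shuffle reviews)))

;; Stand-in data: 500 integers in place of the 500 coded hotel reviews.
(def chunks (chunk-reviews (range 500) 50))

(count chunks)      ; => 10
(map count chunks)  ; every chunk holds 50 items
```

Because `partition-all` keeps a shorter final chunk rather than dropping it, no review is lost if the total is not an exact multiple of the chunk size.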

These chunks will allow us to use K-fold cross validation to test how our system is doing. Cross validation is a way of checking your algorithm and procedures by splitting your data up into equally sized chunks. You then train your system on all of the chunks but one; those chunks are the training set, and the chunk you held out is the validation set. You calculate the error after running the trained system on the validation set. Then, you use the next chunk as the validation set and start over again. Finally, you average the error across all of the trials.
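The procedure can be sketched in a few lines. This is an illustration under stated assumptions, not the chapter's own implementation: `k-fold-splits`, `cross-validate`, `train-majority`, `error-rate`, and `toy-chunks` are all hypothetical names, and the "classifier" is a toy that always predicts the most common training label.

```clojure
(defn k-fold-splits
  "Returns one [training validation] pair per chunk: each chunk is held
  out once while the remaining chunks are concatenated for training."
  [chunks]
  (let [chunks (vec chunks)]
    (for [i (range (count chunks))]
      [(apply concat (concat (subvec chunks 0 i) (subvec chunks (inc i))))
       (nth chunks i)])))

(defn cross-validate
  "train-fn maps training items to a model; error-fn maps a model and the
  validation items to a number. Returns the mean error over all trials."
  [train-fn error-fn chunks]
  (let [errors (for [[training validation] (k-fold-splits chunks)]
                 (error-fn (train-fn training) validation))]
    (/ (reduce + errors) (count errors))))

;; Toy stand-ins (assumptions): the model is just the majority label.
(defn train-majority [training]
  (key (apply max-key val (frequencies (map :label training)))))

(defn error-rate [model validation]
  (/ (count (remove #(= model (:label %)) validation))
     (count validation)))

(def toy-chunks
  [[{:label :pos} {:label :pos} {:label :neg}]
   [{:label :pos} {:label :pos} {:label :pos}]
   [{:label :neg} {:label :pos} {:label :pos}]
   [{:label :pos} {:label :neg} {:label :neg}]])

(cross-validate train-majority error-rate toy-chunks)  ; => 1/3
```

Passing the training and error functions as arguments keeps the splitting logic independent of any particular classifier, so the same helper could wrap whatever model is being evaluated.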

For example, suppose the validation procedure uses four folds, A, B, C, and D. For the first run, A, B, and C would be the training set, and D would be the validation set. Next, A, B, and D would be the training set, and C would be the validation set. This would continue until every...
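The rotation in the four-fold example can be listed directly. In this small sketch the function name `fold-rotations` is hypothetical, the keywords stand in for the folds, and the folds are held out in reverse order only to match the order of the example (D first, then C).

```clojure
(defn fold-rotations
  "For each fold, taken last to first, pairs the held-out fold with the
  remaining folds used for training."
  [folds]
  (for [held-out (reverse folds)]
    {:training   (remove #{held-out} folds)
     :validation held-out}))

(fold-rotations [:A :B :C :D])
;; => ({:training (:A :B :C), :validation :D}
;;     {:training (:A :B :D), :validation :C}
;;     {:training (:A :C :D), :validation :B}
;;     {:training (:B :C :D), :validation :A})
```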
