You're reading from Mastering Clojure Data Analysis

Product type: Book
Published in: May 2014
Reading level: Beginner
ISBN-13: 9781783284139
Edition: 1st Edition
Author: Eric Richard Rochester

Eric Richard Rochester studied medieval English literature and linguistics at UGA and wrote his dissertation on lexicography. He now programs in Haskell and writes. He's also a husband and parent.

Getting the data


For this chapter, acquiring the data is relatively easy. In other chapters, this step involves screen scraping, SPARQL, or other data extraction, munging, and cleaning techniques. For this dataset, we'll just download it from Infochimps (http://www.infochimps.com/). Infochimps is a company, with a website of the same name, devoted to Big Data and data analysis. It provides a collection of datasets that are online and freely available. To download this specific dataset, browse to http://www.infochimps.com/datasets/60000-documented-ufo-sightings-with-text-descriptions-and-metada and download the data from the link there, as shown in the following screenshot:

The data comes as a ZIP-compressed file. Extracting it creates the chimps_16154-2010-10-20_14-33-35 directory, which contains a file that lists metadata for the dataset as well as the data itself in several different formats. For the purposes of this chapter, we'll use the tab-separated values (TSV) file...
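To make the TSV format concrete, here is a minimal sketch of how such a file could be read in Clojure. It is not the chapter's own loading code. The namespace `ufo.sketch` and the functions `parse-tsv-line` and `read-tsv` are hypothetical names chosen for illustration, and the sketch assumes one record per line with plain tab-separated fields and no quoting or escaping.

```clojure
(ns ufo.sketch
  ;; Hypothetical namespace; only the Clojure standard library is used.
  (:require [clojure.java.io :as io]
            [clojure.string :as str]))

(defn parse-tsv-line
  "Splits one line on tab characters. The -1 limit keeps trailing
  empty fields, so every row keeps the same number of columns."
  [line]
  (str/split line #"\t" -1))

(defn read-tsv
  "Eagerly reads the TSV file at `path` into a vector of rows, where
  each row is a vector of strings. Assumes no quoted or escaped tabs."
  [path]
  (with-open [rdr (io/reader path)]
    ;; mapv is eager, so all lines are consumed before the reader closes.
    (mapv parse-tsv-line (line-seq rdr))))
```

Calling `(read-tsv "path/to/file.tsv")` with the path of the extracted TSV file (the path here is a placeholder) would return the rows as vectors of strings. Reading eagerly keeps the sketch simple, while a very large file might call for processing the lines lazily inside `with-open` instead.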
