Reader small image

You're reading from  Mastering Clojure Data Analysis

Product typeBook
Published inMay 2014
Reading LevelBeginner
Publisher
ISBN-139781783284139
Edition1st Edition
Languages
Right arrow
Author (1)
Eric Richard Rochester
Eric Richard Rochester
author image
Eric Richard Rochester

Eric Richard Rochester Studied medieval English literature and linguistics at UGA. Dissertated on lexicography. Now he programs in Haskell and writes. He's also a husband and parent.
Read more about Eric Richard Rochester

Right arrow

Getting the data


A couple of small datasets of the Facebook network data are available on the Internet. None of them are particularly large or complete, but they do give us a reasonable snapshot of part of Facebook's network. As the Facebook graph is a private data source, this partial view is probably the best that we can hope for.

We'll get the data from the Stanford Large Network Dataset Collection (http://snap.stanford.edu/data/). This contains a number of network datasets, from Facebook and Twitter, to road networks and citation networks. To do this, we'll download the facebook.tar.gz file from http://snap.stanford.edu/data/egonets-Facebook.html. Once it's on your computer, you can extract it. When I put it into the folder with my source code, it created a directory named facebook.

The directory contains 10 sets of files. Each group is based on one primary vertex (user), and each contains five files. For vertex 0, these files would be as follows:

  • 0.edges: This contains the vertices that the primary one links to.

  • 0.circles: This contains the groupings that the user has created for his or her friends.

  • 0.feat: This contains the features of the vertices that the user is adjacent to and ones that are listed in 0.edges.

  • 0.egofeat: This contains the primary user's features.

  • 0.featnames: This contains the names of the features described in 0.feat and 0.egofeat. For Facebook, these values have been anonymized.

For these purposes, we'll just use the *.edges files.

Now let's turn our attention to the data in the files and what they represent.

Previous PageNext Page
You have been reading a chapter from
Mastering Clojure Data Analysis
Published in: May 2014Publisher: ISBN-13: 9781783284139
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Author (1)

author image
Eric Richard Rochester

Eric Richard Rochester Studied medieval English literature and linguistics at UGA. Dissertated on lexicography. Now he programs in Haskell and writes. He's also a husband and parent.
Read more about Eric Richard Rochester