Visualizing my Social Graph with d3.js

by Hector Cuesta | October 2013 | Open Source

In this article written by Hector Cuesta, the author of Practical Data Analysis, you will learn how to create a social graph visualization of your Facebook Friends with d3.js.

The Social Networks Analysis

Social Networks Analysis (SNA) is not new; sociologists have been using it for a long time to study human relationships (sociometry), to find communities, and to simulate how information or a disease spreads in a population.

With the rise of social networking sites such as Facebook, Twitter, and LinkedIn, acquiring large amounts of social network data has become easier. We can use SNA to get insight into customer behavior or unknown communities. This is not a trivial task: we will come across sparse data and a lot of noise (meaningless data), and we need to understand how to distinguish between false correlation and causation. A good start is getting to know our graph through visualization and statistical analysis.

Social networking sites give us the opportunity to ask questions that would otherwise be too hard to approach, because polling enough people is time-consuming and expensive.

In this article, we will obtain our social network's graph from the Facebook (FB) website in order to visualize the relationships between our friends. Finally, we will create an interactive visualization of our graph using D3.js.

Getting ready

The easiest method to get our friends list is by using a third-party application. Netvizz is a Facebook app developed by Bernhard Rieder, which allows exporting social graph data to gdf and tab formats. Netvizz may export information about our friends such as gender, age, locale, posts, and likes.

In order to get our social graph from Netvizz, we need to open the link below and give the app access to our Facebook profile.

https://apps.facebook.com/netvizz/

As shown in the following screenshot, we will create a gdf file from our personal friend network by clicking on the link named "here" in Step 2.

Then we will download the GDF file (GDF is the graph file format used by the GUESS tool). Netvizz will give us the number of nodes and edges (links); finally, we will click on the gdf file link, as we can see in the following screenshot:

The output file myFacebookNet.gdf will look like this:

nodedef>name VARCHAR,label VARCHAR,gender VARCHAR,locale VARCHAR,agerank INT
23917067,Jorge,male,en_US,106
23931909,Haruna,female,en_US,105
35702006,Joseph,male,en_US,104
503839109,Damian,male,en_US,103
532735006,Isaac,male,es_LA,102
. . .
edgedef>node1 VARCHAR,node2 VARCHAR
23917067,35702006
23917067,629395837
23917067,747343482
23917067,755605075
23917067,1186286815
. . .

In the following screenshot we can see the visualization of the graph (106 nodes and 279 links). The nodes represent my friends and the links represent how my friends are connected to each other.
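
Before drawing anything, the node and link counts already allow a little of the statistical analysis mentioned earlier. The following is a minimal sketch, not part of the original article code: the helper names degree_counts and density are hypothetical, the edge list is just the five sample lines from the GDF excerpt above, and the graph is assumed to be undirected.

```python
from collections import Counter

def degree_counts(edges):
    """Count how many links touch each node (undirected graph assumed)."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree

def density(num_nodes, num_edges):
    """Fraction of all possible undirected links that are actually present."""
    return 2 * num_edges / (num_nodes * (num_nodes - 1))

# The five sample edges shown in the GDF excerpt (not the full file).
sample_edges = [
    ("23917067", "35702006"),
    ("23917067", "629395837"),
    ("23917067", "747343482"),
    ("23917067", "755605075"),
    ("23917067", "1186286815"),
]

degrees = degree_counts(sample_edges)
mean_degree = 2 * 279 / 106          # counts reported for the full graph
graph_density = density(106, 279)

print(degrees["23917067"])           # links touching the first friend in the sample
print(round(mean_degree, 2))         # average number of links per friend
print(round(graph_density, 3))       # share of possible friendships that exist
```

For the full graph this gives an average of about 5.26 links per friend and a density of about 0.05, so only around 5% of the possible friend-to-friend links exist, which is consistent with the sparse data mentioned in the introduction.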

Transforming GDF to JSON

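In order to work with the graph on the web with d3.js, we need to transform our gdf file into JSON format. The steps below read the node section and the edge section from two separate files, nodes.csv and links.csv, each keeping its header line, so the two sections of the gdf file need to be saved separately first.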
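
Since the conversion steps read nodes.csv and links.csv rather than the gdf file itself, the gdf file has to be split at some point. The following is a minimal sketch of one way to do that, assuming the file has exactly the nodedef> and edgedef> sections shown earlier; the function name split_gdf is hypothetical and not part of the original article code.

```python
def split_gdf(gdf_text):
    """Split GDF text into (node_lines, edge_lines).

    Each returned list starts with its header row (without the
    'nodedef>' or 'edgedef>' prefix), followed by the data rows.
    """
    node_lines, edge_lines = [], []
    current = None  # the list that the current line belongs to
    for raw in gdf_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("nodedef>"):
            current = node_lines
            line = line[len("nodedef>"):]
        elif line.startswith("edgedef>"):
            current = edge_lines
            line = line[len("edgedef>"):]
        if current is not None:
            current.append(line)
    return node_lines, edge_lines

# A shortened sample in the same shape as myFacebookNet.gdf.
sample = """nodedef>name VARCHAR,label VARCHAR,gender VARCHAR,locale VARCHAR,agerank INT
23917067,Jorge,male,en_US,106
23931909,Haruna,female,en_US,105
edgedef>node1 VARCHAR,node2 VARCHAR
23917067,35702006
"""

node_lines, edge_lines = split_gdf(sample)
print(len(node_lines), len(edge_lines))
```

Writing "\n".join(node_lines) to nodes.csv and "\n".join(edge_lines) to links.csv would then produce the two files, each with one header row, which matches the skip_header=1 argument used in the steps below.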

  1. Firstly, we need to import the libraries numpy and json.

    import numpy as np
    import json

  2. The numpy function genfromtxt reads the nodes.csv file. The usecols argument keeps only the first two columns (the ID and the name), and dtype='object' stores the values as generic Python objects.

    nodes = np.genfromtxt("nodes.csv",
                          dtype='object',
                          delimiter=',',
                          skip_header=1,
                          usecols=(0,1))

  3. Then, genfromtxt reads the links.csv file in the same way. Here the two columns kept by usecols are the source node and the target node of each link.

    links = np.genfromtxt("links.csv",
                          dtype='object',
                          delimiter=',',
                          skip_header=1,
                          usecols=(0,1))

    The JSON format used in the D3.js Force Layout graph implemented in this article requires transforming the ID (for example, 100001448673085) into a numerical position in the list of nodes.

  4. Then, we need to look for each appearance of an ID in the links and replace it with the position of that node in the list of nodes.

    # Replace each Facebook ID in links with the index of that node in nodes.
    for n in range(len(nodes)):
        for ls in range(len(links)):
            if nodes[n][0] == links[ls][0]:
                links[ls][0] = n
            if nodes[n][0] == links[ls][1]:
                links[ls][1] = n

  5. Now, we need to create a dictionary "data" to store the JSON file.

    data ={}

  6. Next, we need to create a list of nodes with the names of the friends, in the following format, and add it to the data dictionary:

    "nodes": [{"name": "X"},{"name": "Y"},. . .]

    lst = []
    for x in nodes:
        d = {}
        # str() of a bytes value looks like b'Jorge'; strip that wrapper.
        d["name"] = str(x[1]).replace("b'","").replace("'","")
        lst.append(d)
    data["nodes"] = lst

  7. Now, we need to create a list of links with the source and target, in the following format, and add it to the data dictionary:

    "links": [{"source": 0, "target": 2},{"source": 1, "target": 2},. . .]

    lnks = []
    for ls in links:
        d = {}
        # After step 4, ls[0] and ls[1] hold positions in the list of nodes.
        d["source"] = ls[0]
        d["target"] = ls[1]
        lnks.append(d)
    data["links"] = lnks

  8. Finally, we need to create the file, newJson.json, and write the data dictionary in the file with the function dumps of the json library.

    with open("newJson.json","w") as f:
        f.write(json.dumps(data))
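
The nested loop in step 4 compares every node with every link, which is fine for a hundred friends but grows quickly with larger graphs. As a possible alternative, the following minimal sketch performs steps 4 to 7 with a dictionary lookup. It is not the author's code: the function name build_graph is hypothetical, the rows are assumed to be plain strings rather than numpy values, and skipping links whose IDs are missing from the node list is a design choice made here for illustration.

```python
import json

def build_graph(node_rows, link_rows):
    """Build the {"nodes": [...], "links": [...]} dictionary.

    node_rows: list of (id, name) pairs.
    link_rows: list of (source_id, target_id) pairs.
    """
    # Map each ID to its position in the list of nodes.
    index_of = {node_id: i for i, (node_id, _name) in enumerate(node_rows)}
    nodes = [{"name": name} for _node_id, name in node_rows]
    links = [
        {"source": index_of[s], "target": index_of[t]}
        for s, t in link_rows
        if s in index_of and t in index_of  # assumption: drop unknown IDs
    ]
    return {"nodes": nodes, "links": links}

node_rows = [("23917067", "Jorge"), ("23931909", "Haruna"), ("35702006", "Joseph")]
link_rows = [("23917067", "35702006"), ("23917067", "629395837")]

graph = build_graph(node_rows, link_rows)
print(json.dumps(graph))
```

With these three sample nodes, the first link becomes source 0 and target 2, as in the article's output, while the second link is dropped because its target ID is not among the three nodes.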

The file newJson.json will look as follows:

{"nodes": [{"name": "Jorge"},
{"name": "Haruna"},
{"name": "Joseph"},
{"name": "Damian"},
{"name": "Isaac"},
. . .],
"links": [{"source": 0, "target": 2},
{"source": 0, "target": 12},
{"source": 0, "target": 20},
{"source": 0, "target": 23},
{"source": 0, "target": 31},
. . .]}
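
Because the force layout looks up each link's source and target by position in the list of nodes, it can help to check the generated data before opening it in the browser. The following is a minimal sketch of such a check; the function name check_graph and the message format are hypothetical.

```python
def check_graph(data):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    num_nodes = len(data.get("nodes", []))
    for i, link in enumerate(data.get("links", [])):
        for end in ("source", "target"):
            value = link.get(end)
            # Each end of a link must be an integer index into the node list.
            if not isinstance(value, int) or not 0 <= value < num_nodes:
                problems.append(
                    "link %d: %s=%r is not a valid node index" % (i, end, value)
                )
    return problems

good = {"nodes": [{"name": "Jorge"}, {"name": "Haruna"}],
        "links": [{"source": 0, "target": 1}]}
bad = {"nodes": [{"name": "Jorge"}],
       "links": [{"source": 0, "target": 5}]}

print(check_graph(good))
print(check_graph(bad))
```

A link that still carries a raw Facebook ID, or an index beyond the end of the node list, shows up in the returned list instead of failing later inside the visualization.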

Graph visualization with D3.js

D3.js provides the d3.layout.force() function, which uses a force-directed layout algorithm and helps us to visualize our graph.

  1. First, we need to define the CSS style for the nodes, links, and node labels.

    <style>
    .link {
        fill: none;
        stroke: #666;
        stroke-width: 1.5px;
    }
    .node circle {
        fill: steelblue;
        stroke: #fff;
        stroke-width: 1.5px;
    }
    .node text {
        pointer-events: none;
        font: 10px sans-serif;
    }
    </style>

  2. Then, we need to reference the D3.js library.

    <script src = "http://d3js.org/d3.v3.min.js"></script>

  3. Then, inside a script tag, we need to define the width and height parameters for the svg container and append the container to the body tag.

    <script>
    var width = 1100,
        height = 800;
    var svg = d3.select("body").append("svg")
        .attr("width", width)
        .attr("height", height);

  4. Now, we define the properties of the force layout: gravity, link distance, charge, and size.

    var force = d3.layout.force()
        .gravity(.05)
        .distance(150)
        .charge(-100)
        .size([width, height]);

  5. Then, we need to load the graph data from the JSON file and pass its nodes and links to the force layout.

    d3.json("newJson.json", function(error, json) {
        // Stop early if the file could not be loaded or parsed.
        if (error) throw error;
        force
            .nodes(json.nodes)
            .links(json.links)
            .start();

    For a complete reference about the d3js Force Layout implementation, visit the link https://github.com/mbostock/d3/wiki/Force-Layout.

  6. Then, we define the links as lines and the nodes as draggable groups, both bound to the json data.

    var link = svg.selectAll(".link")
        .data(json.links)
        .enter().append("line")
        .attr("class", "link");
    var node = svg.selectAll(".node")
        .data(json.nodes)
        .enter().append("g")
        .attr("class", "node")
        .call(force.drag);

  7. Now, we draw each node as a circle of radius 6 and add a label with the name of the node.

    node.append("circle")
        .attr("r", 6);
    node.append("text")
        .attr("dx", 12)
        .attr("dy", ".35em")
        .text(function(d) { return d.name });

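  8. Finally, the tick function updates the positions of the links and nodes at each step of the force layout simulation.

    force.on("tick", function() {
        // Move each line so that it joins its source and target nodes.
        link.attr("x1", function(d) { return d.source.x; })
            .attr("y1", function(d) { return d.source.y; })
            .attr("x2", function(d) { return d.target.x; })
            .attr("y2", function(d) { return d.target.y; });
        // Move each node group (circle and label) to its new position.
        node.attr("transform", function(d) {
            return "translate(" + d.x + "," + d.y + ")";
        });
    });
    });  // end of the d3.json callback opened in step 5
    </script>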

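To give a feel for what a force-directed layout does at each tick, the following is a toy sketch in Python of a single simulation step: every pair of nodes pushes apart, and every link acts like a spring with a preferred length. It is a simplified illustration and not D3's implementation. The function name force_step and the step parameter are hypothetical, the defaults only borrow the charge and distance values used above, and gravity is left out.

```python
import math

def force_step(positions, links, charge=-100.0, link_distance=150.0, step=0.01):
    """Return new (x, y) positions after one toy force-directed step."""
    moves = [[0.0, 0.0] for _ in positions]

    # Repulsion: with a negative charge, every pair of nodes pushes apart.
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            dist = math.hypot(dx, dy) or 1e-9  # avoid dividing by zero
            push = -charge / (dist * dist)
            moves[i][0] += push * dx / dist
            moves[i][1] += push * dy / dist
            moves[j][0] -= push * dx / dist
            moves[j][1] -= push * dy / dist

    # Springs: each link pulls its ends toward the preferred link distance.
    for s, t in links:
        dx = positions[t][0] - positions[s][0]
        dy = positions[t][1] - positions[s][1]
        dist = math.hypot(dx, dy) or 1e-9
        pull = dist - link_distance  # positive when the link is too long
        moves[s][0] += pull * dx / dist
        moves[s][1] += pull * dy / dist
        moves[t][0] -= pull * dx / dist
        moves[t][1] -= pull * dy / dist

    return [(x + step * mx, y + step * my)
            for (x, y), (mx, my) in zip(positions, moves)]

# Two linked nodes that start too far apart move closer together.
far = force_step([(0.0, 0.0), (300.0, 0.0)], [(0, 1)])
# Two linked nodes that start too close move apart.
near = force_step([(0.0, 0.0), (50.0, 0.0)], [(0, 1)])
print(far[1][0] - far[0][0], near[1][0] - near[0][0])
```

Repeating such a step many times lets the nodes settle into positions where the pushes and pulls roughly balance, which is the effect we see when the graph spreads out in the browser.
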
In the image below we can see the result of the visualization. In order to run the visualization, we just need to open a command terminal in the folder that contains the HTML and JSON files and start a web server, either with the following Python command or with any other web server.

python -m http.server 8000

Then we just need to open a web browser and type the address http://localhost:8000/ForceGraph.html. In the HTML page we can see our Facebook graph with a gravity effect, and we can interactively drag and drop the nodes.

All the code and datasets of this article can be found in the author's GitHub repository at the link below.

https://github.com/hmcuesta/PDA_Book/tree/master/Chapter10

Summary

In this article we developed our own social graph visualization tool with D3.js, transforming the data obtained from Netvizz in GDF format into JSON.

About the Author :


Hector Cuesta

Hector Cuesta holds a B.A. in Informatics and an M.Sc. in Computer Science. He provides consulting services for software engineering and data analysis, with experience in a variety of industries including financial services, social networking, e-learning, and human resources.

He is a lecturer in the Department of Computer Science at the Autonomous University of Mexico State (UAEM). His main research interests lie in computational epidemiology, machine learning, computer vision, high-performance computing, big data, simulation, and data visualization.

He helped in the technical review of the books, Raspberry Pi Networking Cookbook by Rick Golden and Hadoop Operations and Cluster Management Cookbook by Shumin Guo for Packt Publishing. He is also a columnist at Software Guru magazine and he has published several scientific papers in international journals and conferences. He is an enthusiast of Lego Robotics and Raspberry Pi in his spare time.

You can follow him on Twitter at https://twitter.com/hmCuesta.
