You're reading from  Data Visualization with D3 and AngularJS

Product type: Book
Published in: Apr 2015
Reading level: Intermediate
ISBN-13: 9781784398484
Edition: 1st Edition
Authors (2):
Erik Hanchett

Erik Hanchett is a software developer, blogger, and perpetual student who has been writing code for over 10 years. He currently resides in Reno, Nevada, with his wife and two kids. He blogs about software development at ProgramWithErik.com. I would like to thank my wife Susan for helping me stay motivated, my friend F.B. Woods for all his help on the English language, and Dr. Bret Simmons for teaching me the value of a personal brand. I would also like to thank all my friends and family who encouraged me along the way.

Christoph Körner

Christoph Körner previously worked as a cloud solution architect for Microsoft, specializing in Azure-based big data and machine learning solutions, where he was responsible for designing end-to-end machine learning and data science platforms. He currently works for a large cloud provider on highly scalable distributed in-memory database services. Christoph has authored four books: Deep Learning in the Browser for Bleeding Edge Press, as well as Mastering Azure Machine Learning (first edition), Learning Responsive Data Visualization, and Data Visualization with D3 and AngularJS for Packt Publishing.

Chapter 3. Manipulating Data

In this chapter, you will learn how to manipulate data in order to preprocess it for visualization and to extract statistical information.

We will start by discussing arrays and array functions in general, because arrays are the canonical representation of data in D3.js. The techniques presented for array manipulation form a basic toolset for extracting the data relevant to a visualization and for transforming and adapting the structure of flat datasets.

In the following section, we will look at useful string formatting techniques. You will learn how to format numbers on the one hand and dates and times on the other.

Then, we will discuss scales for numbers, strings, and times in order to map datasets to specific ranges, for example, to linear, logarithmic, or time ranges.

In the last section, we will see the built-in representation for axes in D3.js. With the previously seen techniques, we will be able to construct axes that automatically scale and format the data according...

Manipulating datasets in arrays


In data visualizations, we usually do not display the raw data itself, but rather aggregate and preprocess the data beforehand. Let me give you an example. You are given the access log from a web server that stores every single visitor with their IP address and user agent (a string that contains information about the browser). Rather than plotting each of these rows on its own, you may want to sum up all visitors per minute (or per day) and plot a time series histogram of these sums. Or maybe you want to group the data by different properties, for example, plotting visitors from Europe versus visitors from North America. It is important to know that your ability to plot this aggregated data depends directly on your ability to manipulate datasets.
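The access-log example can be sketched in plain JavaScript. This is only an illustration: the rows, the field names (time, ip, region), and the countBy helper are assumptions made up for the sketch and do not come from D3.js.

```javascript
// Hypothetical access-log rows; the field names are assumptions for this sketch.
var accessLog = [
  { time: '2015-04-01T10:00:05Z', ip: '203.0.113.7', region: 'Europe' },
  { time: '2015-04-01T10:00:42Z', ip: '198.51.100.23', region: 'North America' },
  { time: '2015-04-01T10:01:10Z', ip: '203.0.113.9', region: 'Europe' },
  { time: '2015-04-01T10:01:30Z', ip: '192.0.2.44', region: 'Europe' }
];

// Hypothetical helper: counts the rows that share the key returned by keyFn.
function countBy(rows, keyFn) {
  return rows.reduce(function (counts, row) {
    var key = keyFn(row);
    counts[key] = (counts[key] || 0) + 1;
    return counts;
  }, {});
}

// Truncate the ISO timestamp to minute precision, for example "2015-04-01T10:00".
var visitorsPerMinute = countBy(accessLog, function (row) {
  return row.time.slice(0, 16);
});

// Group by a different property of the same rows.
var visitorsPerRegion = countBy(accessLog, function (row) {
  return row.region;
});
```

The same raw rows yield two different aggregated datasets, one keyed by minute and one keyed by region, each of which could feed a separate chart.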

Most visualizations are backed by data that is stored in arrays. These datasets, for example, consist of simple arrays, associative arrays, or even nested maps. In many cases, the data that we want to visualize...

Formatting numbers and dates


In visualizations, we often need to label our data properly and make the values easy to read. Floating-point divisions often return long decimal numbers that do not need to be displayed down to the last decimal place. When displaying time series data, we often want to customize the labels so that they display only, for example, the current day, month, or year. You will first learn about number formats in D3.js and then take a look at date and time formatting.

Specifying a number format

To create a custom number formatting function (one that formats a number as a string), we use the d3.format(specifier) helper function. The specifier argument describes the format of the output. The call returns a function that takes a number as an argument and returns the formatted string.

The specifier has the following form:

[[fill]align][sign][symbol][0][width][,][.precision][type]
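To illustrate the pattern of a specifier going in and a formatting function coming out, here is a minimal plain-JavaScript sketch. The miniFormat function is a hypothetical stand-in, not part of D3.js, and it understands only a tiny subset of the specifier: an optional "," for thousands grouping, a mandatory ".precision", and the type "f".

```javascript
// Hypothetical stand-in for a small subset of the specifier grammar:
// [,][.precision]f  (everything else is rejected).
function miniFormat(specifier) {
  var match = /^(,)?\.(\d+)f$/.exec(specifier);
  if (!match) {
    throw new Error('Unsupported specifier: ' + specifier);
  }
  var useGrouping = match[1] === ',';
  var precision = parseInt(match[2], 10);

  // Return the formatting function, mirroring how a formatter is created
  // once and then applied to many values.
  return function (value) {
    var fixed = value.toFixed(precision);
    if (!useGrouping) {
      return fixed;
    }
    // Insert a comma before every group of three digits in the integer part.
    var parts = fixed.split('.');
    parts[0] = parts[0].replace(/\B(?=(\d{3})+(?!\d))/g, ',');
    return parts.join('.');
  };
}

var twoDecimals = miniFormat('.2f');
var grouped = miniFormat(',.2f');

var shortPi = twoDecimals(3.14159);        // "3.14"
var bigNumber = grouped(1234567.891);      // "1,234,567.89"
```

The formatter is built once and reused, which is the same usage pattern as the function returned by d3.format(specifier).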

Normally, we would first define...

Working with scales


In data visualization, we always have to map our dataset to a specific range of pixels. Let me give you an example. We have a dataset [0, 2, 4, 6, 8, 10] that we want to display on mobile and desktop devices. On mobile devices, we want to display the dataset with a total width of 480px and on desktops with a width of 1024px. In order to draw the dataset on these pixel ranges, we need to map it to them. We can see this example in the following figure:

Mapping the dataset to a pixel range

D3.js provides a very useful tool to map a dataset to a certain range of pixels: d3.scale. In D3.js, we call the mapping function the scale, the slice of the dataset that we want to map the domain, and the pixel range onto which we want to map the dataset the range. In the following figure, we can see a dataset where only the positive values are mapped to the width of the axis:

Visualizing a scale
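The mapping from a domain to a range can be sketched in plain JavaScript. The linearScale function below is a hypothetical stand-in written for this illustration, not D3's implementation; with D3.js itself, the corresponding object would be built with d3.scale.linear() and its domain() and range() methods.

```javascript
// Hypothetical stand-in for a linear scale: maps a value from the input
// domain [d0, d1] to the output range [r0, r1] by linear interpolation.
function linearScale(domain, range) {
  var d0 = domain[0], d1 = domain[1];
  var r0 = range[0], r1 = range[1];
  return function (value) {
    var t = (value - d0) / (d1 - d0); // relative position within the domain
    return r0 + t * (r1 - r0);        // same relative position within the range
  };
}

var dataset = [0, 2, 4, 6, 8, 10];
var domain = [0, 10];

// One scale per target width: 480px for mobile, 1024px for desktop.
var mobileScale = linearScale(domain, [0, 480]);
var desktopScale = linearScale(domain, [0, 1024]);

// Pixel positions of every data point on each device.
var mobilePositions = dataset.map(function (d) { return mobileScale(d); });
var desktopPositions = dataset.map(function (d) { return desktopScale(d); });

// The midpoint of the domain lands on the midpoint of each range:
// mobileScale(5) is 240 and desktopScale(5) is 512.
```

Only the range changes between the two devices; the dataset and the domain stay the same, which is what makes a scale convenient for responsive charts.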

In order to represent different data types with scales, we distinguish between...

All about axes


Until now, we have only scaled our dataset without drawing a single shape on the screen. For the next step, I want to introduce d3.svg.axis(), a built-in function to draw axes and labels. This function makes it easy to add an axis to a chart, as shown in the following code:

var axis = d3.svg.axis();

First, we create a new axis object with d3.svg.axis(), which we can then configure by calling different methods on it. I will now discuss the most important of these methods:

  • axis.scale([scale]): This sets the scale that the axis uses, as follows:

    var scale = d3.scale.linear()
      .domain([0, 10])
      .range([0, 100]);
    
    var axis = d3.svg.axis()
      .scale(scale);

  • axis.orient([orientation]): This specifies the orientation of the tick values relative to the axis. The orientation can be top, bottom, left, or right:

    var axis = d3.svg.axis()
      .orient('bottom');

  • axis.ticks([arguments…]): This specifies the tick number or interval relative to the given scale, as shown in the following code...
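
Independently of the listings referred to above, the following conceptual sketch in plain JavaScript shows the kind of information an axis has to derive from a scale: a set of tick values, their pixel positions, and their labels. The computeTicks helper is hypothetical and deliberately simplified. It spaces the ticks evenly between the ends of the domain, whereas D3.js chooses its tick values itself, so the output of a real axis may differ.

```javascript
// Hypothetical helper: returns count + 1 evenly spaced tick descriptors
// between the two ends of the domain. Each descriptor holds the data value,
// the pixel position in the range, and a formatted label.
function computeTicks(domain, range, count) {
  var ticks = [];
  for (var i = 0; i <= count; i++) {
    var t = i / count; // relative position along the axis
    var value = domain[0] + t * (domain[1] - domain[0]);
    ticks.push({
      value: value,
      position: range[0] + t * (range[1] - range[0]),
      label: value.toFixed(1)
    });
  }
  return ticks;
}

// Same domain and range as the scale used in the axis.scale() listing.
var ticks = computeTicks([0, 10], [0, 100], 5);

// Print one line per tick: label and pixel position.
ticks.forEach(function (tick) {
  console.log(tick.label + ' at ' + tick.position + 'px');
});
```

An axis component combines exactly these three ingredients: the scale supplies the positions, the tick settings supply the values, and a number format supplies the labels.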

Summary


In this chapter, I explained the usage of the most important statistical functions (such as d3.min() and d3.max()), and we saw them applied in the last example. They are useful for resizing the axes and the chart automatically when values in the dataset fall outside the current range of the axis.
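
As a reminder of the idea, the following plain-JavaScript sketch grows an axis domain when new values fall outside it. The minMax helper and the sample values are assumptions for illustration; in D3.js, d3.min() and d3.max() would supply the two bounds.

```javascript
// Hypothetical helper: returns [smallest, largest] for an array of numbers.
function minMax(values) {
  return [Math.min.apply(null, values), Math.max.apply(null, values)];
}

var currentDomain = [0, 10];
var incomingValues = [3, 7, 12, -2]; // sample data, partly outside the domain

var bounds = minMax(incomingValues);

// Widen the domain only as far as the data requires; never shrink it.
var newDomain = [
  Math.min(currentDomain[0], bounds[0]),
  Math.max(currentDomain[1], bounds[1])
];
// newDomain is [-2, 12]
```

Feeding the widened domain back into the scale is what lets the axis and the chart rescale automatically.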

We also discussed array manipulation functions, which help us to modify, structure, and preprocess the data for the visualization. In the first section, we also saw an example of d3.nest(), which groups elements into an associative array by their keys across multiple hierarchy levels.

You learned how to format number values and convert them to strings with d3.format(). The specifier defines how the formatter parses the values and formats them in different data types, currencies, and alignments.

Then, we introduced scales as a way to map an input domain to an output range. We saw linear scales for quantitative and ordinal scales as well as time scales, which are basically linear scales with JavaScript date...
