R Data Visualization Cookbook: Over 80 recipes to analyze data and create stunning visualizations with R


Product Details


Publication date : Jan 29, 2015
Length 236 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781783989508

Chapter 1. A Simple Guide to R

In this chapter, we will cover the following recipes:

  • Installing packages and getting help in R

  • Data types in R

  • Special values in R

  • Matrices in R

  • Editing a matrix in R

  • Data frames in R

  • Editing a data frame in R

  • Importing data in R

  • Exporting data in R

  • Writing a function in R

  • Writing if else statements in R

  • Basic loops in R

  • Nested loops in R

  • The apply, lapply, sapply, and tapply functions

  • Using par to beautify a plot in R

  • Saving plots

Installing packages and getting help in R


If you are a new user and have never launched R, start by learning how to use install.packages() and library(), and how to get help in R. R comes loaded with some basic packages, but the R community is growing rapidly and active R users are constantly developing new packages.

As you read through this cookbook, you will observe that we have used a lot of packages to create different visualizations. So the question now is, how do we know what packages are available in R? In order to keep myself up-to-date with all the changes that are happening in the R community, I diligently follow these blogs:

  • R-bloggers

  • RStudio blog

There are many blogs, websites, and posts that I will refer to as we go through the book. We can view a list of all the packages available in R at http://cran.r-project.org/; http://www.inside-r.org/packages also provides a list, along with a short description of each package.

Getting ready

We can start by launching RStudio, which is an Integrated Development Environment (IDE) for R. If you have not downloaded RStudio, I highly recommend going to http://www.rstudio.com/ and downloading it.

How to do it…

To install a package in R, we will use the install.packages() function. Once we install a package, we will have to load the package in our active R session; if not, we will get an error. The library() function allows us to load the package in R.
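The two steps above can be sketched as follows. The helper name load_or_install is hypothetical (it is not part of R or of the book's code); it installs a package only when it is missing and then loads it. The stats package is used in the call because it ships with R, so nothing is downloaded.

```r
# Hypothetical helper (an assumption for illustration): install a package
# only if it is not already available, then load it into the session.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # needs an internet connection and a CRAN mirror
  }
  # character.only = TRUE lets library() accept a name stored in a variable
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

load_or_install("stats")      # "stats" ships with R, so no download happens
# load_or_install("plotrix")  # a contributed package; installs it on first use
```

In everyday use, typing install.packages("plotrix") once and library(plotrix) at the start of each session achieves the same result.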

How it works…

The install.packages() function comes with some additional arguments but, for the purpose of this book, we will only use the first argument, that is, the name of the package. We can also install multiple packages at once by using install.packages(c("plotrix", "RColorBrewer")). The name of the package is the only argument we will use in the library() function. Note that, unlike install.packages(), the library() function can only load one package at a time.

There's more…

It is hard to remember all the functions and their arguments in R, unless we use them all the time, and we are bound to get errors and warning messages. The best way to learn R is to use the active R community and the help manual available in R.

To understand any function in R or to learn about its arguments, we can type ?<name of the function>. For example, I can learn about all the arguments related to the plot() function by simply typing ?plot or ?plot() in the R console window. In RStudio, the help page then opens in the pane on the right side of the screen. We can also learn more about the behavior of the function using some of the examples at the bottom of the help page.
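Beyond the ? shortcut, a few other base R helpers are handy when exploring a function. The following is a small sketch, with the interactive commands left as comments so that the block also runs in a script:

```r
# ?plot and help("plot") open the same help page (interactive use)
# example(plot)                      # runs the examples from the help page
# help.search("standard deviation")  # full-text search; same as ??"standard deviation"

args(matrix)                # prints the argument list of matrix()
arg_names <- names(formals(matrix))
arg_names                   # the argument names as a character vector

# apropos() lists objects whose names match a pattern
read_funs <- apropos("^read\\.")
read_funs
```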

If we are still unable to understand the function or its use and implementation, we could go to Google and type the question or use the Stack Overflow website. I am always able to resolve my errors by searching on the Internet. Remember, every problem has a solution, and the possibilities with R are endless.

Data types in R


Everything in R is in the form of objects. Objects can be manipulated in R. Some of the common objects in R are numeric vectors, character vectors, complex vectors, logical vectors, and integer vectors.

How to do it…

In order to generate a numeric vector in R, we can use the c() notation to specify it as follows:

x = c(1:5) # Numeric Vector

To generate a character vector, we can specify the same within quotes (" ") as follows:

y = "I am Home" # Character Vector

To generate a complex vector, we can use the i notation as follows:

cplx = c(1+3i) # Complex vector (named cplx to avoid reusing the name of the c() function)

A list can hold elements of different types, such as a numeric vector and a character vector, and can be specified using the list() notation:

z = list(c(1:5),"I am Home") # List
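To check which type of object we have created, we can use the class() and str() functions. This is a short sketch; the variable name cplx is chosen here for illustration.

```r
x <- c(1:5)                       # the colon operator produces whole numbers
y <- "I am Home"
cplx <- c(1+3i)
z <- list(c(1:5), "I am Home")

class(x)            # "integer" (a kind of numeric vector)
class(c(1.5, 2))    # "numeric"
class(y)            # "character"
class(cplx)         # "complex"
class(z)            # "list"

is.numeric(x)       # TRUE: integer vectors count as numeric
str(z)              # compact summary of each element of the list
```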

Special values in R


R comes with some special values. Some of the special values in R are NA, Inf, -Inf, and NaN.

How to do it…

The missing values are represented in R by NA. When we download data, it may have missing data and this is represented in R by NA:

z = c( 1,2,3, NA,5,NA) # NA in R is missing Data

To detect missing values, we can use the complete.cases() function or is.na(), as shown:

complete.cases(z) # function to detect NA
is.na(z) # function to detect NA

To remove the NA values from our data, we can type the following in our active R session console window:

clean <- complete.cases(z)
z[clean] # used to remove NA from data

Please note the use of square brackets ([ ]) instead of parentheses.

In R, not a number is abbreviated as NaN. The following lines will generate NaN values:

##NaN
0/0
m <- c(2/3,3/3,0/0)
m

The is.finite, is.infinite, or is.nan functions will generate logical values (TRUE or FALSE).

is.finite(m)
is.infinite(m)
is.nan(m)

The following line will generate Inf as a special value in R:

## infinite
k = 1/0

How it works…

complete.cases(z) is a logical vector indicating complete cases that have no missing value (NA). On the other hand, is.na(z) indicates which elements are missing. In both cases, the argument is our data, a vector, or a matrix.

R also allows its users to check if any element in a matrix or a vector is NA by using the anyNA() function. We can coerce or assign NA to any element of a vector using the square brackets ([ ]). The [3] input instructs R to assign NA to the third element of the dk vector.
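The anyNA() check and the NA assignment described above can be sketched as follows. The values stored in dk are an assumption made for illustration.

```r
dk <- c(10, 20, 30, 40)   # assumed example values

anyNA(dk)                 # FALSE: no missing values yet

dk[3] <- NA               # assign NA to the third element
dk

anyNA(dk)                 # TRUE
which(is.na(dk))          # position of the missing value: 3

# Many summary functions return NA unless told to skip missing values
mean(dk)                  # NA
mean(dk, na.rm = TRUE)    # mean of the remaining values
```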

Matrices in R


In this recipe, we will dive into R's capability with regard to matrices.

How to do it…

A vector in R is defined using the c() notation as follows:

vec = c(1:10)

A vector is a one-dimensional array. A matrix is a two-dimensional array. We can define a matrix in R using the matrix() function. Alternatively, we can also coerce a set of values to be a matrix using the as.matrix() function:

mat = matrix(c(1,2,3,4,5,6,7,8,9,10),nrow = 2, ncol = 5)
mat

To generate a transpose of a matrix, we can use the t() function:

t(mat) # transpose a matrix

In R, we can also generate an identity matrix using the diag() function:

d = diag(3) # generate an identity matrix

We can nest the rep() function within matrix() to generate a matrix with all zeroes as follows:

zro = matrix(rep(0,6),ncol = 2,nrow = 3 )# generate a matrix of Zeros
zro

How it works…

We can define our data in the matrix() function by specifying our data as its first argument. The nrow and ncol arguments are used to specify the number of rows and columns in a matrix. The matrix() function comes with other useful arguments, which can be studied by typing ?matrix in the R command window.

The rep() function nested in the matrix() function is used to repeat a particular value or character string a certain number of times.

The diag() function can be used to generate an identity matrix as well as extract the diagonal elements of a matrix. More uses of the diag() function can be explored by typing ?diag in the R console window.

The code file provides a lot more functions that can be used along with matrices, for example, functions related to finding a determinant or inverse of a matrix and matrix multiplication.
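The book's code file is not reproduced here, but the operations it mentions can be sketched with base R functions: det() for the determinant, solve() for the inverse, and %*% for matrix multiplication. The matrix a is an assumed example.

```r
a <- matrix(c(2, 1, 1, 3), nrow = 2)   # rows are (2, 1) and (1, 3)
a

det(a)              # determinant: 2*3 - 1*1 = 5
a_inv <- solve(a)   # inverse of a
a_inv

a %*% a_inv         # matrix multiplication: identity matrix, up to rounding
a * a               # element-wise multiplication, not a matrix product
diag(a)             # diagonal elements: 2 3
```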

Editing a matrix in R


R allows us to edit (add, delete, or replace) elements of a matrix using the square bracket notation, as depicted in the following lines of code:

mat = matrix(c(1:10),nrow = 2, ncol = 5)
mat
mat[2,3]

How to do it…

In order to extract any element of a matrix, we can specify the position of that element in R using square brackets. For example, mat[2,3] will extract the element under the second row and the third column. The first numeric value corresponds to the row and the second numeric value corresponds to a column [row, column].

Similarly, to replace an element, we can type the following lines in R:

mat[2,3] = 16

To select all the elements of the second row, we can use mat[2, ]. If we do not specify any numeric value for a column, R will automatically assume all columns.
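The same square bracket notation, together with cbind() and rbind(), covers adding and deleting rows or columns. This is a short sketch using the matrix from this recipe; the added values are arbitrary.

```r
mat <- matrix(c(1:10), nrow = 2, ncol = 5)

mat[, 3]                       # all rows of the third column: 5 6

mat <- mat[, -1]               # delete the first column (now 2 x 4)
mat <- cbind(mat, c(11, 12))   # add a column (now 2 x 5)
mat <- rbind(mat, 0)           # add a row of zeros; 0 is recycled (now 3 x 5)

dim(mat)                       # 3 5
mat
```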

Data frames in R


One of the most useful and widely used functions in R is the data.frame() function. A data frame, according to the R manual, is a matrix-like structure whose columns can be of differing types, such as numeric, logical, factor, or character.

How to do it…

A data frame in R is a collection of variables. A simple way to construct a data frame is using the data.frame() function in R:

data = data.frame(x = c(1:4), y = c("tom","jerry","luke","brian"))
data

Many times, we will encounter plotting functions that require data to be in a data frame. In order to coerce our data into a data frame, we can use the data.frame() function. In the following example, we create a matrix and convert it into a data frame:

mat = matrix(c(1:10), nrow = 2, ncol = 5)
data.frame(mat)

The data.frame() function comes with various arguments and can be explored by typing ?data.frame in the R console window. The code file under the title Data Frames – 2 provides additional functions that can help in understanding the underlying structure of our data. We can always get additional help by using the R documentation.
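The book's code file is not shown here, but a few base R functions commonly used to inspect the structure of a data frame can be sketched as follows:

```r
data <- data.frame(x = c(1:4), y = c("tom", "jerry", "luke", "brian"))

str(data)       # type of each column and the first few values
dim(data)       # number of rows and columns: 4 2
nrow(data)      # 4
ncol(data)      # 2
names(data)     # column names: "x" "y"
head(data, 2)   # first two rows
summary(data)   # per-column summary
```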

Editing a data frame in R


Once we have generated data and converted it into a data frame, we can edit any row or column of the data frame.

How to do it...

We can add or extract any column of a data frame using the dollar ($) symbol, as depicted in the following code:

data = data.frame(x = c(1:4), y = c("tom","jerry","luke","brian"))
data$age = c(2,2,3,5)
data

In the preceding example, we have added a new column called age using the $ operator. Alternatively, we can also add columns and rows using the cbind() and rbind() functions in R, respectively. For example, cbind() adds a column as follows:

age = c(2,2,3,5)
data = cbind(data, age)

The cbind and rbind functions can also be used to add columns or rows to an existing matrix.
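Adding a row with rbind() works in a similar way, as long as the new row has the same column names as the data frame. The new row's values below are made up for illustration.

```r
data <- data.frame(x = c(1:4), y = c("tom", "jerry", "luke", "brian"))

# A one-row data frame with matching column names (hypothetical values)
new_row <- data.frame(x = 5, y = "anna")

data <- rbind(data, new_row)   # append the row at the bottom
data
nrow(data)                     # 5
```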

To remove a column or a row from a matrix or data frame, we can simply use the negative sign before the column or row to be deleted, as follows:

data = data[,-2]

The data[,-2] line will delete the second column from our data.

To re-order the columns of a data frame, we can type the following lines in the R command window:

data = data.frame(x = c(1:4), y = c("tom","jerry","luke","brian"))
data = data[c(2,1)]# will reorder the columns
data

To view the column names of a data frame, we can use the names() function:

names(data)

To rename our column names, we can use the colnames() function:

colnames(data) = c("Number","Names")

Importing data in R


Data comes in various formats. Most of the data available online can be downloaded in the form of text documents (.txt extension) or as comma-separated values (.csv). We also encounter data in the tab-delimited format, XLS, HTML, JSON, XML, and so on. If you are interested in working with data, either in JSON or XML, refer to the recipe Constructing a bar plot using XML in R in Chapter 10, Creating Applications in R.

How to do it...

In order to import a CSV file in R, we can use the read.csv() function:

test = read.csv("raw.csv", sep = ",", header = TRUE)

Alternatively, the read.table() function allows us to import data with different separators and formats.

How it works…

The first argument in the read.csv() function is the filename, followed by the separator used in the file. The header = TRUE argument is used to instruct R that the file contains headers. Please note that R will search for this file in its current working directory. If the file is stored elsewhere, we can point R to the directory containing the file using the setwd() function. Alternatively, in RStudio, we can set our working directory by navigating to Session | Set Working Directory | Choose Directory.

The first argument in the read.table() function is the filename that contains the data, the second argument states that the data contains the header, and the third argument is related to the separator. If our data uses a semicolon (;), a tab, or the @ symbol as a separator, we can specify this under the sep = "" argument. Note that, to specify a tab as the separator, users would have to substitute sep = "," with sep = "\t" in the read.table() function.

One of the other useful arguments is the row.names argument. If we omit row.names, R will number the rows automatically. We can use one of the columns as row names by specifying it, for example, row.names = c("Name").
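The read.table() arguments described above can be tried out without any external data by writing a small tab-delimited file to a temporary location first. The file contents and column names below are made up for illustration.

```r
# Create a small tab-delimited file to read back (made-up contents)
tmp <- tempfile(fileext = ".txt")
writeLines(c("Name\tScore", "tom\t1", "jerry\t2"), tmp)

# filename first, then the header flag, then the separator
test <- read.table(tmp, header = TRUE, sep = "\t")
test

# Use the Name column as row names instead of automatic row numbers
by_name <- read.table(tmp, header = TRUE, sep = "\t", row.names = "Name")
rownames(by_name)          # "tom" "jerry"
by_name["jerry", "Score"]  # 2

unlink(tmp)                # remove the temporary file
```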

Exporting data in R


Once we have processed our data, we need to save it to an external device or send it to our colleagues. It is possible to export data in R in many different formats.

How to do it…

To export data from R, we can use the write.table() function. Please note that R will export the data to our current directory or the folder we have assigned using the setwd() function:

write.table(data, "mydata.csv", sep=",")

How it works…

The first argument in the write.table() function is the data in R that we would like to export. The second argument is the name of the file. We can give the file a .txt or .xls extension simply by replacing mydata.csv with mydata.txt or mydata.xls in the write.table() function. Note that the extension only changes the file name: the contents remain delimited text, with the delimiter set by the sep argument.
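A quick way to confirm that an export worked is to read the file back in. The sketch below writes to R's temporary directory rather than the working directory, and sets row.names = FALSE (an optional argument not used in the recipe) so that the row numbers are not written as an extra column.

```r
data <- data.frame(x = c(1:4), y = c("tom", "jerry", "luke", "brian"))

out <- file.path(tempdir(), "mydata.csv")   # temporary location for the example

# row.names = FALSE leaves out the automatic row numbers
write.table(data, out, sep = ",", row.names = FALSE)

back <- read.csv(out)   # read the exported file back in
back
```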

Writing a function in R


Most of the tasks in R are performed using functions. A function in R plays the same role as a function in mathematics: it takes inputs and returns a result.

Getting ready

In order to write a simple function in R, we must first open a new R script by navigating to File | New File | R Script.

How to do it…

We write a very simple function that accepts two values and adds them together. Copy and paste the code in the new blank R script:

add = function (x,y){
  x+y
}

How it works…

A function in R is defined using function(). Once we define our function, we can save it as a .R file. Naming the file after the function is a helpful convention rather than a requirement; here we save our function as add.R.

In order to use the add() function in the R command window, we need to source the file by using the source() function as follows:

source('<your path>/add.R')

Now, we can type add(2,15) in the R command window. You get 17 printed as an output.

The function itself takes two arguments in our recipe but, in reality, it can take many arguments. Anything defined inside curly braces gets executed when we call add(). In our case, we request the user to input two variables, and the output is a simple sum.
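As a small variation on the recipe, an argument can be given a default value so that it becomes optional. The default of 1 below is an arbitrary choice for illustration.

```r
# y defaults to 1 when the caller does not supply it
add <- function(x, y = 1) {
  x + y   # the value of the last expression is returned
}

add(2, 15)           # 17
add(2)               # 3, using the default y = 1
add(c(1, 2, 3), 10)  # works element by element: 11 12 13
```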

See also

  • Functions can be helpful in performing repetitive tasks such as generating plots or performing complicated calculations. Felix Schönbrodt has implemented visually weighted watercolor plots in R using a function on his blog at http://www.nicebread.de/visually-weighted-watercolor-plots-new-variants-please-vote/.

  • We can generate similar plots simply by copying the function created by Felix in our R session and executing it. The plotting function created by Felix also provides users with different ways in which the R function's ability could be leveraged to perform repetitive tasks.

Writing if else statements in R


We often use if statements in MS Excel, and we can write a short piece of code to perform the same kind of simple task in R.

How to do it…

The logic for if else statements is very simple and is as follows:

if(x>3){
  print("greater value")
}else {
  print("lesser value")
}

We can copy and paste the preceding statement in the R console or write a function that makes use of the if else logic.

How it works…

The logic behind if else statements is very simple. The following lines clearly state the logic:

if(condition){
#perform some action
}else {
  #perform some other action
}

The preceding code checks whether x is greater than 3 and prints "greater value" if it is; otherwise, it prints "lesser value". Note that x must be assigned a value before the if statement runs, for example, by first typing the following in the R command window:

x = 2
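The if else logic can also be wrapped in a function, as mentioned earlier in this recipe. The function name compare_to_three is hypothetical. For whole vectors, base R provides the vectorized ifelse() function.

```r
# Hypothetical function name, used only for this sketch
compare_to_three <- function(x) {
  if (x > 3) {
    "greater value"
  } else {
    "lesser value"   # also reached when x is exactly 3
  }
}

compare_to_three(2)   # "lesser value"
compare_to_three(5)   # "greater value"

# ifelse() applies the same test to every element of a vector
ifelse(c(1, 4, 7) > 3, "greater value", "lesser value")
```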

Basic loops in R


If we want to perform an action repeatedly in R, we can utilize the loop functionality.

How to do it…

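The following lines of code multiply the corresponding elements of x and y and store the results in a vector z:

x = c(1:10)
y = c(1:10)
z = numeric(10) # create z before the loop; otherwise R reports that z is not found
for(i in 1:10){
  z[i] = x[i]*y[i]
}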

How it works…

In the preceding code, a calculation is executed 10 times. R performs any calculation specified within {}. We are instructing R to multiply each element of x (using the x[i] notation) by the corresponding element of y and store the result in z.
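For this particular task, a loop is not strictly needed, because arithmetic in R is vectorized. The sketch below shows both approaches giving the same result; the names z_loop and z_vec are chosen for illustration.

```r
x <- c(1:10)
y <- c(1:10)

# Loop version: create the result vector first, then fill it in
z_loop <- numeric(length(x))
for (i in seq_along(x)) {
  z_loop[i] <- x[i] * y[i]
}

# Vectorized version: * multiplies element by element
z_vec <- x * y

all(z_loop == z_vec)   # TRUE
z_vec
```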

Nested loops in R


We can nest loops, as well as if statements, to perform some more complicated tasks. In this recipe, we will first define a square matrix and then write a nested for loop to print only those values where i = j, namely, the values in the matrix placed in (1,1), (2,2), and so on.

How to do it…

We first define a matrix in R using the following matrix() function:

mat= matrix(1:25, 5,5)

Now, we use the following code to output only those elements where i = j:

for (i in 1:5){
  for (j in 1:5){
    if (i ==j){
      print(mat[i,j])
    }
   }
}

The if statement is nested inside two for loops. Because a matrix has two dimensions, we need two for loops instead of just one. The output is the diagonal of the matrix: 1, 7, 13, 19, and 25.
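As a cross-check, the same values can be collected into a vector and compared with the built-in diag() function. This is a sketch; the variable name diag_values is an assumption.

```r
mat <- matrix(1:25, 5, 5)

diag_values <- c()                  # start with an empty vector
for (i in 1:5) {
  for (j in 1:5) {
    if (i == j) {
      diag_values <- c(diag_values, mat[i, j])   # keep the value at (i, i)
    }
  }
}

diag_values   # 1 7 13 19 25
diag(mat)     # the same values without any loop
```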

The apply, lapply, sapply, and tapply functions


R has some very handy functions, such as apply, lapply, sapply, tapply, and mapply, that can be used to reduce the task of writing complicated statements. Also, using them makes our code look cleaner. The apply() function is similar to writing a loop statement.

The lapply() function is very similar to the apply() function but can be used on lists; it returns a list. The sapply() function is very similar to lapply() but simplifies the result to a vector (or matrix) where possible instead of returning a list.

How to do it…

The apply() function can be used as follows:

mat= matrix(1:25, 5,5)
apply(mat,1,sd)

The lapply() function can be used in the following way:

j = list(x = 1:4, b = rnorm(100,1,2))
lapply(j,mean)

The tapply() function is useful when we have broken a vector into factors, groups, or categories:

tapply(mtcars$mpg,mtcars$gear,mean)

How it works…

The first argument in the apply() function is the data. The second argument takes the value 1 or 2: if we state 1, R will perform a row-wise computation; if we state 2, R will perform a column-wise computation. The third argument is the function. We would like to calculate the standard deviation of each row; hence we use the sd function as the third argument. Note that we can define our own function and use it in place of the sd function.

With regard to the lapply() function, we have defined j as a list and would like to calculate the mean of each of its elements. The first argument in the lapply() function is the data and the second argument is the function used to process the data.

The first argument in the tapply() function is the data; in our case it is mpg. The second argument is the factor or the grouping; in this case it is gear. The last argument is the function used to process the data. We would like to calculate the mean of mpg for each unique gear (3, 4, and 5 gears) in the mtcars data.
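The recipe does not show sapply() or a user-defined function in use, so here is a brief sketch of both. The list values are fixed (rather than random) so that the results are easy to check by hand.

```r
j <- list(x = 1:4, b = c(2, 4, 6))   # fixed example values

lapply(j, mean)              # a list with one mean per element
means <- sapply(j, mean)     # the same means as a named vector
means                        # x = 2.5, b = 4

# A user-defined function in place of sd: the range of each column
mat <- matrix(1:25, 5, 5)
col_range <- apply(mat, 2, function(col) max(col) - min(col))
col_range                    # 4 4 4 4 4
```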

Using par to beautify a plot in R


One quick and easy way to edit a plot is by generating the plot in R and then using Inkscape or any other software to edit it. We can save some valuable time if we know some basic edits that can be applied on a plot by setting them in a par() function. All the available options to edit a plot can be studied in detail by typing ?par in the command window.

How to do it…

In the following code, I have highlighted some commonly used parameters:

x=c(1:10)
y=c(1:10)
par(bg = "#646989", las = 1, col.lab = "black", col.axis = "white",bty = "n",cex.axis = 0.9,cex.lab= 1.5)
plot(x,y, pch = 20, xlab = "fake x data", ylab = "fake y data")

How it works…

Under the par() function, we have set the background color using the bg = argument. The las = argument changes the orientation of the axis labels. The col.lab and col.axis arguments are used to specify the color of the labels and of the axis annotation. The cex.lab and cex.axis arguments are used to specify the size of the labels and of the axis annotation. The bty argument is used to specify the box style in R.
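Settings made with par() stay in effect for later plots on the same device. One common pattern, sketched below, is to keep the previous settings that par() returns and restore them afterwards. The name old_par is an assumption.

```r
x <- c(1:10)
y <- c(1:10)

# par() returns the previous values of the settings it changes
old_par <- par(bg = "#646989", las = 1, col.axis = "white", bty = "n")
plot(x, y, pch = 20, xlab = "fake x data", ylab = "fake y data")

par(old_par)   # restore the earlier settings
plot(x, y)     # drawn with the restored settings

par("las")     # 0 again, assuming the default was in place before
```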

Saving plots


We can save a plot in various formats, such as .jpeg, .svg, .pdf, or .png. I prefer saving a plot as a .png file, as it is easier to edit a plot with Inkscape if saved in the PNG format.

How to do it…

To save a plot in the .png format, we can use the png() function as follows:

png("TEST.png", width = 300, height = 600)
plot(x,y, xlab = "x axis", ylab = "y axis", cex.lab = 3,col.lab = "red", main = "some data", cex.main=1.5, col.main = "red")
dev.off()

How it works…

We have used the png() function to save the plot as a PNG. To save a plot as a PDF, SVG, or JPEG, we can use the pdf(), svg(), or jpeg() functions, respectively.

The first argument in the png() function is the name of the file with the extension, followed by the width and height of the plot. We can now use the plot() function to generate a plot; anything we draw is sent to the PNG file rather than to the screen until dev.off() is called. The dev.off() function closes the device and finishes writing the file.
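The same open, plot, close pattern applies to the other devices. Below is a sketch using pdf(), writing to R's temporary directory so that the working directory is left untouched. Note that pdf() measures width and height in inches, whereas png() uses pixels by default.

```r
x <- c(1:10)
y <- c(1:10)

out <- file.path(tempdir(), "TEST.pdf")   # temporary location for the example

pdf(out, width = 5, height = 7)           # open the PDF device (size in inches)
plot(x, y, xlab = "x axis", ylab = "y axis", main = "some data")
dev.off()                                 # close the device and write the file

file.exists(out)                          # TRUE once the device is closed
```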

What you will learn

  • Generate various plots in R using the basic R plotting techniques

  • Utilize R packages to add context and meaning to your data

  • Design interactive visualizations and integrate them on your website or blog

  • Communicate using visualization techniques, optimal for the underlying data being used as input

  • Create presentations and learn the basics of creating apps in R for your audience

  • Introduce users to basic R functions and data manipulation techniques while creating meaningful visualizations

  • Add elements, text, animation, and colors to your plot to make sense of data


Table of Contents

17 Chapters

  • R Data Visualization Cookbook

  • Credits

  • About the Author

  • About the Reviewers

  • www.PacktPub.com

  • Preface

  • A Simple Guide to R

  • Basic and Interactive Plots

  • Heat Maps and Dendrograms

  • Maps

  • The Pie Chart and Its Alternatives

  • Adding the Third Dimension

  • Data in Higher Dimensions

  • Visualizing Continuous Data

  • Visualizing Text and XKCD-style Plots

  • Creating Applications in R

  • Index
