Graphical Capabilities of R

by John M. Quick | October 2010 | Open Source

The R Project for Statistical Computing (or just R for short) is a powerful data analysis tool. It is both a programming language and a computational and graphical environment.

R is free, open source software made available under the GNU General Public License. It runs on Mac, Windows, and Unix operating systems.

The official R website is available at the following site:

http://www.r-project.org

In the previous article by John M. Quick, author of the book Statistical Analysis with R, we learned how to create charts, graphs, and plots in R. We also took a look at customizing graphics in R.

In this article, you will learn how to:

  • Create different charts, graphs, and plots in R
  • Save and export your graphics for use outside of R


Time for action — creating a line chart

The ever popular line chart, or line graph, depicts relationships as continuous series of connected data points. Line charts are particularly useful for visualizing specific values and trends over time. Just as a line chart is an extension of a scatterplot in the non-digital realm, a line chart is created using an extended form of the plot(...) function in R. Let us explore how to extend the plot(...) function to create line charts in R:

  1. Use the type argument within the plot(...) function to create a line chart that depicts a single relationship between two variables:

    > #create a line chart that depicts the durations of past fire attacks
    > #get the data to be used in the chart
    > lineFireDurationDataX <- c(1:30)
    > lineFireDurationDataY <- subsetFire$DurationInDays
    > #customize the chart
    > lineFireDurationMain <- "Duration of Past Fire Attacks"
    > lineFireDurationLabX <- "Battle Number"
    > lineFireDurationLabY <- "Duration in Days"
    > #use the type argument to connect the data points with a line
    > lineFireDurationType <- "o"
    > #use plot(...) to create and display the line chart
    > plot(x = lineFireDurationDataX, y = lineFireDurationDataY,
    main = lineFireDurationMain, xlab = lineFireDurationLabX,
    ylab = lineFireDurationLabY, type = lineFireDurationType)

  2. Your chart will be displayed in the graphic window.

What just happened?

We expanded our use of the plot(...) function to generate a line chart and encountered a new data notation in the process. Let us review these features.

type

In the plot(...) function, the type argument determines what kind of line, if any, should be used to connect a chart's data points. The type argument receives one of several character values, all of which are listed as follows:

  • p: only points are plotted; this is the default value when type is undefined
  • l: only lines are drawn, without any points
  • o: both lines and points are drawn, with the lines overlapping the points
  • b: both lines and points are drawn, with the lines broken where they intersect with points
  • c: only lines are drawn, but they are broken where points would occur
  • s: only lines are drawn, in step formation; each step moves horizontally first, then vertically
  • S: (uppercase) only lines are drawn, in step formation; each step moves vertically first, then horizontally
  • h: histogram-like vertical lines are drawn, one for each point
  • n: neither points nor lines are drawn
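
To compare these values side by side, the following sketch draws the same data once for each type value. The exampleDurations vector is made up for illustration and merely stands in for subsetFire$DurationInDays, which is not reproduced in this article.

```r
# hypothetical stand-in for subsetFire$DurationInDays
exampleDurations <- c(4, 7, 3, 9, 6, 8, 5, 10, 2, 7)
# generic battle numbers, one per duration
exampleBattles <- seq_along(exampleDurations)
# the nine values accepted by the type argument
lineTypes <- c("p", "l", "o", "b", "c", "s", "S", "h", "n")
# split the graphic window into a 3 x 3 grid and remember the old settings
oldPar <- par(mfrow = c(3, 3))
# draw the same data once per type value
for (lineType in lineTypes) {
  plot(x = exampleBattles, y = exampleDurations, type = lineType,
       main = paste("type =", lineType),
       xlab = "Battle Number", ylab = "Duration in Days")
}
# restore the previous window layout
par(oldPar)
```

Placing the "s" and "S" panels next to each other makes the difference between the two step formations easy to see.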

Our chart, which represented the duration of past fire attacks, featured a line that overlapped the plotted points. First, we defined our desired line type in an R variable:

> lineFireDurationType <- "o"

Then the type argument was placed within our plot(...) function to generate the line chart:

> plot(lineFireDurationDataX, lineFireDurationDataY,
main = lineFireDurationMain, xlab = lineFireDurationLabX,
ylab = lineFireDurationLabY,
type = lineFireDurationType)

Number-colon-number notation

You may have noticed that we specified a vector for the x-axis data in our plot(...) function.

> lineFireDurationDataX <- c(1:30)

This vector used number-colon-number notation. Essentially, this notation has the effect of enumerating a range of values that lie between the number that precedes the colon and the number that follows it. To do so, it adds one to the beginning value until it reaches a final value that is equal to or less than the number that comes after the colon. For example, the code > 14:21 would yield eight whole numbers, beginning with 14 and ending with 21, as follows:

[1] 14 15 16 17 18 19 20 21

Furthermore, the code > 14.2:21 would yield seven values, beginning with 14.2 and ending with 20.2, as follows:

[1] 14.2 15.2 16.2 17.2 18.2 19.2 20.2

Number-colon-number notation is a useful way to enumerate a series of values without having to type each one individually. It can be used in any circumstance where a series of values is acceptable input into an R function.

Number-colon-number notation can also enumerate values from high to low. For instance, 21:14 would yield a sequence of values beginning with 21 and ending with 14.
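
The three cases described above can be checked directly in the console. The variable names in this short sketch are illustrative only; the seq(...) call at the end is an addition that shows one way to use a step other than one.

```r
# whole numbers from low to high
ascending <- 14:21
length(ascending)    # 8 values, from 14 to 21

# a fractional start: one is added until the next value would exceed 21
fractional <- 14.2:21
length(fractional)   # 7 values, from 14.2 to 20.2

# high to low
descending <- 21:14
descending[1]        # 21

# seq(...) accepts a step size when a step of one is not wanted
byTwo <- seq(from = 14, to = 21, by = 2)
byTwo                # 14 16 18 20
```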

Since we do not have exact dates or other identifying information for our 30 past battles, we simply enumerated the numbers 1 through 30 on the x-axis. This had the effect of assigning a generic identification number to each of our past battles, which in turn allowed us to plot the duration of each battle on the y axis.

Pop quiz

  1. Which of the following is the type argument capable of?
    1. Drawing a line to connect or replace the points on a scatterplot.
    2. Drawing vertical or step lines.
    3. Drawing no points or lines.
    4. All of the above.
  2. What would the following line of code yield in the R console?

    > 1:50

    1. A sequence of 50 whole numbers, in order from 1 to 50.
    2. A sequence of 50 whole numbers, in order from 50 to 1.
    3. A sequence of 50 random numbers, in order from 1 to 50.
    4. A sequence of 50 random numbers, in order from 50 to 1.

Time for action — creating a box plot

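A useful way to convey a collection of summary statistics in a dataset is through the use of a box plot. This type of graph depicts a dataset's minimum and maximum, as well as its lower quartile, median, and upper quartile, in a single diagram. Note that, by default, R draws values that lie more than 1.5 times the interquartile range beyond the box as individual points, rather than extending the whiskers to reach them. Let us look at how box plots are created in R: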

  1. Use the boxplot(...) function to create a box plot.

    > #create a box plot that depicts the number of soldiers required to launch a fire attack
    > #get the data to be used in the plot
    > boxplotFireShuSoldiersData <- subsetFire$ShuSoldiers
    > #customize the plot
    > boxPlotFireShuSoldiersLabelMain <- "Number of Soldiers Required to Launch a Fire Attack"
    > boxPlotFireShuSoldiersLabelX <- "Fire Attack Method"
    > boxPlotFireShuSoldiersLabelY <- "Number of Soldiers"
    > #use boxplot(...) to create and display the box plot
    > boxplot(x = boxplotFireShuSoldiersData,
    main = boxPlotFireShuSoldiersLabelMain,
    xlab = boxPlotFireShuSoldiersLabelX,
    ylab = boxPlotFireShuSoldiersLabelY)

  2. Your plot will be displayed in the graphic window.

  3. Use the boxplot(...) function to create a box plot that compares multiple datasets.

    > #create a box plot that compares the number of soldiers required across the battle methods
    > #get the data formula to be used in the plot
    > boxplotAllMethodsShuSoldiersData <- battleHistory$ShuSoldiers ~ battleHistory$Method
    > #customize the plot
    > boxPlotAllMethodsShuSoldiersLabelMain <- "Number of Soldiers Required by Battle Method"
    > boxPlotAllMethodsShuSoldiersLabelX <- "Battle Method"
    > boxPlotAllMethodsShuSoldiersLabelY <- "Number of Soldiers"
    > #use boxplot(...) to create and display the box plot
    > boxplot(formula = boxplotAllMethodsShuSoldiersData,
    main = boxPlotAllMethodsShuSoldiersLabelMain,
    xlab = boxPlotAllMethodsShuSoldiersLabelX,
    ylab = boxPlotAllMethodsShuSoldiersLabelY)

  4. Your plot will be displayed in the graphic window.

What just happened?

We just created two box plots using R's boxplot(...) function, one with a single box and one with multiple boxes.

boxplot(...)

We started by generating a single box plot that was composed of a dataset, main title, and x and y labels. The basic format for a single box plot is as follows:

boxplot(x = dataset)

The x argument contains the data to be plotted. Technically, only x is required to create a box plot, although you will often include additional arguments. Our boxplot(...) function used the main, xlab, and ylab arguments to display text on the plot, as shown:

> boxplot(x = boxplotFireShuSoldiersData,
main = boxPlotFireShuSoldiersLabelMain,
xlab = boxPlotFireShuSoldiersLabelX,
ylab = boxPlotFireShuSoldiersLabelY)

Next, we created a multiple box plot that compared the number of Shu soldiers deployed by each battle method. The main, xlab, and ylab arguments remained from our single box plot; however, our multiple box plot used the formula argument instead of x. Here, a formula allows us to break a dataset down into separate groups, thus yielding multiple boxes.

The basic format for a multiple box plot is as follows:

boxplot(formula = dataset ~ group)

In our case, we took our entire Shu soldier dataset (battleHistory$ShuSoldiers) and separated it by battle method (battleHistory$Method):

> boxplotAllMethodsShuSoldiersData <- battleHistory$ShuSoldiers ~ battleHistory$Method

Once incorporated into the boxplot(...) function, this formula resulted in a plot that contained four distinct boxes—ambush, fire, head to head, and surround:

> boxplot(formula = boxplotAllMethodsShuSoldiersData,
main = boxPlotAllMethodsShuSoldiersLabelMain,
xlab = boxPlotAllMethodsShuSoldiersLabelX,
ylab = boxPlotAllMethodsShuSoldiersLabelY)
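
To see the numbers behind a grouped box plot, the following sketch builds a small data frame and inspects the value that boxplot(...) returns. The exampleHistory data are invented for illustration and only imitate the shape of battleHistory. The sketch also passes the formula as the first argument together with the data argument, which is an alternative to the dataset$column notation used above.

```r
# hypothetical stand-in for the battleHistory dataset
exampleHistory <- data.frame(
  Method = rep(c("ambush", "fire", "head to head", "surround"), each = 5),
  ShuSoldiers = c(1000, 1500, 2000, 2500, 3000,
                  2000, 2500, 3000, 3500, 9000,
                  5000, 6000, 7000, 8000, 9000,
                  3000, 4000, 5000, 6000, 7000)
)
# the data argument lets the formula refer to columns by name
exampleBoxes <- boxplot(ShuSoldiers ~ Method, data = exampleHistory,
                        main = "Number of Soldiers Required by Battle Method",
                        xlab = "Battle Method", ylab = "Number of Soldiers")
# group labels, one per box
exampleBoxes$names
# one column per box: lower whisker, lower quartile, median,
# upper quartile, and upper whisker
exampleBoxes$stats
# values drawn as individual points beyond the whiskers
exampleBoxes$out
```

In this invented dataset, the value of 9000 in the fire group lies far above the rest of its group, so it is reported in exampleBoxes$out and drawn as a separate point instead of stretching the whisker.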

Pop quiz

  1. Which of the following best describes the result of the following code?

    > boxplot(x = a)

    1. A single box plot of the a dataset.
    2. A single box plot of the x dataset.
    3. A multiple box plot of the a dataset that is grouped by x.
    4. A multiple box plot of the x dataset that is grouped by a.
  2. Which of the following best describes the result of the following code?

    > boxplot(formula = a ~ b)

    1. A single box plot of the a dataset.
    2. A single box plot of the b dataset.
    3. A multiple box plot of the a dataset that is grouped by b.
    4. A multiple box plot of the b dataset that is grouped by a.

Time for action — creating a histogram

A histogram displays the frequency with which certain values occur in a dataset. Visually, a histogram looks similar to a bar chart, but it conveys different information. Histograms help us to get an idea of how varied and distributed our data are. Let us begin the histogram making process in R:

  1. Use the hist(...) function to create a histogram:

    > #create a histogram that depicts the frequency distribution of past fire attack durations
    > #get the histogram data
    > histFireDurationData <- subsetFire$DurationInDays
    > #customize the histogram
    > histFireDurationDataMain <- "Duration of Past Fire Attacks"
    > histFireDurationLabX <- "Duration in Days"
    > histFireDurationLimY <- c(0, 10)
    > histFireDurationRainbowColor <- rainbow(max(histFireDurationData))
    > #use hist(...) to create and display the histogram
    > hist(x = histFireDurationData,
    main = histFireDurationDataMain, xlab = histFireDurationLabX,
    ylim = histFireDurationLimY,
    col = histFireDurationRainbowColor)

  2. Your histogram will be displayed in the graphic window.

What just happened?

We used the hist(...) function to generate a histogram that depicted the frequency distribution of our fire attack duration data.

hist(...)

In its simplest form, the hist(...) function is very similar to boxplot(...). At a minimum, it requires only that the data for the chart's columns be defined. A simple function looks like the following:

hist(x = dataset)

As is true with our other graphics, the hist(...) function also receives graphic customization arguments. We rescaled our y-axis with ylim, colored our bars with col, and added text to our histogram with main and xlab. Also note that we used the max(data) function within the rainbow(...) component of our col argument to ensure that our histogram would have enough colors to represent each unique value in our dataset:

hist(x = histFireDurationData, main = histFireDurationDataMain,
xlab = histFireDurationLabX, ylim = histFireDurationLimY,
col = histFireDurationRainbowColor)
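
The bars of a histogram are determined by how the data are grouped into intervals. As a minimal sketch, the following code sets those intervals explicitly with the breaks argument and then reads back the resulting counts. The exampleDurations vector is invented for illustration and is not the subsetFire data.

```r
# hypothetical durations, in days
exampleDurations <- c(2, 3, 3, 4, 5, 5, 5, 6, 7, 7, 8, 9, 10, 12)
# interval boundaries every two days, from 0 to 12
exampleBreaks <- seq(from = 0, to = 12, by = 2)
# one color per bar: the number of bars is one less than the number of breaks
exampleColors <- rainbow(length(exampleBreaks) - 1)
# hist(...) draws the chart and also returns the values behind it
exampleHist <- hist(x = exampleDurations, breaks = exampleBreaks,
                    main = "Duration of Example Battles",
                    xlab = "Duration in Days", col = exampleColors)
# the number of data points that fall into each interval
exampleHist$counts
# the counts always add up to the number of data points
sum(exampleHist$counts) == length(exampleDurations)
```

Here each interval includes its upper boundary, so the single value of 2 is counted in the first bar and the value of 12 in the last.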

Pop quiz

  1. Which of the following information are we not capable of deriving from a histogram?
    1. The most and least frequently occurring values in the dataset.
    2. The total number of data points in the dataset.
    3. The minimum and maximum values in the dataset.
    4. The exact value of each data point in the dataset.

Time for action — creating a pie chart

Pie charts are a fast and easy way to visualize a single relationship within a dataset. Let us look at how to create a pie chart in R:

  1. Use the pie(...) function to create a pie chart:

    > #create a pie chart that depicts the gold cost of the fire attack in relation to the total funds allotted to the Shu army
    > #get the data to be used in the chart
    > #what is the cost of the proposed fire attack?
    > functionGoldCost(2500, 225, 7)
    [1] 6791.667
    > #we already know that 1,000,000 gold has been allotted to the Shu army
    > #therefore our remaining funds after the fire attack would be 993,208
    > #create a vector to hold the values for the chart's slices
    > pieFireGoldCostSlices <- c(6792, 993208)
    > #use the labels argument to specify the text associated with each of the chart's slices
    > pieFireGoldCostLabels <- c("mission cost", "remaining funds")
    > #customize the chart
    > pieFireGoldCostMain <- "Gold Cost of Fire Attack"
    > pieFireGoldCostSpecificColors <- c("green", "blue")
    > #use pie(...) to create and display the pie chart
    > pie(x = pieFireGoldCostSlices,
    labels = pieFireGoldCostLabels, main = pieFireGoldCostMain,
    col = pieFireGoldCostSpecificColors)

  2. Your chart will be displayed in the graphic window.

  3. Add a legend to the chart, using the following code:

    > #use the legend(...) function to add a legend to the chart
    > legend(x = "bottom", legend = pieFireGoldCostLabels,
    fill = pieFireGoldCostSpecificColors)

  4. Your legend will be added to the existing chart.

What just happened?

We created a pie chart using R's pie(...) function and then added a legend to it. Let us review how pie charts are generated in R.

pie(...)

The primary arguments used in the pie(...) function are x and labels:

  • x: the numerical values for the pie's slices. These must be nonnegative and input in vector form.
  • labels: the text labels for the pie's slices. These must be input in vector form.

Consequently, the pie chart function takes on the following basic form:

pie(x = sliceData, labels = sliceText)

Where sliceData and sliceText are in vector form.

To create our pie chart, we first calculated the cost information that we wished to display and stored it in a vector variable, like so:

> pieFireGoldCostSlices <- c(6792, 993208)

Next, we created a vector to hold the text labels for our pie's slices:

> pieFireGoldCostLabels <- c("mission cost", "remaining funds")

Then we customized our chart with a main title and specific colors, before executing our complete pie(...) function:

> pie(x = pieFireGoldCostSlices, labels = pieFireGoldCostLabels,
main = pieFireGoldCostMain, col = pieFireGoldCostSpecificColors)

Lastly, we added a legend to our chart to further clarify its components:

> legend(x = "bottom", legend = pieFireGoldCostLabels,
fill = pieFireGoldCostSpecificColors)
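
The slice values can also be calculated rather than typed by hand. The following sketch starts from the cost reported by functionGoldCost(...) and the 1,000,000 gold allotment, then adds each slice's share of the whole to its label. The function itself is not defined in this article, so only its printed result is reused here, and the variable names and percentage labels are illustrative additions.

```r
# total funds allotted to the Shu army
totalFunds <- 1000000
# cost reported by functionGoldCost(2500, 225, 7), rounded to whole gold
missionCost <- round(6791.667)
# funds left over after the fire attack
remainingFunds <- totalFunds - missionCost
# slice values for the chart
exampleSlices <- c(missionCost, remainingFunds)
# each slice as a percentage of the whole, to one decimal place
slicePercents <- round(100 * exampleSlices / sum(exampleSlices), digits = 1)
# combine the text labels with the percentages
exampleLabels <- paste0(c("mission cost", "remaining funds"),
                        " (", slicePercents, "%)")
exampleLabels
# draw the chart with the calculated slices and labels
pie(x = exampleSlices, labels = exampleLabels,
    main = "Gold Cost of Fire Attack", col = c("green", "blue"))
```

The mission cost amounts to roughly 0.7 percent of the total, which is why its slice appears as a thin sliver in the chart.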

Pop quiz

  1. In the pie(...) function, what do the x and labels arguments represent?
    1. labels represents the slice's numerical values, whereas x represents the slice's text labels.
    2. x represents the slice's numerical values, whereas labels represents the slice's text labels.
    3. labels represents the slice's text values, whereas x represents the slice's numerical labels.
    4. x represents the slice's text values, whereas labels represents the slice's numerical labels.

Time for action – exporting graphics

Now that we have created all of these informative graphics, it would be nice to be able to use them for presentations, reports, desktop wallpapers, or a variety of other purposes. Fortunately, R is capable of turning its graphics into digital images that can be used in other applications. Let us look at how to export our graphics for use outside of R:

  1. Use one of R's several export functions to convert a graphic into a digital image, as follows:

    > #use an export function to save a graphic as a digital image
    > #prepare R to export your graphic in one of the following formats: pdf, png, jpg, tiff, or bmp
    > #note that your image will be saved into your R working directory by default if only a filename is provided
    > #otherwise, your image will be saved to the full provided path
    > #optionally, the width and height of the resulting image can be specified
    > #pdf(...) measures width and height in inches; the other functions measure them in pixels by default
    > #export as pdf
    > pdf("myGraphic.pdf", width = 7, height = 7)
    > #OR
    > #export as png
    > png("myGraphic.png", width = 500, height = 500)
    > #OR
    > #export as jpg
    > jpeg("myGraphic.jpg", width = 500, height = 500)
    > #OR
    > #export as tiff
    > tiff("myGraphic.tiff", width = 500, height = 500)
    > #OR
    > #export as bmp
    > bmp("myGraphic.bmp", width = 500, height = 500)

  2. Create the graphic, as follows:

    > #create the graphic in R
    > #note that your graphic may NOT be displayed in the graphic window during this process
    > #we will use our original fire cost pie chart as an example
    > #use pie(...) to create the pie chart
    > pie(x = pieFireGoldCostSlices,
    labels = pieFireGoldCostLabels, main = pieFireGoldCostMain,
    col = pieFireGoldCostSpecificColors)
    > #use the legend(...) function to add a legend to the chart
    > legend(x = "bottom", legend = pieFireGoldCostLabels,
    fill = pieFireGoldCostSpecificColors)

  3. Use dev.off() to close the current device and export your graphic as a digital image:

    > #use dev.off() to close the current device and export the graphic as a digital image
    > dev.off()

  4. Your graphic will be exported. Verify that your digital image has been created.

What just happened?

We just completed the process of exporting an R graphic as a digital image file. Let us detail the three major steps involved in this procedure.

  1. Prepare the graphic device
    The first step in exporting an R graphic is to prepare the graphic device, which is the entity that handles graphics in R. This step requires that a file type for our exported graphic be defined. Optionally, a width and height for the resulting image can also be specified. Both are accomplished by calling one of the following similar functions:
    1. pdf(file, width, height)
    2. png(filename, width, height)
    3. jpeg(filename, width, height)
    4. tiff(filename, width, height)
    5. bmp(filename, width, height)

    Each of these functions prepares the graphic device to export an image in the format associated with its name. For example, the pdf(...) function will export an image to PDF format. The first argument (named file in pdf(...) and filename in the other four functions) can contain either a complete path specifying where the image is to be saved or just a filename and extension. If only a name and extension are included, the image will be saved to the R working directory. The width and height arguments each receive a single numeric value. In pdf(...) they are measured in inches, whereas in png(...), jpeg(...), tiff(...), and bmp(...) they are measured in pixels by default. For instance, see the following:

    > pdf("/Users/johnmquick/Desktop/myGraphic.pdf", width = 7, height = 7)

    This would export a 7 by 7 inch PDF image named myGraphic.pdf to the given user's desktop. In contrast, look at the following:

    > png("myGraphic.png", width = 300, height = 200)

    This would export a 300 by 200 pixel PNG image named myGraphic.png to the current working directory.

  2. Create the graphic
    The second step is to create the graphic in R. This can be done using any of the techniques that we have explored in this article. The only difference between this scenario and our previous activities is that we prepared our graphic device prior to creating our graphic. Note that the graphic must be created after executing one of the functions provided in the previous step in order to be exported. Also, unlike our other experiences with R visuals, your graphic may not be displayed in the graphic window when its function is executed.
  3. Close the graphic device
    The third and final step is to close the graphics device via the dev.off() command. Once dev.off() is executed, the graphic will be exported and saved on your computer as a digital image. Afterwards, be sure to check the location that you specified in the first step to verify that your digital image is present and that it was exported properly.

Remembering these three simple steps will allow you to export your R graphics as digital images, thereby allowing them to be used in other applications.
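
The three steps can be run together as one short script. This sketch assumes a PDF export into R's temporary folder, so that the working directory is left untouched; the examplePath variable and the final file.exists(...) check are illustrative additions rather than part of the original activity.

```r
# build a path inside R's temporary folder for this session
examplePath <- file.path(tempdir(), "myGraphic.pdf")

# step 1: prepare the graphic device (pdf sizes are given in inches)
pdf(examplePath, width = 7, height = 7)

# step 2: create the graphic; it is sent to the file, not the graphic window
exampleSlices <- c(6792, 993208)
exampleLabels <- c("mission cost", "remaining funds")
exampleColors <- c("green", "blue")
pie(x = exampleSlices, labels = exampleLabels,
    main = "Gold Cost of Fire Attack", col = exampleColors)
legend(x = "bottom", legend = exampleLabels, fill = exampleColors)

# step 3: close the graphic device, which finishes writing the file
dev.off()

# verify that the exported file is present
file.exists(examplePath)
```

To export a different format, only the first step changes: for example, png(...) with a path ending in .png and a width and height given in pixels.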

Pop quiz

  1. In what order must the three steps of the graphic exportation process proceed?
    1. Create the graphic, prepare the graphic device, close the graphic device.
    2. Close the graphic device, prepare the graphic device, create the graphic.
    3. Prepare the graphic device, close the graphic device, create the graphic.
    4. Prepare the graphic device, create the graphic, close the graphic device.

Summary

In this article, you created several charts, graphs, and plots. This process entailed using R's graphical prowess to generate, customize, and export visual representations of your data. At this point, you should be able to:

  • Use R to create various charts, graphs, and plots
  • Save and export your R visuals

Further resources on this subject:

  • Organizing, Clarifying and Communicating the R Data Analyses [article]
  • Customizing Graphics and Creating a Bar Chart and Scatterplot in R [article]


About the Author

John M. Quick

He is an Educational Technology doctoral student at Arizona State University who is interested in the design, research, and use of educational innovations. Currently, his work focuses on mobile, game-based, and global learning, interactive mixed-reality systems, and innovation adoption. John's blog, which provides articles, tutorials, reviews, perspectives, and news relevant to technology and education, is available from http://www.johnmquick.com. In his spare time, John enjoys photography, nature, and travel.
