# First steps with R

## Donato Teutonico

July 2013


# Obtaining and installing R

R is distributed through the Comprehensive R Archive Network (CRAN), which can be reached from the R project website (http://www.r-project.org/). CRAN is a network of FTP and web servers around the world that store identical, up-to-date versions of the code and documentation for R. The R website also hosts general information about R, technical manuals, the R Journal, and details about the packages developed for R and stored in the CRAN repositories.

The functionality of the R environment can be extended through software packages, which are installed once and then loaded whenever needed. A package is a collection of source code and additional files; once installed, it can be loaded into the workspace with a call to the library() function. For example, the following code loads the package lattice:

`> library(lattice)`

An R installation contains one or more libraries of packages. Some of these packages are part of the basic installation and are loaded automatically as soon as the session is started. Others can be installed from CRAN, the official R repository, or downloaded and installed manually.
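The distinction between libraries and packages can be explored from the console. The following is a minimal sketch, not a definitive recipe: it assumes nothing beyond base R, and the commented-out install.packages() call would need network access to a CRAN mirror.

```r
# Packages and environments attached to the current session
# (the exact set depends on the installation and start-up options)
attached <- search()
print(attached)

# Library directories in which this R installation looks for packages
lib_dirs <- .libPaths()
print(lib_dirs)

# Attach 'lattice' only if it is installed; otherwise report it
if (requireNamespace("lattice", quietly = TRUE)) {
  library(lattice)
} else {
  # install.packages("lattice")  # uncomment to install from CRAN
  message("Package 'lattice' is not installed")
}
```

Checking with requireNamespace() first avoids the error that library() raises when a package is missing.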

# Interacting with the console

As soon as you start R, the R Console window opens with a workspace ready for use. The workspace is the environment in which you work: it is where you load your data and create your variables.

The > symbol on the screen is the R prompt, which waits for commands. At the prompt you can type any function or command, or use R to perform basic calculations. R uses the usual symbols for addition (+), subtraction (-), multiplication (*), division (/), and exponentiation (^). Parentheses ( ) can be used to specify the order of operations. R also provides %% for taking the modulus and %/% for integer division. Comments in R start with the character #; everything from that character to the end of the line is ignored by R.

R has a number of built-in functions, for example, sin(x), cos(x), and tan(x) (all taking angles in radians), exp(x), log(x), and sqrt(x). Some special constants such as pi are also predefined. The following code shows one of these functions in use:
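The operators described above can be tried directly. This short sketch stores each result in a variable (the variable names are only illustrative) so that the values can be inspected afterwards:

```r
# Basic arithmetic; everything after '#' on a line is a comment
total      <- 7 + 3        # addition
difference <- 7 - 3        # subtraction
product    <- 7 * 3        # multiplication
quotient   <- 7 / 2        # division
power      <- 2 ^ 3        # exponentiation
grouped    <- (2 + 3) * 4  # parentheses force the addition first
remainder  <- 7 %% 3       # modulus: what is left after dividing 7 by 3
int_div    <- 7 %/% 3      # integer division: how many times 3 fits in 7

print(c(total, difference, product, quotient, power, grouped, remainder, int_div))
```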

```
> exp(2.5)
[1] 12.18249
```
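A few more of the built-in functions are sketched below. Because the trigonometric functions work in radians, an angle given in degrees has to be converted first; the variable names here are illustrative only.

```r
# Trigonometric functions expect radians
sine_right_angle <- sin(pi / 2)

# log() is the natural logarithm, so it undoes exp()
log_of_e <- log(exp(1))

# Square root
root <- sqrt(16)

# Convert degrees to radians before calling cos()
degrees <- 180
cos_half_turn <- cos(degrees * pi / 180)

print(c(sine_right_angle, log_of_e, root, cos_half_turn))
```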

# Understanding R objects

In every computer language, variables provide a means of accessing the data stored in memory. R does not provide direct access to the computer’s memory but rather provides a number of specialized data structures called objects. These objects are referred to through symbols or variables.

## Vectors

The basic object in R is the vector; even scalars are vectors of length one. Vectors can be thought of as a series of data of the same class. There are six basic vector types (called atomic vectors): logical, integer, real (which R reports as double), complex, string (or character), and raw. Integer and real vectors represent numeric objects; logical vectors hold Boolean values, TRUE or FALSE. Of these atomic vectors, the most common are logical, string, and numeric (integer and real).

There are several ways to create vectors. For instance, the colon operator (:) generates sequences by incrementing or decrementing by one:
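The six atomic types can be checked with typeof(). The sketch below builds one small vector of each type; note that R reports real vectors as "double".

```r
# One example value for each atomic vector type
types <- c(
  logical   = typeof(TRUE),
  integer   = typeof(1L),        # the L suffix requests an integer
  real      = typeof(1.5),       # reported as "double"
  complex   = typeof(1 + 2i),
  character = typeof("text"),
  raw       = typeof(as.raw(255))
)
print(types)

# A single number is still a vector, of length one
scalar_length <- length(42)
print(scalar_length)
```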

```
> 1:10
 [1]  1  2  3  4  5  6  7  8  9 10
> 5:-6
 [1]  5  4  3  2  1  0 -1 -2 -3 -4 -5 -6
```

If the interval between the numbers is not one, you can use the seq() function. Here is an example:

```
> seq(from=2, to=2.5, by=0.1)
[1] 2.0 2.1 2.2 2.3 2.4 2.5
```

One of the most important features of R is that entire vectors can be passed as arguments to functions, which avoids explicit loops. Most functions in R accept a vector as an argument; some examples follow:
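Besides a fixed step, seq() can also count downwards or produce a given number of equally spaced values, and c() combines arbitrary values into a vector. A brief sketch with illustrative variable names:

```r
# Decreasing sequence: the step must be negative
countdown <- seq(from = 10, to = 1, by = -3)

# Ask for a number of points instead of a step size
five_points <- seq(from = 0, to = 1, length.out = 5)

# Combine arbitrary values with c()
custom <- c(2, 4, 8)

print(countdown)
print(five_points)
print(custom)
```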

```
> x <- c(12,10,4,6,9)
> max(x)
[1] 12
> min(x)
[1] 4
> mean(x)
[1] 8.2
```
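Arithmetic operators are vectorized in the same way as these functions. The sketch below doubles every element of a vector in a single expression and then repeats the work with an explicit loop, purely for comparison:

```r
x <- c(12, 10, 4, 6, 9)

# Vectorized: the operation is applied to every element at once
doubled  <- x * 2
centered <- x - mean(x)   # subtract the mean from each element
roots    <- sqrt(x)       # most mathematical functions accept vectors too

# The same doubling written as an explicit loop
doubled_loop <- numeric(length(x))
for (i in seq_along(x)) {
  doubled_loop[i] <- x[i] * 2
}

print(doubled)
print(identical(doubled, doubled_loop))
```

The vectorized form is shorter and usually faster, which is why it is preferred in idiomatic R code.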

## Matrices and arrays

In R, the matrix notation is extended to elements of any kind, so, for example, it is possible to have a matrix of character strings. Matrices and arrays are basically vectors with a dimension attribute.

The function matrix() may be used to create matrices. By default, this function fills the matrix by column; alternatively, you can ask it to fill the matrix by row:

```
> matrix(1:9,nrow=3,byrow=TRUE)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
```
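For comparison, the sketch below shows the default column-wise filling, a character matrix, the effect of setting the dimension attribute on a plain vector, and a three-dimensional array. The object names are illustrative.

```r
# Default behaviour: values fill the first column, then the second, and so on
by_col <- matrix(1:6, nrow = 2)
print(by_col)
top_middle <- by_col[1, 2]     # element in row 1, column 2

# Matrices are not limited to numbers
letters_m <- matrix(c("a", "b", "c", "d"), nrow = 2)
print(letters_m)

# Giving a vector a dimension attribute turns it into a matrix
v <- 1:6
dim(v) <- c(2, 3)
print(is.matrix(v))

# Arrays generalize matrices to more than two dimensions
arr <- array(1:24, dim = c(2, 3, 4))
print(dim(arr))
```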

## Lists

A list in R is a collection of objects. One of the main advantages of lists is that the objects they contain may be of different types, for example, numeric and character values. To define a list, you simply provide the objects that you want to include as arguments to the function list().
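As a minimal sketch, the list below mixes a character value, a number, and a numeric vector. The element names and values are invented for illustration.

```r
# A list can hold objects of different types and lengths
person <- list(name = "Ada", age = 36, scores = c(9.5, 8, 7.5))

# Elements can be accessed by name with $ or with double brackets
person_name <- person$name
person_age  <- person[["age"]]
best_score  <- max(person$scores)

print(length(person))   # number of elements in the list
str(person)             # compact overview of the structure
```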

## Data frames

A data frame corresponds to a data set; it is basically a special list in which all the elements have the same length. Elements in different columns may be of different types, but within a column all the elements are of the same type. You can easily create data frames using the function data.frame(), and a specific column can be recalled using the operator $.
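The following sketch builds a small data frame with made-up values and extracts one column with the $ operator. The stringsAsFactors argument is set explicitly so that the character column behaves the same way across R versions.

```r
# Three columns of different types, all of length three
measurements <- data.frame(
  id    = 1:3,
  group = c("a", "b", "a"),
  value = c(2.5, 3.1, 4.0),
  stringsAsFactors = FALSE
)
print(measurements)

# Extract a single column and summarize it
values     <- measurements$value
mean_value <- mean(values)
print(mean_value)

# A data frame is a list of equal-length columns
print(is.list(measurements))
print(dim(measurements))    # rows, then columns
```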

# Top features you’ll want to know about

Beyond basic object creation and manipulation, R can perform many more complex tasks, ranging from data manipulation and programming to statistical analysis and the production of very high quality graphs. Some of the most useful features are:

• Data input and output
• Flow control (for, if…else, while)
• Create your own functions
• Debugging functions and handling exceptions
• Plotting data
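To give a flavor of some of the features listed above, here is a brief sketch that combines a user-defined function, flow control, and basic exception handling. The function names describe_sign and safe_log are hypothetical helpers written for this example, not part of R.

```r
# A user-defined function that uses if...else
describe_sign <- function(x) {
  if (x > 0) {
    "positive"
  } else if (x < 0) {
    "negative"
  } else {
    "zero"
  }
}

# A for loop that applies the function to each value in turn
labels <- character(0)
for (v in c(-2, 0, 5)) {
  labels <- c(labels, describe_sign(v))
}
print(labels)

# A while loop: find the smallest n for which 2^n reaches 100
n <- 0
while (2 ^ n < 100) {
  n <- n + 1
}
print(n)

# Exception handling: return NA instead of stopping or warning
safe_log <- function(x) {
  tryCatch(
    log(x),
    warning = function(w) NA_real_,  # e.g. log of a negative number
    error   = function(e) NA_real_   # e.g. non-numeric input
  )
}
print(safe_log(10))
print(safe_log(-1))
print(safe_log("a"))
```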

# Summary

In this article we saw what R is, how to obtain and install it, and how to interact with the console. We also looked at a few R objects and at the top features you will want to know about.
