In this article by Donato Teutonico, the author of Instant R Starter, we will learn the main features of R. R is a high-level language and an environment for data analysis and visualization, in which you can perform statistical analysis and produce high-quality graphics. It is a complete programming language, derived from the statistical programming language S. It is developed and maintained by a core team of statistical programmers, with the support of a large community of users. R is most widely used for statistical computing and graphics, but it is a fully functional programming language that is also well suited to scientific programming, data management, and data visualization in general.
Interaction with the R system is mainly command-driven: the user types in text and asks R to execute the command. As soon as you start R, a session is opened in which commands can be entered and expressions evaluated. Complex procedures can be implemented in scripts, which are executed as soon as they are loaded into the system or, more efficiently, as functions, which can be loaded into the system and called when needed.
Obtaining and installing R
R can be downloaded from CRAN, which is reachable from the R project website (http://www.r-project.org/). The Comprehensive R Archive Network (CRAN) is a network of FTP and web servers around the world that store identical, up-to-date versions of the code and documentation for R. The R website also offers information about R, technical manuals, The R Journal, and details about the packages developed for R and stored in the CRAN repositories.
The functionality of the R environment can be extended with add-on packages, which can be installed and loaded when needed. A package is a collection of source code and additional files that, once installed in R, can be loaded into the workspace with a call to the library() function. For example, library(lattice) loads the lattice package.
An R installation contains one or more libraries of packages. Some of these packages are part of the basic installation and are loaded automatically as soon as the session is started. Others can be installed from CRAN, the official R repository, or downloaded and installed manually.
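As a minimal sketch of the install-then-load workflow described above, the following checks whether lattice is already installed before attaching it. The variable names are illustrative only, and the install step assumes a CRAN mirror is reachable.

```r
# lattice is the package named in the text; it normally ships with R as a
# recommended package, so the install step is usually skipped.
if (!requireNamespace("lattice", quietly = TRUE)) {
  install.packages("lattice")   # downloads from a CRAN mirror
}

library(lattice)                # attach the package to the current session

# Confirm that the package is now on the search path
lattice_attached <- "package:lattice" %in% search()

# The libraries (directories) in which installed packages are stored
library_dirs <- .libPaths()
```

Calling search() at any point lists the packages currently attached to the session.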
Interacting with the console
As soon as you start R, a workspace is opened in the R Console window. The workspace is the environment in which you work, where you load your data and create your variables.
The > symbol on the screen is the R prompt, which waits for commands. At the prompt you can type any function or command, or use R to perform basic calculations. R uses the usual symbols for addition (+), subtraction (-), multiplication (*), division (/), and exponentiation (^). Parentheses ( ) can be used to specify the order of operations. R also provides %% for taking the modulus and %/% for integer division. Comments in R start with the character #; everything from that character to the end of the line is ignored by R.
R has a number of built-in functions, for example sin(x), cos(x), and tan(x) (all in radians), exp(x), log(x), and sqrt(x). Some special constants, such as pi, are also predefined.
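The operators and built-in functions above can be tried directly at the prompt. The short sketch below uses arbitrary illustrative values and variable names that are not from the original article.

```r
# Arithmetic operators and operator precedence
no_parens   <- 7 + 3 * 2     # multiplication first: 13
with_parens <- (7 + 3) * 2   # parentheses first: 20
power       <- 2^10          # exponentiation: 1024

# Modulus and integer division
remainder <- 17 %% 5         # 2
quotient  <- 17 %/% 5        # 3

# Built-in functions and the predefined constant pi
sine_value <- sin(pi / 2)    # angle given in radians
log_value  <- log(exp(1))    # log() is the natural logarithm
root_value <- sqrt(16)       # 4
```

Typing the name of any of these variables at the prompt prints its value.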
Understanding R objects
In every computer language, variables provide a means of accessing the data stored in memory. R does not provide direct access to the computer’s memory but rather provides a number of specialized data structures called objects. These objects are referred to through symbols or variables.
The basic object in R is the vector; even scalars are vectors of length one. A vector can be thought of as a series of data of the same class. There are six basic vector types (called atomic vectors): logical, integer, real (called double in R), complex, string (or character), and raw. Integer and real represent numeric objects; logical is a Boolean data type with possible values TRUE or FALSE. Among the atomic vectors, the most common are logical, string, and numeric (integer and real).
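To make the six atomic types concrete, the sketch below asks R for the type of one value of each kind using the base function typeof(). The example values are arbitrary.

```r
# One example value for each of the six atomic vector types
type_logical   <- typeof(TRUE)          # "logical"
type_integer   <- typeof(1L)            # "integer" (the L suffix marks an integer)
type_double    <- typeof(1.5)           # "double"  (R's name for real numbers)
type_complex   <- typeof(1 + 2i)        # "complex"
type_character <- typeof("a")           # "character"
type_raw       <- typeof(as.raw(255))   # "raw"

# Integer and double are both numeric
both_numeric <- is.numeric(1L) && is.numeric(1.5)   # TRUE

# A scalar is simply a vector of length one
scalar_length <- length(42)   # 1
```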
There are several ways to create vectors. For instance, the colon operator (:) generates sequences by incrementing or decrementing by one:
> 1:10
 [1]  1  2  3  4  5  6  7  8  9 10
> 5:-6
 [1]  5  4  3  2  1  0 -1 -2 -3 -4 -5 -6
If the interval between the numbers is not one, you can use the seq() function. Here is an example:
> seq(from=2, to=2.5, by=0.1)
[1] 2.0 2.1 2.2 2.3 2.4 2.5
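As a further sketch of sequence generation, the following stores a few sequences in variables. The length.out argument and the negative step are additional uses of seq() not shown in the original text, and the variable names are illustrative.

```r
# The colon operator steps by one, upwards or downwards
ascending  <- 1:10
descending <- 5:-6

# seq() with an explicit step size
by_tenths <- seq(from = 2, to = 2.5, by = 0.1)

# seq() with a requested number of points instead of a step size
five_points <- seq(0, 1, length.out = 5)   # 0.00 0.25 0.50 0.75 1.00

# A negative step counts downwards
countdown <- seq(10, 0, by = -2.5)         # 10.0 7.5 5.0 2.5 0.0
```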
One of the most important features of R is that entire vectors can be passed as arguments to functions, which avoids explicit loops. Most functions in R accept a vector as an argument. For example, given the following vector, mean(x) returns the mean of all its elements and sqrt(x) returns the square root of each element:
> x <- c(12,10,4,6,9)
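The rest of the original listing is not available, so the following is only a sketch of the kind of vectorized calls the text describes, applied to the same vector x. The result names are illustrative.

```r
x <- c(12, 10, 4, 6, 9)

# Functions that summarize the whole vector
total   <- sum(x)    # 41
average <- mean(x)   # 8.2
largest <- max(x)    # 12

# Functions and operators applied element by element, with no explicit loop
doubled <- x * 2     # 24 20 8 12 18
roots   <- sqrt(x)   # one square root per element
above_8 <- x > 8     # TRUE TRUE FALSE FALSE TRUE

# A logical vector can be used to select elements
selected <- x[above_8]   # 12 10 9
ordered  <- sort(x)      # 4 6 9 10 12
```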
Matrices and arrays
In R, matrix notation extends to elements of any kind, so it is possible, for example, to have a matrix of character strings. Matrices and arrays are essentially vectors with a dimension attribute.
The function matrix() creates matrices. By default, it fills the matrix by column; alternatively, setting the byrow argument to TRUE fills it by row. Filling a 3 by 3 matrix by row with the values 1 to 9 gives:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
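The sketch below contrasts the two fill orders and shows the dimension attribute at work. The exact calls used in the original article are not available, so these are one plausible way to build such matrices, with illustrative variable names.

```r
# Default: values fill the matrix column by column
by_column <- matrix(1:9, nrow = 3)
#      [,1] [,2] [,3]
# [1,]    1    4    7
# [2,]    2    5    8
# [3,]    3    6    9

# byrow = TRUE fills row by row instead
by_row <- matrix(1:9, nrow = 3, byrow = TRUE)

# A matrix can hold character strings as well as numbers
letters_matrix <- matrix(c("a", "b", "c", "d"), nrow = 2)

# A matrix is a vector with a dimension attribute:
# assigning dim to a plain vector turns it into a matrix
values <- 1:6
dim(values) <- c(2, 3)
now_a_matrix <- is.matrix(values)   # TRUE

# Elements are addressed as [row, column]
middle <- by_row[2, 2]   # 5
```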
A list in R is a collection of objects. One of the main advantages of lists is that the objects they contain may be of different types, for example numeric and character values. To define a list, you simply pass the objects that you want to include as arguments to the function list().
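As a small illustration, the list below mixes a character value, a number, and a numeric vector. The element names and values are hypothetical and chosen only for the example.

```r
# A list holding objects of different types (hypothetical example data)
subject <- list(name = "Ann", age = 31, scores = c(8.5, 9.0, 7.5))

# Elements can be accessed by name or by position
subject_name <- subject$name       # "Ann"
subject_age  <- subject[["age"]]   # 31
first_score  <- subject[[3]][1]    # 8.5

# Each element keeps its own type
element_types <- sapply(subject, typeof)

element_count <- length(subject)   # 3
```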
A data frame corresponds to a data set; it is essentially a special list in which all the elements have the same length. Different columns may hold different types, but within a column all the elements are of the same type. You can easily create data frames with the function data.frame(), and a specific column can be retrieved with the $ operator.
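A minimal sketch of creating a data frame and retrieving a column follows. The column names and values are made up for illustration, and stringsAsFactors is set explicitly because its default differs between R versions.

```r
# A small data frame with columns of different types (hypothetical data)
trial <- data.frame(
  id       = 1:3,
  drug     = c("A", "B", "A"),
  response = c(0.8, 1.2, 0.5),
  stringsAsFactors = FALSE   # keep 'drug' as character in every R version
)

# Retrieve a single column with the $ operator
responses <- trial$response          # 0.8 1.2 0.5
mean_response <- mean(responses)

# A data frame is a list whose elements (columns) all have the same length
is_a_list   <- is.list(trial)        # TRUE
row_count   <- nrow(trial)           # 3
column_count <- ncol(trial)          # 3
```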
Top features you’ll want to know about
Beyond basic object creation and manipulation, R can perform many more complex tasks, ranging from data manipulation and programming to statistical analysis and the production of very high-quality graphs. Some of the most useful features are:
- Data input and output
- Flow control (for, if…else, while)
- Create your own functions
- Debugging functions and handling exceptions
- Plotting data
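Several of the features listed above can be sketched in a few lines of base R. The function names count_positive and safe_log and the example data are hypothetical, and plotting is left out to keep the example free of graphics devices.

```r
# Flow control inside a user-defined function
count_positive <- function(values) {
  n <- 0
  for (v in values) {
    if (v > 0) {
      n <- n + 1
    }
  }
  n
}
positives <- count_positive(c(-1, 2, 3, 0))   # 2

# Handling exceptions: return NA instead of stopping or warning
safe_log <- function(x) {
  tryCatch(
    log(x),
    warning = function(w) NA_real_,   # e.g. log(-1) warns about NaN
    error   = function(e) NA_real_    # e.g. log("a") raises an error
  )
}
good_value <- safe_log(exp(2))
bad_number <- safe_log(-1)
bad_type   <- safe_log("a")

# Data output and input: write a data frame to CSV and read it back
doses <- data.frame(dose = c(0.5, 1, 2), response = c(0.4, 1.1, 1.9))
csv_path <- tempfile(fileext = ".csv")
write.csv(doses, csv_path, row.names = FALSE)
doses_read <- read.csv(csv_path)
```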
In this article we saw what R is, how to obtain and install it, and how to interact with the console. We also looked at a few R objects and at the top features you will want to know about.
About the Author:
Donato Teutonico has several years of experience in modeling and simulation of drug effects and clinical trials in industrial and academic settings. He received his PharmD degree from the University of Turin, Italy, with a specialization in Chemical and Pharmaceutical Technology, and received his Ph.D. in Pharmaceutical Sciences from the Paris-South University, France.
He is the author of two R packages for pharmacometrics, CTStemplate and panels4Pharmacometrics, both of which are available on Google Code.