R Object-oriented Programming

By Kelly Black

About this book

R is best suited to produce data and visual analytics through customizable scripts and commands, instead of typical statistical tools that provide tick boxes and drop-down menus for users. The book is divided into three parts to help you perform these steps. It starts by providing you with an overview of the basic data types, data structures, and tools available in R that are used to solve common tasks. It then moves on to offer insights and examples on object-oriented programming with R; this includes an introduction to the basic control structures available in R with examples. It also includes details on how to implement S3 and S4 classes. Finally, the book provides three detailed examples that demonstrate how to bring all of these ideas together.

Publication date:
October 2014
Publisher
Packt
Pages
190
ISBN
9781783986682

 

Chapter 1. Data Types

In this chapter, we provide a broad overview of the different data types available in the R environment. This material is introductory in nature, and this chapter ensures that important information on implementing algorithms is available to you. There are roughly five parts in this chapter:

  • Working with variables in the R environment: This section gives you a broad overview of interacting with the R shell, creating variables, deleting variables, saving variables, and loading variables

  • Discrete data types: This section gives you an overview of the principal data types used to represent discrete data

  • Continuous data types: This section gives you an overview of the principal data types used to represent continuous data

  • Introduction to vectors: This section gives you an introduction to vectors and manipulating vectors in R

  • Special data types: This section gives you a list of other data types that do not fit in the other categories or have other meanings

 

Assignment


The R environment is an interactive shell. Commands are entered using the keyboard, and the environment should feel familiar to anyone used to MATLAB or the Python interactive interpreter. To assign a value to a variable, you can usually use the = symbol in the same way as these other interpreters. The difference with R, however, is that there are other ways to assign a variable, and their behavior depends on the context.

Another way to assign a value to a variable is to use the <- symbols (sometimes called operators). At first glance, it seems odd to have different ways to assign a value, but we will see that variables can be saved in different environments. The same name may be used in different environments, and the name can be ambiguous. We will adopt the use of the <- operator in this text because it is the most common operator, and it is also the least likely to cause confusion in different contexts.

The R environment manages memory and variable names dynamically. To create a new variable, simply assign a value to it, as follows:

> a <- 6
> a
[1] 6

A variable has a scope, and the meaning of a variable name can vary depending on the context. For example, if you refer to a variable within a function (think subroutine) or after attaching a dataset, then there may be multiple variables in the workspace with the same name. The R environment maintains a search path to determine which variable to use, and we will discuss these details as they arise.

The <- operator for the assignment will work in any context while the = operator only works for complete expressions. Another option is to use the <<- operator. The advantage of the <<- operator is that it instructs the R environment to search parent environments to see whether the variable already exists. In some contexts, within a function for example, the <- operator will create a new variable; however, the <<- operator will make use of an existing variable outside of the function if it is found.
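The difference between <- and <<- inside a function can be seen in a small sketch. The variable and function names used here (counter, bumpLocal, and bumpShared) are hypothetical and chosen only for illustration:

```r
counter <- 0

# With <-, the assignment creates a new variable that is local to the function
bumpLocal <- function() {
    counter <- counter + 1
    counter
}

# With <<-, R searches the enclosing environments and updates the existing counter
bumpShared <- function() {
    counter <<- counter + 1
    counter
}

localResult <- bumpLocal()
afterLocal <- counter       # the outer counter is unchanged
sharedResult <- bumpShared()
afterShared <- counter      # the outer counter has been updated
```

Both functions return 1, but afterLocal is still 0 while afterShared is 1.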

Another way to assign variables is to use the -> and ->> operators. These operators are similar to those given previously. The only difference is that they reverse the direction of assignment, as follows:

> 14.5 -> a
> 1/12.0 ->> b
> a
[1] 14.5
> b
[1] 0.08333333
 

The workspace


The R environment keeps track of variables as well as allocates and manages memory as it is requested. One command to list the currently defined variables is the ls command. A variable can be deleted using the rm command. In the following example, the a and b variables have been changed, and the a variable is deleted:

> a <-  17.5
> b <- 99/4
> ls()
[1] "a" "b"
> objects()
[1] "a" "b"
> rm(a)
> ls()
[1] "b"

If you wish to delete all of the variables in the workspace, the list option in the rm command can be combined with the ls command, as follows:

> ls()
[1] "b"
> rm(list=ls())
> ls()
character(0)

A wide variety of other options are available. For example, there are directory options to show and set the current directory, as follows:

> getwd()
[1] "/home/black"
> setwd("/tmp")
> getwd()
[1] "/tmp"
> dir()
[1] "antActivity.R"             "betterS3.R"
[3] "chiSquaredArea.R"          "firstS3.R"
[5] "math100.csv"               "opsTesting.R"
[7] "probabilityExampleOne.png" "s3.R"
[9] "s4Example.R"

Another important task is to save and load a workspace. The save and save.image commands can be used to save the current workspace. The save command allows you to save a particular variable, and the save.image command allows you to save the entire workspace. The usage of these commands is as follows:

> save(a,file="a.RData")
> save.image("wholeworkspace.RData")

These commands have a variety of options. For example, the ascii option is commonly used to ensure that the data file is in a (nearly) human-readable form. The help command can be used to get more details and see more of the options that are available. In the following example, the variable a is saved in a file, a.RData, the whole workspace is saved in a second file, and both files are written in a human-readable format:

> save(a,file="a.RData",ascii=TRUE)
> save.image("wholeworkspace.RData",ascii=TRUE)
> help(save)

As an alternative to the help command, the ? operator can also be used to get the help page for a given command. An additional command is the help.search command that is used to search the help files for a given string. The ?? operator is also available to perform a search for a given string.

The information in a file can be read back into the workspace using the load command:

> load("a.RData")
> ls()
[1] "a"
> a
[1] 19
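The full save, remove, and load cycle can be sketched as a short script. The variable name trialSpeed is hypothetical, and tempfile() is used here only so that the example does not write into your working directory:

```r
# Hypothetical variable name; tempfile() avoids touching the working directory
trialSpeed <- 17.5
target <- tempfile(fileext = ".RData")

save(trialSpeed, file = target)
rm(trialSpeed)
goneAfterRm <- !exists("trialSpeed", inherits = FALSE)

# load() restores the variable under its original name and
# (invisibly) returns the names of the objects it created
restoredNames <- load(target)
backAgain <- exists("trialSpeed", inherits = FALSE)
```

Note that load does not ask for a variable name. The name stored in the file is reused, so an existing variable with the same name is overwritten.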

Another question that arises with respect to a variable is how it is stored. The two commands to determine this are mode and storage.mode. You should try to use these commands for each of the data types described in the following subsections. Basically, these commands can make it easier to determine whether a variable is a numeric value or another basic data type.
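As a brief illustration of how the two commands differ, the following sketch compares an integer and a double. The variable names are hypothetical:

```r
wholeNumber <- as.integer(5)
realNumber <- 5

mode(wholeNumber)           # "numeric"
storage.mode(wholeNumber)   # "integer"
mode(realNumber)            # "numeric"
storage.mode(realNumber)    # "double"
mode("five")                # "character"
```

The mode command groups integers and doubles together as numeric, while storage.mode reports how the values are actually stored.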

The previous commands provide options for saving the values of the variables within a workspace. They do not save the commands that you have entered. These commands are referred to as the history within the R workspace, and you can save your history using the savehistory command. The history can be displayed using the history command, and the loadhistory command can be used to read a saved history file back into the current session.

The last command given here is the command to quit, q(). Some people consider this to be the most important command because without it you would never be able to leave R. The rest of us are not sure why it is necessary.

 

Discrete data types


One of the features of the R environment is the rich collection of data types that are available. Here, we briefly list some of the built-in data types that describe discrete data. The four data types discussed are the integer, logical, character, and factor data types. We also introduce the idea of a vector, which is the default data structure for any variable. A list of the commands discussed here is given in Table 2 and Table 3.

It should be noted that the default data type in R, for a number, is a double precision number. Strings can be interpreted in a variety of ways, usually as either a string or a factor. You should make sure that R is storing information in the format that you want, and it is worth double-checking how each variable is stored.

Integer

The first discrete data type examined is the integer type. Values are 32-bit integers. In most circumstances, a number must be explicitly cast as being an integer, as the default type in R is a double precision number. There are a variety of commands used to cast integers as well as allocate space for integers. The integer command takes a number for an argument and will return a vector of integers whose length is given by the argument:

> bubba <- integer(12)
> bubba
 [1] 0 0 0 0 0 0 0 0 0 0 0 0
> bubba[1]
[1] 0
> bubba[2]
[1] 0
> bubba[[4]]
[1] 0
> bubba[4] <- 15
> bubba
 [1]  0  0  0 15  0  0  0  0  0  0  0  0

In the preceding example, a vector of twelve integers was defined. The default values are zero, and the individual entries in the vector are accessed using square brackets. The first entry in the vector has index 1, so in this example, bubba[1] refers to the initial entry in the vector. Note that there are two ways to access an element in the vector: single versus double brackets. For a vector, the two methods are nearly the same, but when we explore the use of lists as opposed to vectors, the meaning will change. In short, the double brackets return objects of the same type as the elements within the vector, and the single brackets return values of the same type as the variable itself. For example, using single brackets on a list will return a list, while double brackets may return a vector.
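The distinction is easiest to see with a list. The following is a minimal sketch with hypothetical names, and lists themselves are discussed in more detail later:

```r
# A list can hold elements of different types
items <- list(10, "ten", TRUE)

singleResult <- items[2]     # single brackets: a list of length one
doubleResult <- items[[2]]   # double brackets: the element itself

typeof(singleResult)   # "list"
typeof(doubleResult)   # "character"
```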

A number can be cast as an integer using the as.integer command. A variable's type can be checked using the typeof command. The typeof command indicates how R stores the object and is different from the class command, which is an attribute that you can change or query:

> as.integer(13.2)
[1] 13
> thisNumber <- as.integer(8/3)
> typeof(thisNumber)
[1] "integer"
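To illustrate the difference between typeof and class, the following sketch changes the class attribute of an integer. The class name "temperature" is hypothetical and has no special meaning to R:

```r
reading <- as.integer(13.2)
typeBefore <- typeof(reading)    # "integer"
classBefore <- class(reading)    # "integer"

# The class is an attribute that can be reassigned
class(reading) <- "temperature"
typeAfter <- typeof(reading)     # still "integer"
classAfter <- class(reading)     # "temperature"
```

Changing the class does not change how the value is stored, which is why typeof gives the same answer before and after.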

Note that a sequence of numbers can be automatically created using either the : operator or the seq command:

> 1:5
[1] 1 2 3 4 5
> myNum <- as.integer(1:5)
> myNum[1]
[1] 1
> myNum[3]
[1] 3

> seq(4,11,by=2)
[1]  4  6  8 10
> otherNums <- seq(4,11,by=2)
> otherNums[3]
[1] 8
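The two approaches do not always produce the same storage type, which matters if you are relying on integer values. The following sketch also shows the length.out option of the seq command, and the variable name quarters is hypothetical:

```r
typeof(1:5)                   # "integer"
typeof(seq(4, 11, by = 2))    # "double", because 4, 11, and 2 are doubles

# length.out fixes the number of points rather than the step size
quarters <- seq(0, 1, length.out = 5)   # 0.00 0.25 0.50 0.75 1.00
```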

A common task is to determine whether or not a variable is of a certain type. For integers, the is.integer command is used to determine whether or not a variable has an integer type:

> a <- 1.2
> typeof(a)
[1] "double"
> is.integer(a)
[1] FALSE

> a <- as.integer(1.2)
> typeof(a)
[1] "integer"
> is.integer(a)
[1] TRUE

Logical

Logical data consists of variables that are either true or false. The words TRUE and FALSE are used to designate the two possible values of a logical variable. (The TRUE value can also be abbreviated to T, and the FALSE value can be abbreviated to F.) The basic commands associated with logical variables are similar to the commands for integers discussed in the previous subsection. The logical command is used to allocate a vector of Boolean values. In the following example, a logical vector of length 10 is created. The default value is FALSE, and the Boolean not operator is used to flip the values to evaluate to TRUE:

> b <- logical(10)
> b
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> b[3]
[1] FALSE
> !b
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> !b[5]
[1] TRUE
> typeof(b)
[1] "logical"
> mode(b)
[1] "logical"
> storage.mode(b)
[1] "logical"
>  b[3] <- TRUE
> b
 [1] FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

To cast a value to a logical type, you can use the as.logical command. Note that zero is mapped to a value of FALSE and other numbers are mapped to a value of TRUE:

> a <- -1:1
> a
[1] -1  0  1
> as.logical(a)
[1]  TRUE FALSE  TRUE

To determine whether or not a value has a logical type, you use the is.logical command:

> b <- logical(4)
> b
[1] FALSE FALSE FALSE FALSE
> is.logical(b)
[1] TRUE

The standard operators for logical operations are available, and a list of some of the more common operations is given in Table 1. Note that there is a difference between operations such as & and &&. A single & performs an and operation on each pair of corresponding elements of two vectors, while the double && combines two single values and returns a single logical result. (Older versions of R silently used only the first element of each vector when && or || was given longer vectors. Since R 4.3.0 this is an error, so the first elements are selected explicitly in the following example.)

> l1 <- c(TRUE,FALSE)
> l2 <- c(TRUE,TRUE)
> l1&l2
[1]  TRUE FALSE
> l1[1]&&l2[1]
[1] TRUE
> l1|l2
[1] TRUE TRUE
> l1[1]||l2[1]
[1] TRUE

Tip

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. An additional source for the examples in this book can be found at https://github.com/KellyBlack/R-Object-Oriented-Programming. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

The following table shows various logical operators and their description:

Logical Operator    Description
<                   Less than
>                   Greater than
<=                  Less than or equal to
>=                  Greater than or equal to
==                  Equal to
!=                  Not equal to
|                   Entrywise or
||                  Or
!                   Not
&                   Entrywise and
&&                  And
xor(a,b)            Exclusive or

Table 1 – list of operators for logical variables
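Several of the operators in Table 1 can be combined on a whole vector at once. The following is a minimal sketch, and the scores vector is hypothetical data made up for the example:

```r
scores <- c(55, 72, 90, 48)   # hypothetical data

passed <- scores >= 60                  # FALSE  TRUE  TRUE FALSE
extreme <- scores > 85 | scores < 50    # FALSE FALSE  TRUE  TRUE
eitherNotBoth <- xor(passed, extreme)   # FALSE  TRUE FALSE  TRUE
numberPassed <- sum(passed)             # 2, since TRUE counts as 1
```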

Character

One common way to store information is to save data as characters or strings. Character data is defined using either single or double quotes:

> a <- "hello"
> a
[1] "hello"
> b <- 'there'
> b
[1] "there"
> typeof(a)
[1] "character"

The character command can be used to allocate a vector of character-valued strings, as follows:

> many <- character(3)
> many
[1] "" "" ""
> many[2] <- "this is the second"
> many[3] <- 'yo, third!'
> many[1] <- "and the first"
> many
[1] "and the first"      "this is the second" "yo, third!"        

A value can be cast as a character using the as.character command, as follows:

> a <- 3.0
> a
[1] 3
> b <- as.character(a)
> b
[1] "3"

Finally, the is.character command takes a single argument, and it returns a value of TRUE if the argument is a string:

> a <- as.character(4.5)
> a
[1] "4.5"
> is.character(a)
[1] TRUE

Factors

Another common way to record data is to provide a discrete set of levels. For example, the results of an individual trial in an experiment may be denoted by a value of a, b, or c. Categorical data of this kind is referred to as a factor in R. The commands and ideas are roughly parallel to the data types described previously. There are some subtle differences with factors, though. Factors are used to designate different levels and can be considered ordered or unordered. There are a large number of options, and it is wise to consult the help page for factors using the help(factor) command. One thing to note, though, is that the typeof command for a factor will return "integer", because the levels are stored internally as integer codes.

Factors can be defined using the factor command, as follows:

> lev <- factor(x=c("one","two","three","one"))
> lev
[1] one   two   three one  
Levels: one three two
> levels(lev)
[1] "one"   "three" "two"  
> sort(lev)
[1] one   one   three two  
Levels: one three two

>  lev <- factor(x=c("one","two","three","one"),levels=c("one","two","three"))
> lev
[1] one   two   three one  
Levels: one two three
> levels(lev)
[1] "one"   "two"   "three"
> sort(lev)
[1] one   one   two   three
Levels: one two three

The techniques used to cast a variable to a factor or test whether a variable is a factor are similar to the previous examples. A variable can be cast as a factor using the as.factor command. Also, the is.factor command can be used to determine whether or not a variable has a type of factor.
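A short sketch of these commands follows. The responses vector is hypothetical data, and as.integer is used only to reveal the integer codes that R stores for the factor:

```r
responses <- c("b", "a", "c", "a")   # hypothetical trial results

trial <- as.factor(responses)
is.factor(trial)       # TRUE
is.factor(responses)   # FALSE, it is still a character vector
typeof(trial)          # "integer"
levels(trial)          # "a" "b" "c"
as.integer(trial)      # 2 1 3 1, the position of each value in the levels
```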

Continuous data types

The data types used to represent continuous data are given here. The two types discussed are the double and complex data types. A list of the commands discussed here is given in Table 2 and Table 3.

Double

The default numeric data type in R is a double precision number. The commands are similar to those of the integer data type discussed previously. The double command can be used to allocate a vector of double precision numbers, and the numbers within the vector are accessed using square brackets:

> d <- double(8)
> d
[1] 0 0 0 0 0 0 0 0
> typeof(d)
[1] "double"
> d[3] <- 17
> d
[1]  0  0 17  0  0  0  0  0

The techniques used to cast a variable to a double precision number and test whether a variable is a double precision number are similar to the examples seen previously. A variable can be cast as a double precision number using the as.double command. Also, to determine whether a variable is a double precision number, the is.double command can be used.
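For example, the test command is.double and the cast command as.double can be used as follows. This is a small sketch with hypothetical variable names:

```r
count <- as.integer(7)
is.double(count)        # FALSE, the value is stored as an integer

measured <- as.double(count)
is.double(measured)     # TRUE
is.double(7)            # TRUE, a plain number is a double by default
```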

Complex

Arithmetic for complex numbers is supported in R, and most math functions will react properly when given a complex number. You can append i to the end of a number to force it to be the imaginary part of a complex number, as follows:

> 1i
[1] 0+1i
> 1i*1i
[1] -1+0i
> z <- 3+2i
> z
[1] 3+2i
> z*z
[1] 5+12i
> Mod(z)
[1] 3.605551
> Re(z)
[1] 3
> Im(z)
[1] 2
> Arg(z)
[1] 0.5880026
> Conj(z)
[1] 3-2i

The complex command can also be used to define a vector of complex numbers. There are a number of options for the complex command, so a quick check of the help page, (help(complex)), is recommended:

> z <- complex(3)
> z
[1] 0+0i 0+0i 0+0i
> typeof(z)
[1] "complex"
> z <- complex(real=c(1,2),imag=c(3,4))
> z
[1] 1+3i 2+4i
> Re(z)
[1] 1 2

The techniques to cast a variable to a complex number and to test whether or not a variable is a complex number are similar to the methods seen previously. A variable can be cast as complex using the as.complex command. Also, to test whether or not a variable is a complex number, the is.complex command can be used.
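One situation where the cast matters is the square root of a negative number. In the following sketch the variable names are hypothetical, and suppressWarnings is used only to keep the warning from the real-valued square root out of the way:

```r
realValue <- -1
is.complex(realValue)                   # FALSE

complexValue <- as.complex(realValue)
is.complex(complexValue)                # TRUE

# sqrt() of a negative double is NaN (with a warning),
# but the same operation on a complex value is well defined
realRoot <- suppressWarnings(sqrt(realValue))   # NaN
complexRoot <- sqrt(complexValue)               # 0+1i
```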

Special data types

Two other special values occur frequently and are important: NA and NULL. We will discuss both and then provide a note about objects. These are brief comments, as these are recurring topics that we will revisit many times.

The first data type is a constant, NA. This is a type used to indicate a missing value. It is a constant in R, and a variable can be tested using the is.na command, as follows:

> n <- c(NA,2,3,NA,5)
> n
[1] NA  2  3 NA  5
> is.na(n)
[1]  TRUE FALSE FALSE  TRUE FALSE
> n[!is.na(n)]
[1] 2 3 5
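Missing values propagate through most calculations, so it is common to count or exclude them. The following sketch reuses the vector from the preceding example and relies on the na.rm argument of the mean command:

```r
n <- c(NA, 2, 3, NA, 5)

missingCount <- sum(is.na(n))         # 2 missing values
rawMean <- mean(n)                    # NA, because of the missing entries
cleanMean <- mean(n, na.rm = TRUE)    # 3.333333
```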

Another special type is the NULL type. It plays a role similar to NULL in the C language. Rather than holding a value, it is used to indicate that an object is empty or does not exist:

> a <- NULL
> typeof(a)
[1] "NULL"
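The is.null command tests for NULL, and a comparison with NA shows how the two differ. This is a minimal sketch:

```r
a <- NULL
is.null(a)                   # TRUE
length(a)                    # 0, NULL has no elements
length(NA)                   # 1, NA is a value that occupies a slot

combined <- c(1, NULL, 3)    # NULL contributes nothing
length(combined)             # 2
```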

Finally, we'll quickly explore the term objects. The variables that we defined in all of the preceding examples are treated as objects within the R environment. When we start writing functions and creating classes, it will be important to realize that they are treated like variables. The names used to assign variables are just a shortcut for R to determine where an object is located.

For example, the complex command is used to allocate a vector of complex values. The command is defined to be a set of instructions, and there is an object called complex that points to those instructions:

> complex
function (length.out = 0L, real = numeric(), imaginary = numeric(),
    modulus = 1, argument = 0)
{
    if (missing(modulus) && missing(argument)) {
        .Internal(complex(length.out, real, imaginary))
    }
    else {
        n <- max(length.out, length(argument), length(modulus))
        rep_len(modulus, n) * exp((0+1i) * rep_len(argument,
            n))
    }
}
<bytecode: 0x2489c80>
<environment: namespace:base>

There is a difference between calling the complex() function and referring to the set of instructions located at complex.
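Because a function is an object, it can be assigned to another name just like any other value. In the following sketch, makeComplex is a hypothetical second name for the same function:

```r
# No parentheses: this copies the reference to the function object
makeComplex <- complex

is.function(complex)               # TRUE
identical(makeComplex, complex)    # TRUE

# With parentheses: this calls the function
z <- makeComplex(real = 1, imaginary = 2)   # 1+2i
```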

Notes on the as and is functions

Two common tasks are to determine whether a variable is of a given type and to cast a variable to different types. The commands to determine whether a variable is of a given type generally start with the is prefix, and the commands to cast a variable to a different type generally start with the as prefix. The list of commands to determine whether a variable is of a given type is given in the following table:

Type to check    Command
Integer          is.integer
Logical          is.logical
Character        is.character
Factor           is.factor
Double           is.double
Complex          is.complex
NA               is.na
List             is.list

Table 2 – commands to determine whether a variable is of a particular type

The commands used to cast a variable to a different type are given in Table 3. These commands take a single argument and return a variable of the given type. For example, the as.character command can be used to convert a number to a string:

Type to convert to    Command
Integer               as.integer
Logical               as.logical
Character             as.character
Factor                as.factor
Double                as.double
Complex               as.complex
NA                    (none; assign NA directly)
List                  as.list

Table 3 – commands to cast a variable into a particular type
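A few of the casts in Table 3 applied to character data are sketched below. The variable names are hypothetical, and suppressWarnings is used only to silence the warning that R issues when a string cannot be converted:

```r
fromText <- as.integer("42")       # 42
realText <- as.double("3.5")       # 3.5
flagText <- as.logical("TRUE")     # TRUE
labels <- as.character(1:3)        # "1" "2" "3"

# Text that cannot be interpreted as a number becomes NA
failed <- suppressWarnings(as.integer("forty-two"))   # NA
```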

 

Summary


In this chapter, we examined some of the data types available in the R environment. These include discrete data types such as integers and factors. They also include continuous data types such as the double and complex data types. We also examined ways to test a variable to determine what type it is.

In the next chapter, we look at the data structures that can be used to keep track of data. This includes vectors and data types such as lists and data frames that can be constructed from vectors.

About the Author

  • Kelly Black

    Kelly Black is a faculty member in the Department of Mathematics at Clarkson University. His background is in numerical analysis with a focus on the use of spectral methods and stochastic differential equations. He makes extensive use of R in the analysis of the results of Monte-Carlo simulations.

    In addition to using R for his research interests, Kelly also uses the R environment for his statistics classes. He has extensive experience sharing his experiences with R in the classroom. The use of R to explore datasets is an important part of the curriculum.
