You're reading from  Learning R Programming

Product type: Book
Published in: Oct 2016
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781785889776
Edition: 1st Edition
Author: Kun Ren

Kun Ren has used R for nearly 4 years in quantitative trading, along with C++ and C#, and he has worked very intensively (more than 8-10 hours every day) on useful R packages that the community does not offer yet. He contributes to packages developed by other authors and reports issues to make things work better. He is also a frequent speaker at R conferences in China and has given multiple talks. Kun also has a great social media presence. Additionally, he has substantially contributed to various projects, which is evident from his GitHub account: https://github.com/renkun-ken https://cn.linkedin.com/in/kun-ren-76027530 http://renkun.me/ http://renkun.me/formattable/ http://renkun.me/pipeR/ http://renkun.me/rlist/

Chapter 2. Basic Objects

The first step of learning R programming is getting familiar with basic R objects and their behavior. In this chapter, you will learn the following topics:

  • Creating and subsetting atomic vectors (for example, numeric vectors, character vectors, and logical vectors), matrices, arrays, lists, and data frames

  • Defining and working with functions

"Everything that exists is an object. Everything that happens is a function." -- John Chambers

For example, in statistical analysis, we often feed a set of data to a linear regression model and obtain a group of linear coefficients.

R has several types of objects, and this workflow touches more than one of them: we provide a data frame object that holds the data, pass it to the linear model function, get back a list object containing the properties of the regression results, and finally extract from that list a numeric vector, which is yet another type of object, to represent the linear coefficients...
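
The workflow described above can be sketched in a few lines. The built-in cars dataset and the formula dist ~ speed are chosen here only for illustration; they are assumptions of this sketch and not part of the original example.

```r
# `cars` is a data frame that ships with R (columns `speed` and `dist`);
# it stands in for "a set of data" purely as an illustration.
fit <- lm(dist ~ speed, data = cars)

is.data.frame(cars)   # the input is a data frame object
is.function(lm)       # the model-fitting step is a function
is.list(fit)          # the fitted model is stored as a list

# Extract the linear coefficients from the list as a numeric vector
coefs <- fit$coefficients
is.numeric(coefs)
names(coefs)
```

Each step hands one type of object to a function and receives another type of object back.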

Vector


A vector is a group of primitive values of the same type. It can be a group of numbers, true/false values, text strings, or values of some other type. Vectors are one of the building blocks of all R objects.

There are several types of vectors in R. They differ in the type of elements they store. In the following sections, we will see the most commonly used types of vectors, including numeric vectors, logical vectors, and character vectors.
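
As a minimal sketch, the three vector types named above can be created with c() and inspected with class(). The variable names are hypothetical and used only for this illustration.

```r
num <- c(1.5, 2, 3.25)        # numeric vector
lgl <- c(TRUE, FALSE, TRUE)   # logical vector
chr <- c("a", "b", "c")       # character vector

class(num)
class(lgl)
class(chr)

# A vector holds values of one type only, so mixing types in c()
# coerces every element to a common type (character in this case).
mixed <- c(1, TRUE, "a")
class(mixed)
```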

Numeric vector

A numeric vector is a vector of numeric values. A scalar number is the simplest numeric vector. An example is shown as follows:

1.5
## [1] 1.5

A numeric vector is the most frequently used data type and is the foundation of nearly all kinds of data analysis. In other popular programming languages, there are some scalar types such as integer, double, and string, and these scalar types are the building blocks of the container types such as vectors. In R, however, there is no formal definition of scalar types. A scalar number...
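
A quick check, offered as a sketch, shows that R treats a single number as a numeric vector of length one rather than as a separate scalar type:

```r
x <- 1.5

is.numeric(x)          # a single number is numeric
is.vector(x)           # and it is a vector
length(x)              # of length one
identical(x, c(1.5))   # wrapping it in c() makes no difference
x[1]                   # so it can be subset like any other vector
```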

Matrix


A matrix is a vector represented and accessible in two dimensions. Therefore, most of what applies to vectors also applies to matrices. For example, each type of vector (for example, numeric or logical vectors) has its matrix version, that is, there are numeric matrices, logical matrices, and so on.
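
The following sketch illustrates the idea that a matrix is still a vector underneath. The names v, m, and lgl_m are hypothetical and used only here.

```r
v <- c(1, 2, 3, 4, 5, 6)
m <- matrix(v, nrow = 2)   # 2 rows, 3 columns, filled column by column

is.matrix(m)
length(m)   # still six elements underneath
m[2, 3]     # two-dimensional access
m[6]        # the same element by its position in the vector
m + 1       # vector-style arithmetic applies element by element

# A logical vector has a matrix version too
lgl_m <- matrix(c(TRUE, FALSE, FALSE, TRUE), nrow = 2)
is.logical(lgl_m)
```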

Creating a matrix

We can call matrix() to create a matrix from a vector by setting up one of its two dimensions:

matrix(c(1, 2, 3, 2, 3, 4, 3, 4, 5), ncol = 3)
##      [,1] [,2] [,3]
## [1,]   1    2    3
## [2,]   2    3    4
## [3,]   3    4    5

By specifying ncol = 3, we mean that the provided vector should be regarded as a matrix with 3 columns (and therefore 3 rows, since it has 9 elements). The flat vector is harder to read than the matrix it produces. To make the code more readable, we can write the vector in multiple lines:

matrix(c(1, 2, 3,
  4, 5, 6,
  7, 8, 9), nrow = 3, byrow = FALSE)
##     [,1] [,2] [,3]
## [1,]  1    4    7
## [2,]  2    5    8
## [3,] ...
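
To see what the byrow argument changes, the following sketch builds the same nine values both ways. The names v, by_col, and by_row are hypothetical.

```r
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

by_col <- matrix(v, nrow = 3, byrow = FALSE)  # fill column by column (the default)
by_row <- matrix(v, nrow = 3, byrow = TRUE)   # fill row by row

by_col[1, ]   # first row when filled by column
by_row[1, ]   # first row when filled by row

# Filling by row gives the transpose of filling by column
all(by_row == t(by_col))
```

With byrow = TRUE, the matrix matches the way the vector is laid out across lines in the source code.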

Array


An array is a natural extension of a matrix to more dimensions. More specifically, an array is a vector that is represented and accessible in a given number of dimensions (usually more than two).

If you are already familiar with vectors and matrices, you won't be surprised to see how arrays behave.

Creating an array

To create an array, we call array() and supply a vector of data, the way this data is arranged across dimensions, and, optionally, names for the rows and columns of these dimensions.

Suppose we have some data (10 integers from 0 to 9) and we need to arrange them in three dimensions: 1 for the first dimension, 5 for the second, and 2 for the third:

a1 <- array(c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), dim = c(1, 5, 2))
a1
## , , 1
## 
##     [,1] [,2] [,3] [,4] [,5]
## [1,]  0    1    2    3    4
## 
## , , 2
## 
##     [,1] [,2] [,3] [,4] [,5]
## [1,]  5    6    7    8    9

We can clearly see how we can access these entries by looking at the notations...
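
As a sketch of how the printed notation maps to subsetting, the same a1 array can be indexed with one position per dimension:

```r
a1 <- array(c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), dim = c(1, 5, 2))

dim(a1)        # sizes of the three dimensions
a1[1, 3, 2]    # row 1, column 3 of the second slice
a1[1, , 2]     # leaving an index empty selects everything along that dimension
a1[8]          # the underlying vector position of a1[1, 3, 2]
```

The header , , 2 in the printed output corresponds to fixing the third index at 2 and leaving the first two empty.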

Lists


A list is a generic vector that is allowed to include different types of objects, even other lists.

It is useful for its flexibility. For example, the result of a linear model fit in R is basically a list object that contains rich results of a linear regression such as linear coefficients (numeric vectors), residuals (numeric vectors), QR decomposition (a list containing a matrix and other objects), and so on.

Because these results are all packed into one list, it is easy to extract the information we need without calling a different function each time.

Creating a list

We can use list() to create a list, as the function name suggests. Different types of objects can be put into one list. For example, the following code creates a list that contains a single-element numeric vector, a two-entry logical vector, and a character vector of three values:

l0 <- list(1, c(TRUE, FALSE), c("a", "b", "c"))
l0
## [[1]]
## [1] 1
## 
## [[2]]
## [1] TRUE FALSE
## 
## [[3]]
## [1] "a" "b" "c"
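
The following sketch inspects the same list to confirm that each element keeps its own type, and that a list can hold another list. The name l1 is hypothetical.

```r
l0 <- list(1, c(TRUE, FALSE), c("a", "b", "c"))

length(l0)          # three elements
sapply(l0, class)   # each element keeps its own type

l0[[2]]             # double brackets extract the element itself
l0[2]               # single brackets return a list containing that element

# A list can contain other lists
l1 <- list(l0, "extra")
is.list(l1[[1]])
```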

We can assign...

Data frames


A data frame represents a set of data with a number of rows and columns. It looks like a matrix but its columns are not necessarily of the same type. This is consistent with the most commonly seen formats of datasets: each row, or data record, is described by multiple columns of various types.

The following table is an example that can be fully characterized by a data frame.

Name      Gender  Age  Major
Ken       Male    24   Finance
Ashley    Female  25   Statistics
Jennifer  Female  23   Computer Science

Creating a data frame

To create a data frame, we can call data.frame() and supply the data of each column as a vector of the corresponding type:

persons <- data.frame(Name = c("Ken", "Ashley", "Jennifer"),
  Gender = c("Male", "Female", "Female"),
  Age = c(24, 25, 23),
  Major = c("Finance", "Statistics", "Computer Science"))
persons
##   Name     Gender  Age  Major
## 1 Ken      Male    24   Finance
## 2 Ashley   Female  25   Statistics
## 3 Jennifer Female  23   Computer Science...
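
As a sketch, the structure of this data frame can be checked column by column. Setting stringsAsFactors = FALSE is an addition made here so that text columns stay character vectors regardless of the R version; it is not part of the original call.

```r
persons <- data.frame(
  Name = c("Ken", "Ashley", "Jennifer"),
  Gender = c("Male", "Female", "Female"),
  Age = c(24, 25, 23),
  Major = c("Finance", "Statistics", "Computer Science"),
  stringsAsFactors = FALSE  # assumption: keep text columns as character
)

dim(persons)             # 3 rows, 4 columns
sapply(persons, class)   # columns do not have to share a type
is.list(persons)         # a data frame is stored as a list of columns
persons$Age              # each column is an ordinary vector
```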

Functions


A function is an object you can call. Basically, it is a machine with internal logic that takes a group of inputs (parameters or arguments) and returns a value as output.

In the previous sections, we encountered some built-in functions of R. For example, is.numeric() takes an argument that can be any R object and returns a logical value that indicates whether the object is a numeric vector. Similarly, is.function() can tell whether a given R object is a function object.
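
The two built-in functions mentioned above can be tried directly. The function add() below is a hypothetical example defined only for this sketch.

```r
is.numeric(c(1, 2, 3))    # a numeric vector
is.numeric("a")           # a character vector is not numeric
is.function(is.numeric)   # is.numeric is itself a function object
is.function(1.5)          # a number is not a function

# A hypothetical user-defined function: two inputs, one output
add <- function(x, y) {
  x + y
}

add(2, 3)
is.function(add)
```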

In fact, in the R environment, everything we use is an object, everything we do is a function call, and, maybe to your surprise, functions are themselves objects. Even <- and + are functions that take two arguments. Although they are called binary operators, they are essentially functions.
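
A short sketch makes this concrete: wrapping an operator in backticks lets us call it like any other function. The names x and ops are hypothetical.

```r
`+`(1, 2)           # the same as 1 + 2
`<-`(x, 10)         # the same as x <- 10
x

is.function(`+`)
is.function(`<-`)

# Because functions are objects, they can be passed to other functions
sapply(c(1, 2, 3), `+`, 10)

# ... or stored in a list like any other value
ops <- list(plus = `+`, minus = `-`)
ops$minus(5, 2)
```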

When we do casual, interactive data analysis, at times, we won't have to write any function on our own since the built-in functions and those provided by thousands of packages are usually enough.

However, if you need to repeat your...

Summary


In this chapter, you learned the basic behaviors of numeric vectors, logical vectors, and character vectors. These vectors are homogeneous data types that can only store elements of the same type. By contrast, lists and data frames are more flexible since they can store elements of different types. You learned how to subset these data structures and extract an element from them. Finally, you learned about creating and calling functions.

Now that you know the rules of the game, you need to get familiar with the playground. In the next chapter, we will cover some basic yet important things about managing the workspace. I will show you some common practices for managing the working directory, the environment, and the library of packages.
