Learning Quantitative Finance with R

By Dr. Param Jeet, Prashant Vats
About this book
The role of a quantitative analyst is very challenging, yet lucrative, so there is a lot of competition for the role in top-tier organizations and investment banks. This book is your go-to resource if you want to equip yourself with the skills required to tackle any real-world problem in quantitative finance using the popular R programming language. You'll start by getting an understanding of the basics of R and its relevance in the field of quantitative finance. Once you've built this foundation, we'll dive into the practicalities of building financial models in R. This will help you have a fair understanding of the topics as well as their implementation, as the authors have presented some use cases along with examples that are easy to understand and correlate. We'll also look at risk management and optimization techniques for algorithmic trading. Finally, the book will explain some advanced concepts, such as trading using machine learning, optimizations, exotic options, and hedging. By the end of this book, you will have a firm grasp of the techniques required to implement basic quantitative finance models in R.
Publication date: March 2017
Publisher: Packt
Pages: 284
ISBN: 9781786462411

 

Chapter 1.  Introduction to R

In this chapter, we discuss basic R concepts, which serve as the background for the upcoming chapters. We are not going to cover every R concept in detail. This chapter is meant for readers who have no knowledge of the R language, or for beginners who want to pursue a career in quantitative finance or use R for quantitative financial analysis. It gives you a start in learning how to write programs in R; for writing complex programs, you can explore other books.

This chapter covers the following topics:

  • The need for R

  • How to download/install R

  • How to install packages

  • Data types

  • Import and export of different data types

  • How to write code expressions

  • Functions

  • How to execute R programs

  • Loops (for, while, if, and if...else)

 

The need for R


Many statistical packages can be used for solving problems in quantitative finance. R, however, is not just a statistical package; it is a language. R is a flexible and powerful language for achieving high-quality analysis.

To use R, one does not need to be a programmer or computer-subject expert. The knowledge of basic programming definitely helps in learning R, but it is not a prerequisite for getting started with R.

One of the strengths of R is its package system. It is vast. If a statistical concept exists, chances are that there is already a package for it in R. There exist many functionalities that come built in for statistics / quantitative finance.

R is extendable and provides plenty of functionalities which encourage developers in quant finance to write their own tools or methods to solve their analytical problems.

The graphing and charting facilities present in R are unparalleled. R has a strong relationship with academia. As new research gets published, the likelihood is that a package for the new research gets added, due to its open source nature, which keeps R updated with the new concepts emerging in quant finance.

R was designed to deal with data, but when it came into existence, big data was nowhere in the picture. The additional challenges of dealing with big data are the variety of data (text data, metric data, and so on), data security, memory, CPU, and I/O resource requirements, multiple machines, and so on. Techniques such as MapReduce, in-memory processing, streaming data processing, down-sampling, chunking, and so on are being used to handle the challenges of big data in R.

Furthermore, R is free software. The development community is fantastic and easy to approach, and they are always interested in developing new packages for new concepts. There is a lot of documentation available on the Internet for different packages of R.

Thus, R is a cost-effective, easy-to-learn tool. It has very good data handling, graphical, and charting capabilities. It is a cutting-edge tool as, due to its open nature, new concepts in finance are generally accompanied by new R packages. Learning R is the need of the hour for people pursuing a career in quantitative finance.

 

How to download/install R


In this section, we are going to discuss how to download and install R for various platforms: Windows, Linux, and Mac.

Open your web browser and go to the following link: https://cran.rstudio.com/.

From the given link, you can download the required version for your operating system.

For the Windows version, click on Download R for Windows, select the base version, and click on Download R 3.3.1 for Windows. Run the downloaded installer and select your preferred language. The installer will then take you through various options, such as the following:

  1. Setup Wizard.

  2. License Agreement.

  3. Select folder location where you want to install.

  4. Select the component. Select the option according to the configuration of your system; if you do not know the configuration of your system, then select all the options.

  5. If you want to customize your setup, select the option.

  6. Select the R launch options and desktop shortcut options according to your requirements.

R download and installation is complete for Windows.

Similarly, for Linux and Mac, run the respective installer and it will take you through the various installation options.

 

How to install packages


An R package is a combination of R functions, compiled code, and sample data; the directory in which packages are stored is known as a library. By default, when R is installed, a set of packages gets installed with it; the rest have to be added when required.

The following commands check which packages are present on your system:

>.libPaths()

The preceding command is used for getting or setting the library trees that R knows about. It gives the following result:

"C:/Program Files/R/R-3.3.1/library"

After this, execute the following command and it will list all the available packages:

>library()

There are two ways to install new packages.

Installing directly from CRAN

CRAN stands for Comprehensive R Archive Network. It is a network of FTP and web servers around the globe that store identical, up-to-date versions of code and documentation for R.

The following command is used to install the package directly from the CRAN web page. You need to choose the appropriate mirror:

>install.packages("Package")

For example, if you need to install the ggplot2 or forecast package for R, the commands are as follows:

>install.packages("ggplot2")
>install.packages("forecast")
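When a script depends on several packages, it can help to check which of them are missing before calling install.packages(). The following is a minimal sketch of that idea; the helper name missing_packages is an assumption made for illustration and is not part of R:

```r
# Hypothetical helper: return the names of packages that are not installed.
missing_packages <- function(pkgs) {
  # requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# "stats" and "utils" ship with R, so nothing should be reported as missing
to_install <- missing_packages(c("stats", "utils"))
print(to_install)

# Install only what is missing (needs a CRAN mirror and network access):
# if (length(to_install) > 0) install.packages(to_install)
```

The install line is left commented out so that the sketch can be run without changing the system.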

Installing packages manually

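Download the required R package manually and save the ZIP version at your designated location (let's say /data/Rpackages/) on the system.

For example, if we want to install ggplot2, then run the following commands to install it into that location and load it into the current R environment. Here, the lib argument tells install.packages() which library directory to install into, and lib.loc tells library() where to find the package. To install from the downloaded ZIP file itself, pass the path of the file as the first argument together with repos = NULL. Other packages can be installed similarly: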

>install.packages("ggplot2", lib="/data/Rpackages/")
>library(ggplot2, lib.loc="/data/Rpackages/")
 

Data types


In any programming language, one needs to store various pieces of information in variables. Variables are reserved memory locations for storing values, so by creating a variable, one is reserving some space in memory. You may want to store various types of data, such as character, floating point, Boolean, and so on. On the basis of the data type, the operating system allocates memory and decides what can be stored in the reserved memory.

All the things you encounter in R are called objects.

R has five types of basic objects, also known as atomic objects, and the rest of the objects are built on these atomic objects. Now we will give an example of all the basic objects and will verify their class:

  • Character:

    We assign a character value to a variable and verify its class:

            >a <- "hello"
            >print(class(a))
    

    The result produced is as follows:

            [1] "character"
    
  • Numeric:

    We assign a numeric value to a variable and verify its class:

            >a <- 2.5
            >print(class(a))
    

    The result produced is as follows:

            [1] "numeric"
    
  • Integer:

    We assign an integer value to a variable and verify its class:

            >a <- 6L
            >print(class(a))
    

    The result produced is as follows:

            [1] "integer"
    
  • Complex:

    We assign a complex value to a variable and verify its class:

            >a <- 1 + 2i
            >print(class(a))
    

    The result produced is as follows:

            [1] "complex"
    
  • Logical (TRUE/FALSE):

    We assign a logical value to a variable and verify its class:

            >a <- TRUE
            >print(class(a))
    

    The result produced is as follows:

            [1] "logical"
    

The basic objects in R are known as vectors, and a vector consists of elements of the same type. A vector cannot hold two different types of objects at the same time, such as both character and numeric values.

A list is the exception: it can contain multiple classes of objects at the same time. So a list can simultaneously contain a character, a numeric, and a list.
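The difference can be seen by passing the same mixed values to c() and to list(). This is a small sketch; the variable names are chosen only for illustration:

```r
# c() coerces mixed inputs to a single common type
mixed_vector <- c(1, "a", TRUE)
print(mixed_vector)          # every element is now a character string
print(class(mixed_vector))

# list() keeps each element's own type
mixed_list <- list(1, "a", TRUE)
print(sapply(mixed_list, class))
```

Because of this coercion, a stray character value in numeric data silently turns the whole vector into text, which is worth checking with class() after building a vector.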

Now we will discuss the common data types present in R and give at least one example for each data type discussed here.

Vectors

Vectors have already been defined. If we want to construct a vector with more than one element, we can use the c() function which combines the elements into a vector, for example:

>a<-"Quantitative" 
>b<-"Finance" 
>c(a,b) 

This produces the following result:

[1] "Quantitative" "Finance"   

Similarly:

>Var<-c(1,2,3) 
>Var 

This produces the following result:

[1] 1 2 3 

Lists

A list is an R object that consists of multiple types of objects inside it, such as vectors and even lists. For example, let's construct a list and print it using code:

#Create a List and print it 
>List1 = list(c(4,5,6),"Hello", 24.5) 
>print(List1) 

When we execute the previous command, it produces the following result:

[[1]] 
[1] 4 5 6 
   
[[2]] 
[1] "Hello" 
 
[[3]] 
[1] 24.5 

We can extract the individual elements of the list according to our requirements.

For example, in the preceding case, if we want to extract the second element:

>print(List1[2]) 

Upon executing the preceding code, R creates the following output:

[[1]] 
[1] "Hello" 
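Note that single brackets return a list, which is why the output above is still wrapped in [[1]]. To get the element itself, double brackets can be used. A short sketch, reusing the list from the preceding example:

```r
List1 <- list(c(4, 5, 6), "Hello", 24.5)

sub_list <- List1[2]    # single brackets: a list of length one
element  <- List1[[2]]  # double brackets: the element itself

print(class(sub_list))
print(class(element))
print(List1[[1]][3])    # third value of the vector stored as the first element
```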

One can merge the two lists using the function c(); for example:

>list1 <- list(5,6,7) 
>list2 <- list("a","b","c") 
>Combined_list <-c(list1,list2) 
>print(Combined_list) 

Upon executing the preceding command, we get the combined list:

[[1]] 
[1] 5 
 
[[2]] 
[1] 6 
 
[[3]] 
[1] 7 
 
[[4]] 
[1] "a" 
 
[[5]] 
[1] "b" 
 
[[6]] 
[1] "c" 

Matrices

A matrix is a two-dimensional rectangular dataset, and it is created by vector input to the matrix() function.

For example, create a matrix with two rows and three columns, and print it:

>M <- matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3) 
>print(M) 

When we execute the preceding code, it produces the following result:

     [,1] [,2] [,3] 
[1,]    1    3    5 
[2,]    2    4    6 
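As the output shows, matrix() fills the values column by column by default. If the input should be laid out row by row instead, the byrow argument can be set to TRUE. The following sketch compares the two layouts; the variable names are illustrative:

```r
M_by_col <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)
M_by_row <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3, byrow = TRUE)

print(M_by_row)
print(dim(M_by_row))    # number of rows and columns

# The element in row 1, column 2 differs between the two layouts
print(M_by_col[1, 2])
print(M_by_row[1, 2])
```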

Arrays

Matrices are confined to only two dimensions, but arrays can be of any dimension. The array() function takes a dim attribute, which creates the needed dimensions.

For example, create an array and print it:

>a <- array(c(4,5),dim = c(3,3,2)) 
>print(a) 

When we execute the previous code, it produces the following result:

, , 1 
     [,1] [,2] [,3] 
[1,]    4    5    4 
[2,]    5    4    5 
[3,]    4    5    4 
 
, , 2 
 
     [,1] [,2] [,3] 
[1,]    5    4    5 
[2,]    4    5    4 
[3,]    5    4    5 

Factors

Factors are R objects that are created using a vector. A factor stores the vector along with the distinct elements present in the vector as labels. Labels are always in character form, irrespective of whether the input is numeric, character, or Boolean.

Factors are created using the factor() function, and the count of levels is given by nlevels(); for example:

>a <-c(2,3,4,2,3) 
>fact <-factor(a) 
>print(fact) 
>print(nlevels(fact)) 

When the preceding code gets executed, it generates the following results:

[1] 2 3 4 2 3 
Levels: 2 3 4 
[1] 3 
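Because the labels are stored as characters, converting a factor back to numbers needs some care. The sketch below, which reuses the vector from the preceding example, shows that as.integer() returns the internal level codes rather than the original values:

```r
a <- c(2, 3, 4, 2, 3)
fact <- factor(a)

print(levels(fact))                      # labels, stored as character strings
print(as.integer(fact))                  # internal level codes, not the original values
print(as.numeric(as.character(fact)))    # recovers the original numbers
print(table(fact))                       # count of observations per level
```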

DataFrames

DataFrames are tabular-form data objects where each column can be of a different type, that is, numeric, character, or logical. A DataFrame is a list of vectors of the same length, one vector per column.

DataFrames are generated using the function data.frame(); for example:

>data <-data.frame( 
+  Name = c("Alex", "John", "Bob"), 
+  Age = c(18,20,23), 
+  Gender =c("M","M","M") 
+) 
>print(data) 

When the preceding code gets executed, it generates the following result:

  Name Age Gender 
1 Alex  18      M 
2 John  20      M 
3  Bob  23      M 
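Once a DataFrame exists, individual columns can be accessed with $ and rows can be filtered with a logical condition. A brief sketch using the same data follows; the name older is chosen only for illustration:

```r
data <- data.frame(
  Name   = c("Alex", "John", "Bob"),
  Age    = c(18, 20, 23),
  Gender = c("M", "M", "M")
)

print(data$Age)                  # one column, returned as a vector
print(mean(data$Age))            # summary statistic on a column

older <- data[data$Age > 18, ]   # keep only rows where Age exceeds 18
print(older)

str(data)                        # compact overview of column types
```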
 

Importing and exporting different data types


In R, we can read files stored outside the R environment. We can also write data into files that are stored and accessed by the operating system. R can read and write different file formats, such as CSV, Excel, TXT, and so on. In this section, we are going to discuss how to read and write different formats of files.

The required files should be present in the current directory to read them. Otherwise, the directory should be changed to the required destination.

The first step for reading/writing files is to know the working directory. You can find the path of the working directory by running the following code:

>print (getwd()) 

This will give the path of the current working directory. If it is not your desired directory, then please set your own desired directory by passing its path to setwd():

>setwd("path/to/your/directory") 

For instance, the following code makes the folder C:/Users the working directory:

>setwd("C:/Users") 
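setwd() returns the previous working directory, which makes it easy to switch back afterwards. The following sketch assumes a temporary directory as the target, purely so that it runs on any system:

```r
# Switch to a temporary directory and remember where we came from
old_wd <- setwd(tempdir())
print(getwd())

# Build a file path relative to the current working directory
sample_path <- file.path(getwd(), "Sample.csv")
print(sample_path)

# Restore the original working directory
setwd(old_wd)
```

Using file.path() instead of pasting strings together keeps the path separators correct across operating systems.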

How to read and write a CSV format file

A CSV format file is a text file in which values are comma separated. Let us consider a CSV file with the following content from stock-market data:

Date        Open     High     Low      Close    Volume    Adj Close
14-10-2016  2139.68  2149.19  2132.98  2132.98  3.23E+09  2132.98
13-10-2016  2130.26  2138.19  2114.72  2132.55  3.58E+09  2132.55
12-10-2016  2137.67  2145.36  2132.77  2139.18  2.98E+09  2139.18
11-10-2016  2161.35  2161.56  2128.84  2136.73  3.44E+09  2136.73
10-10-2016  2160.39  2169.6   2160.39  2163.66  2.92E+09  2163.66

To read the preceding file in R, first save this file in the working directory, and then read it (the name of the file is Sample.csv) using the following code:

>data<-read.csv("Sample.csv") 
>print(data) 

When the preceding code gets executed, it will give the following output:

       Date    Open    High     Low   Close     Volume     Adj.Close 
1  14-10-2016 2139.68 2149.19 2132.98 2132.98 3228150000   2132.98 
2  13-10-2016 2130.26 2138.19 2114.72 2132.55 3580450000   2132.55 
3  12-10-2016 2137.67 2145.36 2132.77 2139.18 2977100000   2139.18 
4  11-10-2016 2161.35 2161.56 2128.84 2136.73 3438270000   2136.73 
5  10-10-2016 2160.39 2169.60 2160.39 2163.66 2916550000   2163.66 

By default, read.csv() returns the contents of the file as a DataFrame; this can be checked by running the following code:

>print(is.data.frame(data)) 

Now, whatever analysis you want to do, you can perform it by applying various functions on the DataFrame in R, and once you have done the analysis, you can write your desired output file using the following code:

>write.csv(data,"result.csv") 
>output <- read.csv("result.csv") 
>print(output) 

When the preceding code gets executed, it writes the output file in the working directory folder in CSV format.
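Two details are worth knowing here. By default, write.csv() also writes the row names, which show up as an extra column when the file is read back, and dates are read as text unless they are converted. The following sketch works on a two-row extract of the sample data and writes to a temporary directory; both choices are assumptions made to keep the example self-contained:

```r
# Two rows of the sample data, kept small for illustration
sample_data <- data.frame(
  Date  = c("14-10-2016", "13-10-2016"),
  Close = c(2132.98, 2132.55)
)

csv_path <- file.path(tempdir(), "result.csv")

# row.names = FALSE avoids writing the row numbers as an extra column
write.csv(sample_data, csv_path, row.names = FALSE)

output <- read.csv(csv_path, stringsAsFactors = FALSE)

# Convert the day-month-year text into proper Date values
output$Date <- as.Date(output$Date, format = "%d-%m-%Y")
str(output)
```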

XLSX

Excel is one of the most common file formats for storing data; Excel files end with the extension .xls or .xlsx.

The xlsx package will be used to read or write .xlsx files in the R environment.

Installing the xlsx package has dependency on Java, so Java needs to be installed on the system. The xlsx package can be installed using the following command:

>install.packages("xlsx")

When the previous command gets executed, it will ask for the nearest CRAN mirror, which the user has to select to install the package. We can verify whether the package has been installed by executing the following command:

>any(grepl("xlsx",installed.packages()))

If it has been installed successfully, it will show the following output:

[1] TRUE

We can load the xlsx library by running the following script, which also loads the packages it depends on:

>library("xlsx") 
Loading required package: rJava
Loading required package: methods
Loading required package: xlsxjars

Now let us save the previous sample file in .xlsx format and read it in the R environment, which can be done by executing the following code:

>data <- read.xlsx("Sample.xlsx", sheetIndex = 1) 
>print(data) 

This gives a DataFrame output with the following content:

       Date    Open    High     Low   Close     Volume    Adj.Close 
1 2016-10-14 2139.68 2149.19 2132.98 2132.98 3228150000   2132.98 
2 2016-10-13 2130.26 2138.19 2114.72 2132.55 3580450000   2132.55 
3 2016-10-12 2137.67 2145.36 2132.77 2139.18 2977100000   2139.18 
4 2016-10-11 2161.35 2161.56 2128.84 2136.73 3438270000   2136.73 
5 2016-10-10 2160.39 2169.60 2160.39 2163.66 2916550000   2163.66 

Similarly, you can write data from R to a file in .xlsx format and read it back by executing the following code:

>write.xlsx(data,"result.xlsx") 
>output<- read.xlsx("result.xlsx", sheetIndex = 1) 
>print(output) 

Web data or online sources of data

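The Web is one of the main sources of data these days, and we often want to bring data directly from the Web into the R environment. R supports this, as read.csv() also accepts a URL. The Yahoo Finance address used here was valid when this was written; the service may no longer respond at this address:

>URL <- "http://ichart.finance.yahoo.com/table.csv?s=^GSPC" 
>snp <- as.data.frame(read.csv(URL)) 
>head(snp) 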

When the preceding code is executed, it directly brings the data for the S&P500 index into R in DataFrame format. A portion of the data has been displayed by using the head() function here:

        Date    Open    High     Low   Close     Volume   Adj.Close 
1 2016-10-14 2139.68 2149.19 2132.98 2132.98 3228150000   2132.98 
2 2016-10-13 2130.26 2138.19 2114.72 2132.55 3580450000   2132.55 
3 2016-10-12 2137.67 2145.36 2132.77 2139.18 2977100000   2139.18 
4 2016-10-11 2161.35 2161.56 2128.84 2136.73 3438270000   2136.73 
5 2016-10-10 2160.39 2169.60 2160.39 2163.66 2916550000   2163.66 
6 2016-10-07 2164.19 2165.86 2144.85 2153.74 3619890000   2153.74 

Similarly, if we execute the following code, it brings the DJI index data into the R environment; a sample of it is displayed here:

>URL <- "http://ichart.finance.yahoo.com/table.csv?s=^DJI" 
>dji <- as.data.frame(read.csv(URL)) 
>head(dji) 

This gives the following output:

        Date     Open     High      Low    Close   Volume  Adj.Close 
1 2016-10-14 18177.35 18261.11 18138.38 18138.38 87050000  18138.38 
2 2016-10-13 18088.32 18137.70 17959.95 18098.94 83160000  18098.94 
3 2016-10-12 18132.63 18193.96 18082.09 18144.20 72230000  18144.20 
4 2016-10-11 18308.43 18312.33 18061.96 18128.66 88610000  18128.66 
5 2016-10-10 18282.95 18399.96 18282.95 18329.04 72110000  18329.04 
6 2016-10-07 18295.35 18319.73 18149.35 18240.49 82680000  18240.49 

Please note that we will mostly be using the S&P 500 and DJI indexes for example illustrations in the rest of the book, and these will be referred to as snp and dji.
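Since an online source can be unreachable or can change its address, it may be useful to guard the download so that a failure does not stop the whole script. The following is a sketch only; the helper name safe_read_csv is hypothetical, and a non-existent local file stands in for an unreachable URL so that the example runs without network access:

```r
# Hypothetical helper: read a CSV from a URL or file path, returning NULL on failure
safe_read_csv <- function(location) {
  tryCatch(
    suppressWarnings(read.csv(location)),
    error = function(e) {
      message("Could not read ", location, ": ", conditionMessage(e))
      NULL
    }
  )
}

# A path that does not exist stands in for an unreachable URL here
result <- safe_read_csv(file.path(tempdir(), "no_such_file.csv"))
print(is.null(result))
```

The caller can then test the result with is.null() and fall back to a locally saved copy of the data.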

Databases

A relational database stores data in normalized format, and to perform statistical analysis directly on it, we need to write complex and advanced queries. R, however, can easily connect to various relational databases, such as MySQL, Oracle, and SQL Server, and convert the data tables into DataFrames. Once the data is in DataFrame format, statistical analysis is easy to perform using all the available functions and packages.

In this section, we will take the example of MySQL as reference.

R has a package, RMySQL, which provides connectivity with the database; it is not part of the default installation and can be installed using the following command:

>install.packages("RMySQL")

Once the package is installed and loaded, we can create a connection object to connect to the database. It takes the username, password, database name, and host name as input. We can give our inputs and use the following commands to connect to the required database:

>library(RMySQL)
>mysqlconnection = dbConnect(MySQL(), user = '...', password = '...', dbname = '..',host = '.....')

When the database is connected, we can list the tables present in the database by executing the following command:

>dbListTables(mysqlconnection)

We can query the database using the function dbSendQuery(), and the result is returned to R by using the function fetch(). The output is then stored in DataFrame format:

>result = dbSendQuery(mysqlconnection, "select * from <table name>") 
>result_data = fetch(result) 
>print(result_data) 

When the previous code gets executed, it returns the required output.

We can query with a filter clause, update rows in database tables, insert data into a database table, create tables, drop tables, and so on by sending queries through dbSendQuery().

 

How to write code expressions


In this section, we will discuss how to write various basic expressions which are the core elements of writing a program. Later, we will discuss how to create user-defined functions.

Expressions

R code consists of one or more expressions. An expression is an instruction to perform a particular task.

For example, the addition of two numbers is given by the following expression:

>4+5 

It gives the following output:

[1] 9 

If there is more than one expression in a program, they get executed one by one, in the sequence they appear.

Now we will discuss basic types of expressions.

Constant expression

The simplest forms of expression are constant values, which may be character or numeric values.

For example, 100 is a numeric constant expression.

"Hello World" is a character constant expression.

Arithmetic expression

The R language has standard arithmetic operators and using these, arithmetic expressions can be written.

R has the following arithmetic operators:

Operator  Operation
+         Addition
-         Subtraction
*         Multiplication
/         Division
^         Exponentiation

Using these arithmetic operations, one can generate arithmetic expressions; for example:

4+5 
4-5 
4*5 

R follows the BODMAS rule. One can use parentheses to avoid ambiguity in creating any arithmetic expression.
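A few expressions illustrate the order of evaluation. The sketch also uses the modulus operator %% and the integer division operator %/%, which are not listed in the table above but are part of base R:

```r
print(2 + 3 * 4)     # multiplication is done before addition: 14
print((2 + 3) * 4)   # parentheses change the order: 20
print(2 ^ 3 ^ 2)     # exponentiation groups from the right: 2^(3^2) = 512
print(-2 ^ 2)        # exponentiation binds tighter than unary minus: -4
print(7 %% 3)        # remainder of the division: 1
print(7 %/% 3)       # integer part of the division: 2
```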

Conditional expression

A conditional expression compares two values and returns a logical value, either TRUE or FALSE.

R has standard operators for comparing values and operators for combining conditions:

Operator  Operation
==        Equality
>(>=)     Greater than (greater than or equal to)
<(<=)     Less than (less than or equal to)
!=        Inequality
&&        Logical AND
||        Logical OR
!         Logical NOT

For example:

10>5, when executed, returns TRUE.

5>10, when executed, returns FALSE.
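Conditions can be combined as in the following sketch. The operators && and || work on single values, whereas the element-wise operator &, which is not listed in the table above, compares whole vectors position by position:

```r
x <- 7

print(x > 5 && x < 10)   # both conditions hold: TRUE
print(x < 5 || x == 7)   # the second condition holds: TRUE
print(!(x > 5))          # negation of a true condition: FALSE

# Element-wise comparison across a vector
values <- c(3, 7, 12)
in_range <- values > 5 & values < 10
print(in_range)          # FALSE TRUE FALSE
```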

Functional call expression

The most common and useful type of R expression is calling functions. There are a lot of built-in functions in R, and users can build their own functions. In this section, we will see the basic structure of calling a function.

A function call consists of a function name followed by parentheses. Within the parentheses, arguments are present, separated by commas. Arguments are expressions that provide the necessary information to the functions to perform the required tasks. An example will be provided when we discuss how to construct user-defined functions.
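In the meantime, calls to the built-in functions round() and seq() show how arguments can be passed either by position or by name. This is a small sketch with illustrative variable names:

```r
# Arguments matched by position
rounded_by_position <- round(3.14159, 2)

# The same call with the second argument matched by name
rounded_by_name <- round(3.14159, digits = 2)
print(rounded_by_name)

# Named arguments can be given in any order
steps_a <- seq(from = 1, to = 10, by = 3)
steps_b <- seq(by = 3, to = 10, from = 1)
print(steps_a)
```

Naming the arguments makes a call longer but easier to read, especially for functions with many parameters.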

Symbols and assignments

R code consists of keywords and symbols.

A symbol is the label for an object stored in RAM, and it gets the stored value from the memory when the program gets executed.

R also stores many predefined values under predefined symbols, which are loaded automatically and can be used in a program as required.

For example, the date() function returns the current date and time when executed.

The result of an expression can be assigned to a symbol by using the assignment operator <-.

For example, the expression value <- 4+6 assigns the result 10 to the symbol value, and it is stored in memory.

Keywords

Some symbols are used to represent special values and cannot be reassigned:

  • NA: This is used to define missing or unknown values

  • Inf: This is used to represent infinity. For example, 1/0 produces Inf

  • NaN: This is used to define the result of arithmetic expression which is undefined. For example, 0/0 produces NaN

  • NULL: This is used to represent empty result

  • TRUE and FALSE: These are logical values and are generally generated when values are compared
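These special values can be produced and tested as in the following sketch. The prices vector is made-up data, used only to show how a missing value affects a calculation:

```r
print(1 / 0)            # Inf
print(0 / 0)            # NaN
print(is.na(NA))        # TRUE
print(is.nan(0 / 0))    # TRUE
print(is.null(NULL))    # TRUE

# A missing value propagates through calculations unless it is removed
prices <- c(10.5, NA, 11.2)
print(mean(prices))                  # NA
print(mean(prices, na.rm = TRUE))    # mean of the two known values
```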

Naming variables

When writing R code, we need to store various pieces of information under many symbols. So we need to name these symbols meaningfully, as that makes the code easy to understand. Symbols should be self-explanatory. Overly short symbol names make the code harder to understand.

For example, if we represent date of birth information by DateOfBirth or DOB, then the first option is better as it is self-explanatory.

 

Functions


In this section, we will provide some examples of built-in functions that already exist in R and also construct a user-defined function for a specific task.

A function is a collection of statements put together to do a specific task.

R has many built-in functions, and users can define their own functions according to their requirements.

In R, the interpreter passes control to the function object, along with the arguments the function needs to accomplish its task. After completing the task, the function returns control to the interpreter.

The syntax for defining a function is as follows:

> function_name <- function(arg1, arg2, ...) {
+   function body
+ }

Here:

  • Function name: This is the name of the defined function and is stored as an object with this name.

  • Arguments: These are the pieces of information the function needs to accomplish its task. Arguments are optional; a function can be defined without any.

  • Function body: This is the collection of statements that performs the function's designated task.

  • Return value: This is the value of the last expression evaluated in the function, which is returned as the output of the function.
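To make the return value concrete, here is a minimal sketch. The function names simple_return and simple_return_explicit, and the prices passed to them, are hypothetical and used only for illustration.

```r
# The value of the last evaluated expression is what the function returns.
simple_return <- function(start_price, end_price) {
  change <- end_price - start_price
  change / start_price
}

result <- simple_return(100, 105)
print(result)

# An explicit return() gives the same result.
simple_return_explicit <- function(start_price, end_price) {
  return((end_price - start_price) / start_price)
}
print(simple_return_explicit(100, 105))
```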

Here are some examples of built-in functions, along with their results when executed:

> print(mean(25:82))
[1] 53.5
> print(sum(41:68))
[1] 1526

Now we will look at how to build user-defined functions. Here we print the squares of a sequence of integers.

The function is named findingSqrFunc and takes the argument value, which must be a positive integer; it prints the squares of the integers from 1 to value:

> findingSqrFunc <- function(value) {
+   for (j in 1:value) {
+     sqr <- j^2
+     print(sqr)
+   }
+ }

Once the preceding code gets executed, we call the function:

> findingSqrFunc(4)

We get the following output:

[1] 1 
[1] 4 
[1] 9 
[1] 16 

Calling a function without an argument

Construct a function without an argument:

> Function_test <- function() {
+   for (i in 1:3) {
+     print(i * 5)
+   }
+ }
> Function_test()

On executing the preceding function without arguments, the following output gets printed:

[1] 5 
[1] 10 
[1] 15 

Calling a function with an argument

Arguments can be supplied to a function in the same order in which they were defined. Alternatively, they can be given in any order, provided each one is assigned to its argument name. Given here are the steps for creating and calling such a function:

  1. First create a function:

            > Function_test <- function(a, b, c) {
            +   result <- a * b + c
            +   print(result)
            + }
    
  2. Call the function by providing the arguments in the same sequence. It gives the following output:

            > Function_test(2, 3, 4)
            [1] 10
    
  3. Call the function by naming the arguments, in any sequence:

            > Function_test(c = 4, b = 3, a = 4)
    

This gives the following output:

[1] 16 
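Positional and named arguments can also be mixed in one call. The sketch below uses a hypothetical function combine, which returns its result instead of printing it, to compare the three styles.

```r
# Hypothetical function with the same arithmetic as Function_test.
combine <- function(a, b, c) {
  a * b + c
}

by_position <- combine(2, 3, 4)              # a = 2, b = 3, c = 4
by_name     <- combine(c = 4, b = 3, a = 2)  # any order, matched by name
mixed       <- combine(2, c = 4, 3)          # c by name; 2 and 3 fill a and b in order

print(by_position)
print(by_name)
print(mixed)
```

Named arguments are matched first, and the remaining unnamed values are then assigned to the remaining arguments in the order of the definition, so all three calls give 10.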

How to execute R programs

In this section, we will discuss different ways of executing R programs.

How to run a saved file through R Window

For running a program in the R workspace, follow these steps:

  1. Open R (double-click on the desktop icon or open the program from Start).

  2. Click on File and open the script.

  3. Select the program you want to run; it will appear in an R Editor window.

  4. Right-click and Select All (or type Ctrl + A).

  5. Right-click and Run Line or Selection (or type Ctrl + R).

  6. The output will appear in the R console window.

How to source R script

Please perform the following steps for sourcing the R code:

  1. First check your working directory. It can be checked by the following code:

            > print(getwd())
    
  2. If the preceding code returns the path of the intended folder, nothing more is needed. Otherwise, change the working directory by using the following code:

            > setwd("D:/Rcode")
    
  3. Change the directory path according to your needs, and then run the required script using the following code (R is case-sensitive, so the function name is source in lowercase):

            > source('firstprogram.r')
    

For example, let's say the program firstprogram.r contains the following code:

a <- 5
print(a)

Upon sourcing, it will print [1] 5 at the console.

When you want to tell R to execute a number of lines of code without waiting for instructions, you can use the source function to run the saved script. This is known as sourcing a script.

It is better to write the entire code in a script editor (for example, the RStudio editor), then save it and source the whole script. If you want a sourced script to display a result, use the print function explicitly. At the interactive console, you do not need to call print; the value of an expression is displayed by default.
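The difference between sourcing and typing at the console can be tried out with a throwaway script. This sketch writes a hypothetical three-line script to a temporary file, so it does not depend on any particular working directory.

```r
# Write a small script to a temporary file (contents are illustrative).
script_path <- tempfile(fileext = ".R")
writeLines(c("a <- 5",
             "a",          # bare expression: not displayed when sourced
             "print(a)"),  # explicit print: displayed when sourced
           script_path)

# Run every line of the saved script; only print(a) produces output.
source(script_path)

# Objects created by the script are available afterwards.
print(a + 1)

# With echo = TRUE, each statement and its value are shown as well.
source(script_path, echo = TRUE)
```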

In other operating systems, the command for running the program remains the same.

Comments are parts of a program that are ignored by the interpreter while executing the actual program.

Comments are written using #; for example:

# This is a comment in my program.
 

Loops (for, while, if, and if...else)


Loops automate a multistep process by grouping the actions that need to be repeated. All programming languages provide built-in constructs that allow an instruction, or a block of instructions, to be repeated. Broadly, there are two types of loops: those that run a defined number of times, controlled by a counter or index (for loops), and those that keep running as long as a logical condition holds (while loops).

Decision-making is one of the significant components of programming languages. This can be achieved in R programming by using the conditional statement if...else. The syntax, along with an example, is given here.

Let us first discuss if and else conditional statements and then we will discuss loops.

if statement

Let us first see how if and else work in R. The general syntax for an if clause is given here:

if (expression) { 
   statement 
} 

If the expression evaluates to TRUE, the statement gets executed; otherwise nothing happens. The expression should evaluate to a single logical or numeric value. For a numeric value, 0 is treated as FALSE and any other number is treated as TRUE, for example:

> x <- 5
> if (x > 0) {
+   print("I am Positive")
+ }

When the preceding code gets executed, it prints I am Positive.
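The handling of numeric conditions can be seen in a short sketch. The variable names and values below are made up for illustration.

```r
# A numeric condition: 0 is treated as FALSE, so this body is skipped.
balance <- 0
status <- "no balance"
if (balance) {
  status <- "has balance"
}
print(status)

# Any non-zero number is treated as TRUE, so this body runs.
shares <- 12
if (shares) {
  print("Holding shares")
}

# A vector is reduced to a single value with any() or all() before testing.
returns <- c(0.01, -0.02, 0.03)
has_loss <- any(returns < 0)
if (has_loss) {
  print("At least one negative return")
}
```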

if...else statement

Now let us see how the if and else conditions work in R. Here is the syntax:

if(expression){ 
   statement1 
} else { 
   statement2 
} 

The else part is evaluated only if the if condition is FALSE, for example:

> x <- -5
> if (x > 0) {
+   print("I am Positive")
+ } else {
+   print("I am Negative")
+ }

When the preceding code gets executed, it prints I am Negative.
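Note that the example above would also report zero as negative. If a third case is wanted, if...else statements can be chained. The following is a sketch only; the function name describe_sign is hypothetical.

```r
# Chain conditions with else if; the first TRUE branch is the one that runs.
describe_sign <- function(x) {
  if (x > 0) {
    "Positive"
  } else if (x < 0) {
    "Negative"
  } else {
    "Zero"
  }
}

print(describe_sign(5))
print(describe_sign(-5))
print(describe_sign(0))
```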

for loop

These loops are executed a defined number of times and are controlled by a counter or index that is advanced at each cycle. Here is the syntax of the for loop construct:

for (val in sequence) { 
    statement 
} 

Here is an example:

> Var <- c(3, 6, 8, 9, 11, 16)
> counter <- 0
> for (val in Var) {
+   if (val %% 2 != 0) counter <- counter + 1
+ }
> print(counter)

When the preceding code gets executed, it counts the odd numbers present in the vector Var (3, 9, and 11) and prints 3.
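For comparison, the same count can be obtained without an explicit loop, because comparisons in R work on whole vectors. This is an alternative sketch, not a replacement for the loop above; the name vectorised_count is introduced here for illustration.

```r
Var <- c(3, 6, 8, 9, 11, 16)

# Loop version: test one element at a time.
counter <- 0
for (val in Var) {
  if (val %% 2 != 0) counter <- counter + 1
}

# Vectorised version: the comparison yields TRUE/FALSE for each element,
# and sum() counts the TRUE values.
vectorised_count <- sum(Var %% 2 != 0)

print(counter)
print(vectorised_count)
```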

while loop

A while loop tests a logical condition at the start of each iteration and keeps running for as long as the condition holds. Here is the syntax:

while (expression) { 
   statement 
} 

Here, the expression is evaluated first and, if it is TRUE, the body of the while loop gets executed. Here is an example:

> Var <- c("Hello")
> counter <- 4
> while (counter < 7) {
+   print(Var)
+   counter <- counter + 1
+ }

Here, the expression is evaluated before each iteration. While it is TRUE, the body of the loop gets executed; as soon as it returns FALSE, the loop ends. In this example, Hello is printed three times (for counter values 4, 5, and 6).
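A while loop is a natural fit when the number of iterations is not known in advance. As a sketch, the following counts how many years an amount needs to double, assuming an illustrative starting balance of 100 and a yearly growth rate of 5%.

```r
balance <- 100   # starting amount (illustrative)
rate <- 0.05     # assumed yearly growth rate
target <- 200    # stop once the balance has doubled
years <- 0

# The condition is checked before every pass through the body.
while (balance < target) {
  balance <- balance * (1 + rate)
  years <- years + 1
}

print(years)
print(balance)
```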

apply()

apply() is a function in R used for quick operations on a matrix or array. It can be executed on rows (margin 1), on columns (margin 2), or on both together (margin c(1, 2)). Now let us find the row sums of a matrix using the apply function. Let us execute the following code:

> sample <- matrix(c(1:10), nrow = 5, ncol = 2)
> apply(sample, 1, sum)

It generates the sum row-wise: 7, 9, 11, 13, and 15.
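The second argument of apply() selects the direction of the operation. The sketch below uses the same data, stored under the illustrative name m so that the built-in sample() function is not masked.

```r
m <- matrix(1:10, nrow = 5, ncol = 2)

row_sums <- apply(m, 1, sum)  # one result per row
col_sums <- apply(m, 2, sum)  # one result per column

# Margin c(1, 2) applies the function to every element and keeps the shape.
doubled <- apply(m, c(1, 2), function(x) x * 2)

print(row_sums)
print(col_sums)
print(dim(doubled))
```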

sapply()

sapply() operates over a set of data such as a list or vector, calls the specified function for each item, and simplifies the results to a vector or matrix where possible. Let us execute the following code to see an example:

> sapply(1:5, function(x) x^3)

It computes the cubes of 1 to 5: 1, 8, 27, 64, and 125.
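sapply() works the same way on a list, and it keeps the names of the list elements in the result. In this sketch, the list prices and its element names are made up for illustration.

```r
# Over a vector: one result per element.
cubes <- sapply(1:5, function(x) x^3)
print(cubes)

# Over a named list: the result is a named vector.
prices <- list(stock_a = c(10, 12, 11), stock_b = c(20, 19, 24))
average_price <- sapply(prices, mean)
print(average_price)
```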

 

Loop control statements


There are control statements that can change the normal sequence of execution. break and next are loop control statements, and we will briefly discuss these control statements here.

break

break terminates the loop and passes control to the first statement after the loop; for example:

> Vec <- c("Hello")
> counter <- 5
> repeat {
+   print(Vec)
+   counter <- counter + 1
+   if (counter > 8) {
+     break
+   }
+ }

When the preceding code gets executed, it prints Hello four times, and then the break statement ends the loop. repeat is another loop construct; it keeps executing its body until a break statement is reached.

next

next does not terminate the loop; it skips the rest of the current iteration and moves on to the next one. See the following example:

> Vec <- c(2, 3, 4, 5, 6)
> for (i in Vec) {
+   if (i == 4) {
+     next
+   }
+   print(i)
+ }

In the preceding example, when the loop reaches the third element of the vector Vec (the value 4), control skips the rest of that iteration and moves on to the next one. So, when the preceding code gets executed, it prints the vector elements 2, 3, 5, and 6, and skips 4.
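Both control statements can appear in the same loop. The following sketch, with made-up values, skips even numbers with next and stops with break at the first odd number above 5.

```r
values <- c(2, 3, 4, 5, 6, 7)
kept <- c()

for (v in values) {
  if (v %% 2 == 0) next   # skip even values, continue with the next element
  if (v > 5) break        # leave the loop entirely
  kept <- c(kept, v)      # reached only for odd values up to 5
}

print(kept)
```

Only 3 and 5 are kept: 2, 4, and 6 are skipped by next, and the loop ends when it reaches 7.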

 

Questions


  1. What are the various atomic objects of R?

  2. What is a vector in R?

  3. What is the difference between a vector and a list?

  4. What is the difference between arrays and matrices?

  5. What is a DataFrame and what is its significance in R?

  6. How do you read and write CSV and XLSX files in R?

  7. How do you read and write stock-market data in R?

  8. Explain the process of connecting R with any relational database.

  9. What is a function and what is its significance in R?

  10. What is an assignment operator in R?

  11. How do you call a function in R?

  12. How do you source a script in R?

  13. What is the difference between for and while loops in R?

 

Summary


Now let us recap what we have learned so far in this chapter:

  • Why learning R is important for analysts pursuing a career in financial analytics

  • Installation of R and its packages

  • The basic objects in R are character, numeric, integer, complex, and logical

  • Commonly used data types in R are lists, matrices, arrays, factors, and DataFrames

  • Reading files from external data files such as CSV and XLSX, and particularly from online sources and databases in R

  • Writing files to CSV and XLSX from R

  • Writing different types of expression, such as constant, arithmetic, logical, symbols, assignments, and so on

  • Writing user-defined functions

  • Ways of calling user-defined and built-in functions

  • Running R programs from the console window and by sourcing saved files

  • The use of conditional decision-making by using if and else statements

  • The use of loops such as for and while

About the Authors
  • Dr. Param Jeet

    Dr. Param Jeet holds a Ph.D. in mathematics from one of India's leading technological institutes, IIT Madras (IITM), India. He has had a couple of mathematical research papers published in various international journals. He has been in the analytics industry for the last few years and has worked with various leading multinational companies, as well as consulting for a few companies as a data scientist.

  • PRASHANT VATS

    Prashant Vats holds a master's degree in mathematics from one of India's leading technological institutes, IIT Mumbai. He has been in the analytics industry for more than 10 years and has worked with various leading multinational companies, as well as consulting for a few companies as a data scientist across several domains.
