The Statistics and Machine Learning with R Workshop

Product type Book
Published in Oct 2023
Publisher Packt
ISBN-13 9781803240305
Pages 516 pages
Edition 1st Edition
Author: Liu Peng

Control logic in R

Relational and logical operators let us compare values and combine the results as we add logic to a program. We can build on them with conditional statements, which choose between actions, and loops, which repeat a sequence of actions. This section covers the essential relational and logical operators, followed by the conditional statements and loops that are built on top of them.

Relational operators

We briefly covered a few relational operators such as >= and == earlier. This section will provide a detailed walkthrough on the use of standard relational operators. Let’s look at a few examples.

Exercise 1.14 – practicing with standard relational operators

Relational operators allow us to compare two quantities and obtain the single result of the comparison. We will go over the following steps to learn how to express and use standard relational operators in R:

  1. Execute the following evaluations using the equality operator (==) and observe the output:
    >>> 1 == 2
    FALSE
    >>> "statistics" == "calculus"
    FALSE
    >>> TRUE == TRUE
    TRUE
    >>> TRUE == FALSE
    FALSE

    The equality operator compares the two operands on either side (including logical data) and returns TRUE only if they are equal.

  2. Execute the same evaluations using the inequality operator (!=) and observe the output:
    >>> 1 != 2
    TRUE
    >>> "statistics" != "calculus"
    TRUE
    >>> TRUE != TRUE
    FALSE
    >>> TRUE != FALSE
    TRUE

    The inequality operator is the exact opposite of the equality operator.

  3. Execute the following evaluations using the greater than and less than operators (> and <) and observe the output:
    >>> 1 < 2
    TRUE
    >>> "statistics" > "calculus"
    TRUE
    >>> TRUE > FALSE
    TRUE

    In the second evaluation, the two strings are compared character by character, starting from the leftmost character and following the collating order of the current locale (alphabetical order for plain letters). In this case, the letter s comes after c, so "statistics" is considered greater than "calculus". In the third example, TRUE is converted into one and FALSE into zero, so the comparison returns TRUE.

  4. Execute the following evaluations using the greater-than-or-equal-to operator (>=) and less-than-or-equal-to operator (<=) and observe the output:
    >>> 1 >= 2
    FALSE
    >>> 2 <= 2
    TRUE

    Note that each of these operators is equivalent to two conditional evaluations connected via an OR operator (|). We can, therefore, break each one down into two evaluations in parentheses, resulting in the same output as before:

    >>> (1 > 2) | (1 == 2)
    FALSE
    >>> (2 < 2) | (2 == 2)
    TRUE

    The relational operators also apply element-wise to vectors, as we saw earlier when using row-level filtering to subset a data frame.

  5. Compare vec_a with 1 using the greater-than operator:
    >>> vec_a > 1
    FALSE  TRUE  TRUE

    We would get the same result by comparing each element separately and combining the results using c().
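
As a small sketch of the element-wise behavior described in the last step, the following snippet assumes that vec_a holds the values 1, 2, and 3. Its definition appears earlier in the chapter and is not repeated in this section, so these values are an assumption chosen to match the output shown above. The snippet checks that the vectorized comparison matches the element-by-element version and then uses the logical result to subset the vector:

```r
# Assumed values for vec_a; the original definition is outside this section
vec_a = c(1, 2, 3)

# Vectorized comparison: one logical value per element
vectorized = vec_a > 1

# The same comparison done one element at a time and combined with c()
elementwise = c(vec_a[1] > 1, vec_a[2] > 1, vec_a[3] > 1)

# Check that both approaches give the same logical vector
print(identical(vectorized, elementwise))

# A logical vector can be used as an index to keep only the matching elements
above_one = vec_a[vectorized]
print(above_one)
```

Using the logical vector as an index is the same idea that drives row-level filtering of a data frame.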

Logical operators

A logical operator is used to combine the results of multiple relational operators. There are three basic logical operators in R: AND (&), OR (|), and NOT (!). The AND operator returns TRUE only if both operands are TRUE, and the OR operator returns TRUE if at least one operand is TRUE. The NOT operator, on the other hand, flips a single logical value to its opposite.

Let’s go through an exercise on the use of these logical operators.

Exercise 1.15 – practicing using standard logical operators

We will start with the AND operator, the most widely used control logic to ensure a specific action only happens if multiple conditions are satisfied at the same time:

  1. Execute the following evaluations using the AND operator and observe the output:
    >>> TRUE & FALSE
    FALSE
    >>> TRUE & TRUE
    TRUE
    >>> FALSE & FALSE
    FALSE
    >>> 1 > 0 & 1 < 2
    TRUE

    The result shows that both conditions need to be satisfied to obtain a TRUE output.

  2. Execute the following evaluations using the OR operator and observe the output:
    >>> TRUE | FALSE
    TRUE
    >>> TRUE | TRUE
    TRUE
    >>> FALSE | FALSE
    FALSE
    >>> 1 < 0 | 1 < 2
    TRUE

    The result shows that the output is TRUE if at least one condition is evaluated as TRUE.

  3. Execute the following evaluations using the NOT operator and observe the output:
    >>> !TRUE
    FALSE
    >>> !FALSE
    TRUE
    >>> !(1<0)
    TRUE

    In the third example, the evaluation is the same as 1 >= 0, which returns TRUE. The NOT operator, therefore, reverses the evaluation result after the exclamation sign.

    These operators can also be used to perform pairwise logical evaluations in vectors.

  4. Execute the following evaluations and observe the output:
    >>> c(TRUE, FALSE) & c(TRUE, TRUE)
    TRUE FALSE
    >>> c(TRUE, FALSE) | c(TRUE, TRUE)
    TRUE TRUE
    >>> !c(TRUE, FALSE)
    FALSE  TRUE
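
The element-wise logical operators are often combined with relational operators on the same vector. The following is a minimal sketch of that pattern; the scores vector and the bounds 4 and 10 are hypothetical values used only for illustration:

```r
# Hypothetical vector used only for illustration
scores = c(3, 8, 12, 5)

# Element-wise AND: TRUE where a value lies strictly between 4 and 10
in_range = scores > 4 & scores < 10

# Element-wise NOT flips every element of the logical vector
out_of_range = !in_range

print(in_range)
print(scores[in_range])      # elements inside the range
print(scores[out_of_range])  # elements outside the range
```

Because relational operators take precedence over & and |, the two comparisons are evaluated first and no extra parentheses are needed.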

There is also a long form of the AND (&&) and OR (||) logical operators. Unlike the element-wise short form, the long form works on single logical values (vectors of length one) and always returns a single result. It also short-circuits: the right-hand operand is evaluated only if the left-hand operand does not already determine the result. These properties make the long form the usual choice in control flow, especially in the condition of an if statement.

Older versions of R used only the first element when a longer vector was passed to && or ||. This became a warning in R 4.2.0 and an error in R 4.3.0, so the following example selects the first element of each vector explicitly:

>>> c(TRUE, FALSE)[1] && c(FALSE, TRUE)[1]
FALSE
>>> c(TRUE, FALSE)[1] || c(FALSE, TRUE)[1]
TRUE

Both evaluations are based on the first element of each vector only; the second element of each vector plays no part in the result.

In the first evaluation using &&, the left-hand operand is TRUE, which does not settle the result, so the right-hand operand (FALSE) is evaluated and the result is FALSE. In the second evaluation using ||, the left-hand operand is TRUE, which already settles the result as TRUE, so the right-hand operand is never evaluated. Skipping the right-hand operand saves computation when it is expensive to evaluate.
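
To make the stop-early behavior of the long form concrete, here is a hedged sketch that uses only single logical values. The right-hand side calls stop(), which would raise an error if it were ever evaluated; the variable names are hypothetical:

```r
# The right-hand side would raise an error if it were ever evaluated.
# Because the left-hand side already decides the result, R never runs it.
result_and = FALSE && stop("right-hand side was evaluated")
result_or = TRUE || stop("right-hand side was evaluated")

print(result_and)  # FALSE
print(result_or)   # TRUE

# A common use: guard a check that only makes sense for non-empty input
values = numeric(0)
is_safe = length(values) > 0 && values[1] > 0
print(is_safe)     # FALSE, and values[1] is never inspected
```

The same guard written with the short form (&) would evaluate both sides, so the long form is the safer choice inside an if condition.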

Conditional statements

A conditional statement, or more specifically, the if-else statement, uses the result of one or more logical evaluations to decide the flow of follow-up actions. It is commonly used to add branching logic to R programs. The if-else statement has the following general structure, where the evaluation condition is checked first. If it evaluates to TRUE, the expression within the curly braces of the if clause is executed and the else clause is skipped. Otherwise, the expression within the else clause is executed:

if(evaluation condition){
  some expression
} else {
  other expression
}

Let’s go through an exercise to see how to use the if-else control statement.

Exercise 1.16 – practicing using the conditional statement

Time for another exercise! Let’s practice using the conditional statement:

  1. Initialize an x variable with a value of 1 and write an if-else condition to determine the output message. Print out "positive" if x is greater than zero, and "not positive" otherwise:
    >>> x = 1
    >>> if(x > 0){
    >>>   print("positive")
    >>> } else {
    >>>   print("not positive")
    >>> }
    "positive"

    The condition within the if clause evaluates to TRUE, so the code inside is executed, printing out "positive" in the console. Note that the else branch is optional and can be removed if we only need to act when the condition holds. Additional if-else statements can also be nested within a branch.

    We can also add more branches using else if, which can be repeated as many times as needed between the if and else clauses.

  2. Initialize an x variable with 0 and write a control flow to determine and print out its sign:
    >>> x = 0
    >>> if(x > 0){
    >>>  print("positive")
    >>> } else if(x == 0){
    >>>  print("zero")
    >>> } else {
    >>>  print("negative")
    >>> }
    "zero"

    The conditions are evaluated sequentially: the first condition is FALSE and the second is TRUE, so "zero" is printed and the else branch is skipped.
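
The exercise mentions that the else branch is optional and that an if-else statement can be nested within a branch. A minimal sketch of both ideas follows; the value of x and the label variable are hypothetical, and %% is R's remainder operator:

```r
# Hypothetical input value used only for illustration
x = -4

# An if statement without an else branch: nothing happens when x is not negative
if (x < 0) {
  print("negative input")
}

# A nested if-else: the inner check runs only when the outer condition is TRUE
if (x != 0) {
  if (x %% 2 == 0) {
    label = "non-zero and even"
  } else {
    label = "non-zero and odd"
  }
} else {
  label = "zero"
}
print(label)
```

Deeply nested branches quickly become hard to read, so a chain of else if clauses is usually preferable when the conditions are mutually exclusive.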

Loops

A loop is similar to the if statement in that the code is only executed if the condition evaluates to TRUE. The difference is that a loop keeps executing the code as long as the condition remains TRUE. This section covers two types of loops: the while loop and the for loop. The while loop is used when the number of iterations is unknown in advance, and its termination relies on either the evaluation condition or a separate condition within the loop body that triggers the break control statement. The for loop is used when the number of iterations is known.

The while loop has the following general structure. Condition 1 is evaluated first to determine whether the expression within the outer curly braces should be executed. An optional if statement decides whether the while loop should be terminated early based on condition 2. Together, these two conditions control the termination of the while loop, which exits as soon as condition 1 evaluates to FALSE or condition 2 evaluates to TRUE. The if clause containing condition 2 can be placed anywhere within the while block:

while(condition 1){
  some expression
  if(condition 2){
    break
  }
}

Note that condition 1 within the while statement needs to become FALSE at some point, unless a break statement is reached first; otherwise, the loop will continue indefinitely, which may cause a session expiry error within RStudio.
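
One way to guard against a loop that never ends is to count iterations and break once a limit is reached. The sketch below is an illustration under assumed values: the starting value, the step of 2, and the limit of 100 are all hypothetical. Here, condition 1 (x != 10) never becomes FALSE because x only takes odd values, so the break is what ends the loop:

```r
x = 1
iterations = 0
max_iterations = 100  # hypothetical safety limit

# x takes the values 3, 5, 7, ... and therefore never equals 10
while (x != 10) {
  x = x + 2
  iterations = iterations + 1
  # Condition 2: stop once the safety limit is reached
  if (iterations >= max_iterations) {
    break
  }
}

print(iterations)
print(x)
```

Without the counter, this loop would run until the session is interrupted manually.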

Let’s go through an exercise to look at how to use the while loop.

Exercise 1.17 – practicing the while loop

Let’s try out the while loop:

  1. Initialize an x variable with a value of 2 and write a while loop. As long as x is less than 10, square it and print out its value:
    >>> x = 2
    >>> while(x < 10){
    >>>   x = x^2
    >>>   print(x)
    >>> }
    4
    16

    The body of the while loop is executed twice, bringing the value of x from 2 to 16. On the third check, x is above 10, so the condition evaluates to FALSE and the loop exits. We can also print out x to double-check its value:

    >>> x
    16

  2. Add a condition after the squaring to exit the loop if x is greater than 10:
    >>> x = 2
    >>> while(x < 10){
    >>>   x = x^2
    >>>   if(x > 10){
    >>>     break
    >>>   }
    >>>   print(x)
    >>> }
    4

    Only one number is printed out this time. When x is changed to 16, the if condition evaluates to TRUE, which triggers the break statement to exit the while loop and skip the print() statement. Let’s verify the value of x:

    >>> x
    16
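
Where the break check sits inside the loop body changes what gets executed before the loop exits. As a sketch of this, the variant below moves the check after print(), so the value that triggers the break is still printed. The printed vector is a hypothetical addition that records each printed value:

```r
x = 2
printed = c()  # hypothetical record of the values that get printed

while (x < 10) {
  x = x^2
  print(x)
  printed = c(printed, x)
  # The check now comes after print(), so 16 is printed before the loop exits
  if (x > 10) {
    break
  }
}

print(printed)
```

This version prints both 4 and 16, whereas placing the check before print() prints only 4.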

Let’s look at the for loop, which has the following general structure. Here, var is a placeholder that sequentially takes each value in sequence, which can be a vector, a list, or another data structure:

for(var in sequence){
some expression
}

The same expression will be evaluated for each element in sequence, unless an explicit if condition is triggered to either exit the loop using break, or skip the rest of the code and immediately jump to the next iteration using next. Let’s go through an exercise to put these in perspective.

Exercise 1.18 – practicing using the for loop

Next, let’s try the for loop:

  1. Create a vector to store three strings ("statistics", "and", and "calculus") and print out each element:
    >>> string_a = c("statistics","and","calculus")
    >>> for(i in string_a){
    >>>   print(i)
    >>> }
    "statistics"
    "and"
    "calculus"

    Here, the for loop iterates through each element in the string_a vector by sequentially assigning the element value to the i variable at each iteration. We can also choose to iterate using the vector index, as follows:

    >>> for(i in 1:length(string_a)){
    >>>   print(string_a[i])
    >>> }
    "statistics"
    "and"
    "calculus"

    Here, we created a series of integer indexes from 1 up to the length of the vector and assigned them to the i variable in each iteration, which is then used to reference the element in the string_a vector. This is a more flexible way of referencing elements in a vector since we can also use the same index to reference other vectors. Directly referencing the element, as in the previous approach, is more concise and readable, but it lacks the control that a looping index provides.

  2. Add a condition to break the loop if the current element is "and":
    >>> for(i in string_a){
    >>>   if(i == "and"){
    >>>     break
    >>>   }
    >>>   print(i)
    >>> }
    "statistics"

    The loop is exited upon satisfying the if condition when the current value in i is "and".

  3. Add a condition to jump to the next iteration if the current element is "and":
    >>> for(i in string_a){
    >>>   if(i == "and"){
    >>>     next
    >>>   }
    >>>   print(i)
    >>> }
    "statistics"
    "calculus"

    When the next statement is evaluated, the following print() function is skipped, and the program jumps to the next iteration, printing only "statistics" and "calculus" without the "and" element.
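
The index-based loop in step 1 is useful when the same index references more than one vector. The following sketch illustrates this with seq_along(), a base R function that generates the indexes of a vector and is often preferred over 1:length(). The chapter_a vector and the counters are hypothetical names introduced for illustration:

```r
string_a = c("statistics", "and", "calculus")
chapter_a = c(10, 0, 9)  # hypothetical second vector of the same length

# Use one index to reference both vectors in each iteration
combined = c()
for (i in seq_along(string_a)) {
  combined = c(combined, paste(string_a[i], chapter_a[i]))
}
print(combined)

# For an empty vector, seq_along() yields no iterations at all,
# whereas 1:length() counts down from 1 to 0 and runs the body twice
empty = character(0)
runs_seq_along = 0
for (i in seq_along(empty)) {
  runs_seq_along = runs_seq_along + 1
}
runs_colon = 0
for (i in 1:length(empty)) {
  runs_colon = runs_colon + 1
}
print(runs_seq_along)
print(runs_colon)
```

The difference only matters for empty input, but it is a common source of subtle bugs in index-based loops.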

So far, we have covered some of the most fundamental building blocks in R. We are now ready to come to the last and most widely used building block: functions.
