Chapter 1: Quick Introduction to Rust
Rust is growing in popularity, but it is described as having a steep learning curve. By covering the basic rules of Rust, as well as how to manipulate a range of data types and variables, we will be able to write simple programs in the same fashion as dynamically typed languages with close to the same lines of code.
In this chapter, we will cover the main differences between Rust and generic dynamic languages to provide you with a quick understanding of how to utilize Rust. Installation and project management will be covered in the next chapter. Therefore, it's advised that you code the examples covered in this chapter using the online Rust playground.
In this chapter, we will cover the following topics:
- Reviewing data types and variables in Rust
- Controlling variable ownership
- Building structs
- Metaprogramming with macros
Let's get started!
Technical requirements
For this chapter, we only need access to the internet as we will be using the online Rust playground to implement all the code. The code examples provided can be run in the online Rust playground at https://play.rust-lang.org/.
For detailed instructions, please refer to the README file at https://github.com/PacktPublishing/Rust-Web-Programming/tree/master/Chapter01. You will also find all the source code used in this chapter at the preceding link.
The CiA videos for this book can be viewed at: http://bit.ly/3jULCrw
Reviewing data types and variables in Rust
If you have coded in another language, you will have used these data types already. However, Rust has some quirks that can throw developers, especially if they come from dynamic languages. In order to see the motivation behind these quirks, it's important that we explore why Rust is such a paradigm-shifting language.
Why Rust?
With programming, there is usually a trade-off between speed/resources and development speed/safety. Low-level languages such as C/C++ can give the developer fine-grained control over the computer with fast code execution and minimal resource consumption. However, this is not free. Manual memory management can induce bugs and security vulnerabilities. On top of this, it takes more code and time to solve a problem in a low-level language. As a result of this, C++ web frameworks do not take up a large share of web development. Instead, it made sense to go for high-level programming languages where developers can solve problems safely and quickly.
However, it has to be noted that this memory safety comes at a cost. Languages such as Python, JavaScript, PHP, and Java keep track of all the variables defined and their references to a memory address. When there are no more variables pointing to a memory address, the data in that memory address gets deleted. This process is called garbage collection and consumes extra resources and time.
With Rust, memory safety is ensured without the costly garbage collection process. Instead, the compiler maps the variables, enforcing rules to ensure safety via a mechanism called the borrow checker. Because of this, Rust has enabled rapid, safe problem solving with truly performant code, thus breaking the speed/safety trade-off. As more data processing, traffic, and complex tasks are lifted into the web stack, Rust, with its growing number of web frameworks and libraries, has now become a viable choice for web development.
Before we get into developing a web app in Rust, we're going to briefly cover the basics of Rust. All of the code examples provided can be run in the online Rust playground at https://play.rust-lang.org/.
In the Rust playground, you may have the following layout:
    fn main() {
        println!("Hello, world!");
    }
The main function is the entry point where the code is run. If you're coming from a JavaScript or PHP background, your entry point is the first line of the file that is directly run, and the whole code block is essentially a main function. This is also true of Python; however, a closer analogy would be the main block that would be run if the file is directly run by the interpreter:
    if __name__ == "__main__":
        print("Hello, World!")
This is often used to define an entry point in something such as a Flask application.
Using strings in Rust
Rust, like other languages, has typical data formats such as strings, integers, floats, arrays, and hash maps (dictionaries). However, because of the way in which Rust manages memory, there are some quirks we have to look out for when using them. These quirks can be easily understood and handled but can trip up experienced developers from dynamic languages if they are not warned about them.
In this section, we will cover enough memory management that we can start defining and using various data types and variables. We will dive into the concepts of memory management in more detail in the Controlling variable ownership section, later in this chapter.
We will start off with strings. We can create our own print function that accepts a string and prints it:
    fn print(input_string: String) {
        println!("{}", input_string);
    }

    fn main() {
        let test_string = String::from("Hello, World!");
        print(test_string);
    }
Here, we defined a string using the from function of the String type, and then passed it through our own print function to print it using Rust's built-in println! macro. (The ! denotes that println! is a macro rather than a function, which is why it can accept a variable number of arguments. We will cover macros later.)
Notice that the print function expects a String to be passed through. Function parameters must always have their types annotated, so this is the minimum amount of typing that's needed for a function. Now, we can try something a bit more familiar for a dynamic language. We don't call a String function; we just define the string using quotation marks:
    fn print(input_string: str) {
        println!("{}", input_string);
    }

    fn main() {
        let test_string = "Hello, World!";
        print(test_string);
    }
What we have done here is defined a string literal and passed it through the print function to be printed. However, we get the following error:
error[E0277]: the size for values of type `str` cannot be known at compilation time
In order to understand this, we have to have a high-level understanding of stack and heap memory.
Stack memory is fast, and the size of everything stored on it must be known at compile time. Heap memory is slower to allocate and is requested at runtime. The str type is the string data itself, which can vary in size, so the compiler cannot know how much stack space a bare str parameter needs. A String, on the other hand, has a fixed size on the stack: it consists of a pointer to the string data in the heap, the capacity of that allocation, and the length of the string. When we pass a str through our own print function, the function has no idea of the size of the value being passed through. String literals can be converted into strings with to_string:
    fn print(input_string: String) {
        println!("{}", input_string);
    }

    fn main() {
        let test_string = "Hello, World!";
        print(test_string.to_string());
    }
Here, we converted the string literal just before passing it through the print function. We can also get the print function to accept a reference to the string data by using the & operator in the parameter type:
    fn print(input_string: &str) {
        println!("{}", input_string);
    }

    fn main() {
        let test_string = "Hello, World!";
        print(test_string);
    }
Borrowing will be covered later in this chapter. What is essentially happening here is that a string literal already has the type &str, so test_string is merely a reference to the string data, which is then passed through to the print function. One last thing we must note about strings is that we can get a string slice (&str) from a String with the as_str method.
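As a minimal sketch of the as_str method, the following example passes the same String to a function that expects &str in two ways. The shout function and the variable names are hypothetical and only exist for illustration:

```rust
// Accepts a borrowed string slice; works for literals and for borrowed Strings.
fn shout(input: &str) -> String {
    format!("{}!", input.to_uppercase())
}

fn main() {
    let owned: String = String::from("hello");
    // as_str borrows the contents of the String as a &str without copying them
    let slice: &str = owned.as_str();
    println!("{}", shout(slice));
    // a reference to a String is also accepted where a &str is expected
    println!("{}", shout(&owned));
    // the String is still usable because it was only borrowed
    println!("{} has {} bytes", owned, owned.len());
}
```

Because shout only borrows its input, the owned String can still be used after both calls.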
Understanding integers and floats
Rust has signed integers (denoted by i) and unsigned integers (denoted by u) that consist of 8, 16, 32, 64, and 128 bits. The math behind binary notation is not relevant for the scope of this book. What we do need to understand, though, is the range of numbers allowed in terms of bits. Because each bit is either 0 or 1, we can calculate how many values an integer can hold by raising two to the power of the number of bits. For example, for 8 bits, 2 to the power of 8 equates to 256. Counting the 0, this means that a u8 integer should have a range of 0 to 255, which can be tested by using the following code:
let number: u8 = 255;
Let's take a look at the following code:
let number: u8 = 256;
It's not surprising that the preceding code gives us the following overflow error:
literal `256` does not fit into the type `u8` whose range is `0..=255`
What might be less expected is what happens if we change it to a signed integer:
let number: i8 = 255;
Here, we get the following error:
literal `255` does not fit into the type `i8` whose range is `-128..=127`
This is because unsigned integers only house non-negative integers, whereas signed integers house both positive and negative integers. Since the number of bits is fixed, the signed integer has to spread the same 256 values across both sides of zero, so the largest magnitude a signed integer can hold is essentially half that of the unsigned integer of the same size.
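The ranges can also be checked in code rather than by triggering compiler errors. The following is a small sketch that uses the MIN and MAX constants and the checked_add method from the standard library; the value_count helper is a hypothetical name introduced for this example and only works for fewer than 128 bits:

```rust
// Returns the number of distinct values an integer with the given number of bits can hold.
fn value_count(bits: u32) -> u128 {
    2u128.pow(bits)
}

fn main() {
    println!("8 bits hold {} values", value_count(8));
    println!("u8 range: {} to {}", u8::MIN, u8::MAX);
    println!("i8 range: {} to {}", i8::MIN, i8::MAX);
    // checked_add returns None instead of overflowing
    let too_big: Option<u8> = 255u8.checked_add(1);
    println!("255 + 1 as u8: {:?}", too_big);
}
```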
In terms of floats, Rust accommodates f32 and f64 floating points, which can be both negative and positive. Declaring a floating-point variable requires the same syntax as integers:
let float: f32 = 20.6;
It has to be noted that we can also annotate numbers with suffixes, as shown in the following code:
let x = 1u8;
Here, x has a value of 1 with the type of u8. Now that we have covered floats and integers, we can use vectors and arrays to store them.
Storing data in vectors and arrays
Rust stores sequenced data in vectors and arrays. Arrays have a fixed length, so they don't have push functions (append for Python), whereas vectors can grow. Both only accommodate one data type. This can be managed using structs and traits, but this will be covered later on in this chapter. You can define and loop through arrays and vectors with fairly standard syntax:
    let int_array: [i32; 3] = [1, 2, 3];
    for i in int_array.iter() {
        println!("{}", i);
    }

    let str_vector: Vec<&str> = vec!["one", "two", "three"];
    for i in str_vector.iter() {
        println!("{}", i);
    }

    let second_int_array: [i32; 3] = [1, 2, 3];
    let two = second_int_array[1];
Let's try and append "four" to our str_vector:
str_vector.push("four");
Here, we get an error about how we cannot borrow as mutable. This is because, by default, variables defined in Rust are not mutable. This can be easily remedied by putting a mut keyword in front of the variable's name:
    let mut str_vector: Vec<&str> = vec!["one", "two", "three"];
This also works for strings and numbers. While it might be tempting to define everything as a mut variable, this forced immutability not only has performance benefits, but it also improves safety. If you are not expecting a variable to change in a complex system, then not allowing it to mutate will throw up the error at compile time as opposed to allowing silent bugs to run in your system.
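To make the difference between arrays and vectors concrete, here is a short sketch. The extend_with function is a hypothetical helper written for this example; it relies on the to_vec method to copy an array into a vector:

```rust
// Builds a vector from a fixed-size array and appends one extra element.
fn extend_with(base: [i32; 3], extra: i32) -> Vec<i32> {
    // the vector must be declared with mut before we can push to it
    let mut numbers: Vec<i32> = base.to_vec();
    numbers.push(extra);
    numbers
}

fn main() {
    // a mut array can have its elements changed, but its length stays at 3
    let mut int_array: [i32; 3] = [1, 2, 3];
    int_array[0] = 10;
    let int_vector = extend_with(int_array, 4);
    println!("{:?} has length {}", int_array, int_array.len());
    println!("{:?} has length {}", int_vector, int_vector.len());
}
```

Removing either mut keyword should make the compiler reject the code, which is the safety net described above.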
Mapping data with hash maps
In some languages, hash maps are referred to as dictionaries. In order to define a hash map in Rust, we must import the hash maps from the standard library. Once we've defined a new hash map, we can insert an entry, get it out of the hash map, and then print it:
    use std::collections::HashMap;

    fn main() {
        let mut general_map: HashMap<&str, i8> = HashMap::new();
        general_map.insert("test", 25);
        let outcome: i8 = general_map.get("test");
        println!("{}", outcome);
    }
With this, we get the following error for defining the outcome variable:
expected `i8`, found enum `std::option::Option`
Here, we can see that the get method does not actually return an i8 type, despite us inserting an i8 type into the hash map. It's returning an Option enum instead. This is because the get method could fail. We could pass in a key that does not exist. Therefore, we have to unwrap the option to get the value we're aiming to get:
    let outcome: Option<&i8> = general_map.get("test");
    println!("here is the outcome {}", outcome.unwrap());
However, directly unwrapping the result causes the program to panic if the key does not exist. Because Option is either Some or None, we can exploit Rust's match statement to handle the outcome:
    match general_map.get("test") {
        None => println!("it failed"),
        Some(result) => println!("Here is the result: {}", result)
    }
Here, if the result is None, then we print that it failed. If the result is Some, we access the result in the Option wrapper and print it. The arrows in the match statement can have their own code blocks. For instance, we can nest a match statement within a match statement to perform another lookup if the original lookup fails. In the following code, we can check to see if there's an entry under the "testing" key. If it's not there, we can then check to see if there's an entry under the "test" key. If that fails too, we must give up:
    match general_map.get("testing") {
        None => {
            match general_map.get("test") {
                None => println!("Both testing and test failed"),
                Some(result) => println!("testing failed but test is: {}", result)
            }
        },
        Some(result) => println!("Here is the result: {}", result)
    }
Calling the insert function again with the same key will merely update the value under that key. Calling the remove function from the hash map with the desired key will remove the entry if it exists. There are also functions for reserving and checking capacity, and more; some newer additions start out as experimental and move to the stable build of Rust in time. Be sure to check the official Rust documentation for more functions for the hash map at https://doc.rust-lang.org/beta/std/collections/struct.HashMap.html.
Crates, tooling, and documentation will be covered in Chapter 2, Designing Your Web Application in Rust. Note that the hash map in this example can only accept i8 integers. We will cover how to enable different data types so that they can be stored with structs later in this chapter.
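The update and remove behavior can be sketched as follows. The build_map function and the "other" key are hypothetical and introduced only for this example; the 'static notation in the return type is explained in the section on handling results and errors:

```rust
use std::collections::HashMap;

// Builds a small map, then updates one entry and removes another.
fn build_map() -> HashMap<&'static str, i8> {
    let mut general_map: HashMap<&str, i8> = HashMap::new();
    general_map.insert("test", 25);
    general_map.insert("other", 1);
    // inserting under an existing key replaces the value and returns the old one
    let previous: Option<i8> = general_map.insert("test", 30);
    println!("previous value: {:?}", previous);
    // remove returns the removed value, or None if the key was absent
    let removed: Option<i8> = general_map.remove("other");
    println!("removed value: {:?}", removed);
    general_map
}

fn main() {
    let general_map = build_map();
    match general_map.get("test") {
        None => println!("it failed"),
        Some(result) => println!("Here is the result: {}", result),
    }
    println!("entries left: {}", general_map.len());
}
```

Both insert and remove return an Option, so the previous value can be inspected with the same match approach used for get.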
Handling results and errors
Like other languages, Rust raises and handles errors. It manages errors through two different types: Option and Result. We saw Option in action in the hash map, where we had to unwrap the outcome of the get function to access the data in the hash map. Whereas Option is either None or Some, Result is either Err or Ok.
This is fairly similar; however, if we unwrap a Result that is an Err, the Rust program panics and crashes, reporting what is inside the Err. While there will be plenty of opportunities to throw errors, we will also want to throw our own when needed. When systems become more complex, it can be handy to purposefully throw errors if there is any undesired behavior. A good example is inserting data into a Redis cache.
Technically, there is nothing stopping us from inserting a range of keys into Redis. In order to prevent this, if the key is not an expected variant of what we want, we should throw an error. Let's demonstrate how to throw an error, depending on the data:
    fn error_check(check: bool) -> Result<i8, &'static str> {
        if check {
            Err("this is an error")
        } else {
            Ok(1)
        }
    }

    fn main() {
        let result: i8 = error_check(false).unwrap();
        println!("{}", result);
    }
Note that there is no return keyword. This is because the function returns the final expression in the function when there is no semicolon at the end of the expression. In our function, if we set the input to true, we get the following error:
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: "this is an error"'
This Result wrapper gives us a lot of control of the outcome. Instead of wrapping code in try and except blocks, we can wait until we're ready to handle the error. We can build a simple error handling function with a match statement:
    fn error_check(check: bool) -> Result<i8, &'static str> {
        if check {
            return Err("this is an error");
        } else {
            return Ok(1);
        }
    }

    fn describe_result(result: Result<i8, &'static str>) {
        match result {
            Ok(x) => println!("it's a result of: {}", x),
            Err(x) => println!("{}", x)
        }
    }

    fn main() {
        let result: Result<i8, &'static str> = error_check(true);
        describe_result(result);
    }
In the wild, this comes in useful when we must roll back a database entry or clean up a process before throwing an error. We also have to note the typing for Result. In this result, we return an i8 integer (we can return other variables), but we can also return a reference to a string literal that has the 'static notation. This is the lifetime notation. We will cover lifetime notation in more detail later in this chapter, but for now, the 'static notation is telling the compiler that the error string will stay around for the entire runtime of the program.
This makes sense, as we would hate to lose the error message because we moved out of scope. Also, it's an error, so we should be ending the program soon. If we want to tolerate an outcome, we should be reaching for the option and handling None. We can also signpost a little more with the expect function as opposed to using unwrap. It still unwraps the result, but adds an extra message in the error trace:
let result: i8 = error_check(true).expect("this has been caught");
We can also directly throw errors with the panic! macro:
    panic!("throwing some error");
We can also check for an error using is_err:
    result.is_err()
This returns a bool value. As we can see, Rust supports a range of error handling. It is advised to keep these as simple as possible. For most processes in a simple web app, unwrapping straight away and throwing the error as soon as possible will manage most situations.
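To tie this back to the cache example, the following sketch validates a key before it would be inserted. No Redis client is involved; the check_key function and the rule that keys must start with "user:" are assumptions made purely for illustration:

```rust
// Hypothetical rule: cache keys must start with "user:".
fn check_key(key: &str) -> Result<&str, String> {
    if key.starts_with("user:") {
        Ok(key)
    } else {
        Err(format!("unexpected key: {}", key))
    }
}

fn main() {
    let good = check_key("user:1");
    let bad = check_key("order:9");
    // is_err lets us inspect the outcome without unwrapping it
    println!("good is an error: {}", good.is_err());
    println!("bad is an error: {}", bad.is_err());
    match bad {
        Ok(key) => println!("safe to insert {}", key),
        Err(message) => println!("{}", message),
    }
}
```

Here, the error carries an owned String rather than a 'static string literal so that the offending key can be included in the message.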
Now that we can utilize basic data structures while navigating Rust's quirks, we have to address problems around controlling the ownership of these data structures.
Controlling variable ownership
As Rust does not have a garbage collector, it maintains memory safety by enforcing strict rules around variable ownership that are enforced when compiling. These rules can initially bite developers from dynamic languages and lead to frustration, giving Rust its false steep learning curve reputation. However, if these rules are understood early, the helpful compiler makes it straightforward to adhere to them. Rust's compile-time checking is done to protect against the following memory errors:
- Use after frees: This is where memory is accessed once it has been freed, which can cause crashes. It can also allow hackers to execute code via this memory address.
- Dangling pointers: This is where a reference points to a memory address that no longer houses the data that the pointer was referencing. Essentially, this pointer now points to null or random data.
- Double frees: This is where allocated memory is freed, and then freed again. This can cause the program to crash and increases the risk of sensitive data being revealed. This also enables a hacker to execute arbitrary code.
- Segmentation faults: This is where the program tries to access the memory it's not allowed to access.
- Buffer overrun: An example of this is reading off the end of an array. This can cause the program to crash.
Protection is achieved by Rust following ownership rules. These ownership rules flag code that can lead to the memory errors we just mentioned. If they are broken, they are flagged up as compile-time errors. The rules are as follows:
- Values are owned by the variables assigned to them.
- As soon as the variable goes out of scope, it is deallocated from the memory it is occupying.
- Values can be used by other variables, as long as we adhere to the following rules:
  - Copy: This is where the value is copied. Once it has been copied, the new variable owns the value, and the existing variable also owns its own value.
  - Move: This is where the value is moved from one variable to another. However, unlike copy, the original variable no longer owns the value.
  - Immutable borrow: This is where another variable can reference the value of another variable. If the variable that is borrowing the value falls out of scope, the value is not deallocated from memory as the variable borrowing the value does not have ownership.
  - Mutable borrow: This is where another variable can reference and write the value of another variable. If the variable that is borrowing the value falls out of scope, the value is not deallocated from memory as the variable borrowing the value does not have ownership.
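The four ways of using a value can be sketched in one short program. The function names consume, peek, and append_mark are hypothetical and exist only to label each rule:

```rust
// Move: takes ownership of the String and hands back its length.
fn consume(text: String) -> usize {
    text.len()
}

// Immutable borrow: reads the value without owning it.
fn peek(text: &String) -> usize {
    text.len()
}

// Mutable borrow: changes the value without owning it.
fn append_mark(text: &mut String) {
    text.push('!');
}

fn main() {
    // Copy: integers are copied, so both variables stay usable
    let a: i8 = 1;
    let b: i8 = a;
    println!("{} {}", a, b);

    let mut text = String::from("one");
    println!("{}", peek(&text)); // immutable borrow
    append_mark(&mut text); // mutable borrow
    println!("{}", text);
    println!("{}", consume(text)); // move: text cannot be used after this line
}
```

Adding another use of text after the call to consume should be rejected by the compiler, because ownership has moved into the function.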
Considering that scopes play a big role in the ownership rules, we'll explore them in more detail in the next section.
Scopes
The key rule to remember when it comes to ownership in Rust is that when let is used to create a variable, that variable is the only one that owns the resource. Therefore, if the resource is moved or reassigned, then the initial variable no longer owns the resource.
Once the scope has ended, then the variable and the resource are deleted. A good way to demonstrate this is through scopes. Scopes in Rust are defined by curly brackets. The classic way of demonstrating this is through the following example:
    fn main() {
        let one: String = String::from("one");
        {
            println!("{}", one);
            let two: String = String::from("two");
        }
        println!("{}", one);
        println!("{}", two);
    }
Commenting out the last print statement will enable the code to run. Keeping it will cause the code to fail to compile because two is created inside the inner scope and then deleted when the inner scope ends. We can also see that one is available in the outer scope and the inner scope. However, it gets interesting when we pass the variable into another function:
    fn print_number(number: String) {
        println!("{}", number);
    }

    fn main() {
        let one: String = String::from("one");
        print_number(one);
        println!("{}", one);
    }
The error from the preceding code tells us a lot about what's going on:
    6 |     let one: String = String::from("one");
      |         --- move occurs because `one` has type `std::string::String`, which does not implement the `Copy` trait
    7 |     print_number(one);
      |                  --- value moved here
    8 |     println!("{}", one);
      |                    ^^^ value borrowed here after move
The stem of the error has occurred because String does not implement the Copy trait. This is not surprising as we know that String is a type of wrapper implemented as a vector of bytes. This vector holds a reference to str, the capacity of str in the heap memory, and the length of str, as denoted in the following diagram:

Figure 1.1 – String relationship to str
Having multiple owners of the value breaks our rules. Passing one through our print_number function moves it into another scope, which is then destroyed. If we passed ownership to a function but still allowed references outside the function later on, these references would be pointing to freed memory, which is unsafe.
The compiler is very helpful in telling us that the variable has been moved, which is why it cannot print it. It also gives us another hint. Here, you can see that the built-in println! macro tries to borrow String. When you borrow a variable, you can access the data, but for only as long as you need it. Borrowing can be done by using the & operator. Therefore, we can get around this issue with the following code:
    fn alter_number(number: &mut String) {
        number.push("!".chars().next().unwrap());
    }

    fn print_number(number: &String) {
        println!("{}", number);
    }

    fn main() {
        let mut one: String = String::from("one");
        print_number(&one);
        alter_number(&mut one);
        println!("{}", one);
    }
In the preceding code, we borrowed the string to print it. In the alter_number function, we did a mutable borrow, meaning that we can alter the value. We then defined a string literal, turned it into an iterator over its chars, called the next function to get the first char (which returns an Option), and then unwrapped it and appended it to the string. We can see by the final print statement that the one variable has been changed.
If we were to try and change the value in the print_number function, we would get an error because it's not a mutable borrow, despite one being mutable. When it comes to immutable borrows, we can make as many as we like. For instance, if we are borrowing for a function, the function does not need to own the variable. If there is a mutable borrow, then only one mutable borrow can exist at one time, and during that lifetime, no immutable borrows can be made. This is to avoid data races.
With integers, this is easier as they implement the Copy trait. This means that we don't have to borrow when passing a value whose type implements Copy into a function. It's copied for us. The following code prints an integer and increases it by one:
    fn alter_number(number: &mut i8) {
        *number += 1;
    }

    fn print_number(number: i8) {
        println!("{}", number);
    }

    fn main() {
        let mut one: i8 = 1;
        print_number(one);
        alter_number(&mut one);
        println!("{}", one);
    }
Here, we can see that the integer isn't moved into print_number; it's copied. However, we still have to pass a mutable reference if we want to alter the number. We can also see that we've added a * operator to the number when altering it. This is a dereference. By performing this, we have access to the integer value that we're referencing. Remember that we can directly pass the integer into the print_number function because i8 implements the Copy trait: the size of every i8 integer is known and fixed, so copying it is cheap.
Running through lifetimes
Now that we have borrowing and referencing figured out, we can look into lifetimes. Remember that a borrow is not sole ownership. Because of this, there is a risk that we could reference a variable that's deleted. This can be demonstrated in the following classic demonstration of a lifetime:
    fn main() {
        let one;
        {
            let two: i8 = 2;
            one = &two;
        } // -----------------------> two lifetime stops here
        println!("r: {}", one);
    }
This gives us the following error:
      |         one = &two;
      |               ^^^^ borrowed value does not live long enough
      |     }
      |     - `two` dropped here while still borrowed
      |
      |     println!("r: {}", one);
      |                       --- borrow later used here
Since the two variable is defined in the inner scope, it's deleted at the end of the inner scope, meaning that the end of its lifetime is at the end of the inner scope. However, the lifetime of the one variable, which references two, carries on to the end of the scope of the main function. Therefore, the lifetimes are not equal.
While it is great that this is flagged when compiling, Rust does not stop here. This concept also translates to functions. Let's say that we build a function that references two integers, compares them, and returns the highest integer reference. The function is an isolated piece of code. In this function, we can denote the lifetimes of the two integers. This is done by using the ' prefix, which is a lifetime notation. The names of the notations can be anything you wish, but it's a general convention to use a, b, c, and so on. Let's look at an example:
    fn get_highest<'a>(first_number: &'a i8, second_number: &'a i8) -> &'a i8 {
        if first_number > second_number {
            first_number
        } else {
            second_number
        }
    }

    fn main() {
        let one: i8 = 1;
        {
            let two: i8 = 2;
            let outcome: &i8 = get_highest(&one, &two);
            println!("{}", outcome);
        }
    }
As we can see, the first and second parameters have the same lifetime notation of a. They will both have to be present for the duration of the function. We also have to note that the function returns a reference to an i8 integer with the lifetime of a. Therefore, the compiler knows that we cannot rely on the outcome outside the inner scope. However, we may want to just use the two variable that is defined in the inner scope for reference in the function, but not for the result.
This might be a little convoluted, so to demonstrate this, let's develop a function that checks the one variable against the two variable. If one is lower than two, then we return zero; otherwise, we return the value of one:
    fn filter<'a, 'b>(first_number: &'a i8, second_number: &'b i8) -> &'a i8 {
        if first_number < second_number {
            &0
        } else {
            first_number
        }
    }

    fn main() {
        let one: i8 = 1;
        let outcome: &i8;
        {
            let two: i8 = 2;
            outcome = filter(&one, &two);
        }
        println!("{}", outcome);
    }
Here, we assigned the lifetime of 'a to first_number, and the lifetime of 'b to second_number. Using 'a and 'b, we are telling the compiler that the lifetimes are different. We then tell the compiler in the return typing of the function that the function returns a reference to an i8 integer with the lifetime of 'a. Therefore, we can rely on the result of the filter function, even if the lifetime of second_number finishes.
If we switch the second_number lifetime to 'a, we get the following expected error:
      |         outcome = filter(&one, &two);
      |                                ^^^^ borrowed value does not live long enough
      |     }
      |     - `two` dropped here while still borrowed
      |     println!("{}", outcome);
      |                    ------- borrow later used here
Even though we're still just returning first_number that is available in the outer scope, we're telling the compiler that we're returning a variable with the 'a lifetime, which is assigned to first_number and second_number. The compiler is going to side with the shortest lifetime to be safe when both lifetimes are denoted to be the same in the function.
Now that we understand the quirks behind data types, borrowing, and lifetimes, we're ready to build our own structs that have the functionality to create a hash map that accepts a range of data types.
Building structs
In dynamic languages, classes have been the bedrock of developing data structures with custom functionality. In terms of Rust, structs enable us to define data structures with functionality. To mimic a class, we can define a Human struct:
    struct Human {
        name: String,
        age: i8,
        current_thought: String,
    }

    impl Human {
        fn new(input_name: &str, input_age: i8) -> Human {
            return Human {
                name: input_name.to_string(),
                age: input_age,
                current_thought: String::from("nothing"),
            };
        }

        fn with_thought(mut self, thought: &str) -> Human {
            // convert the borrowed &str into an owned String before storing it
            self.current_thought = thought.to_string();
            return self;
        }

        fn speak(&self) {
            println!("Hello my name is {} and I'm {} years old.", &self.name, &self.age);
        }
    }

    fn main() {
        let developer = Human::new("Maxwell Flitton", 31);
        developer.speak();
        println!("currently I'm thinking {}", developer.current_thought);

        // with_thought expects a &str, so we pass a string literal
        let new_developer = Human::new("Grace", 30).with_thought("I'm Hungry");
        new_developer.speak();
        println!("currently I'm thinking {}", new_developer.current_thought);
    }
This looks very familiar. Here, we have a Human struct that has name, age, and current_thought attributes. The impl block is associated with the Human struct. The new function inside the impl block is essentially a constructor for the Human struct. The constructor states that current_thought is a string that's been initialized with nothing because we want it to be an optional field.
We can define the optional current_thought field by calling the with_thought function directly after calling the new function, which we can see in action when we define new_developer. In these functions, self is much like self in Python, and also like this in JavaScript, as it refers to the instance of the Human struct.
Now that we understand structs and their functionality, we can revisit hash maps to make them more functional. Here, we will exploit enums to allow the hash map to accept an integer or a string:
    use std::collections::HashMap;

    enum AllowedData {
        S(String),
        I(i8),
    }

    struct CustomMap {
        body: HashMap<String, AllowedData>,
    }
Now that the hash map has been hosted as a body attribute, we can define our own constructor, get, insert, and display functions:
    impl CustomMap {
        fn new() -> CustomMap {
            return CustomMap { body: HashMap::new() };
        }

        fn get(&self, key: &str) -> &AllowedData {
            return self.body.get(key).unwrap();
        }

        fn insert(&mut self, key: &str, value: AllowedData) {
            self.body.insert(key.to_string(), value);
        }

        fn display(&self, key: &str) {
            match self.get(key) {
                AllowedData::I(value) => println!("{}", value),
                AllowedData::S(value) => println!("{}", value),
            }
        }
    }

    fn main() {
        // defining a new hash map
        let mut map = CustomMap::new();
        // inserting two different types of data
        map.insert("test", AllowedData::I(8));
        map.insert("testing", AllowedData::S("test value".to_string()));
        // displaying the data
        map.display("test");
        map.display("testing");
    }
Now that we can build structs and exploit enums to handle multiple data types, we can tackle more complex problems in Rust. However, as the problem's complexity increases, the chance of repeating code also increases. This is where traits come in.
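The same enum approach also lifts the restriction that a vector only accommodates one data type. The following is a minimal sketch under that assumption; the describe function is a hypothetical helper that returns text so that the outcome can be printed or compared:

```rust
enum AllowedData {
    S(String),
    I(i8),
}

// Turns either variant into text so it can be printed or compared.
fn describe(item: &AllowedData) -> String {
    match item {
        AllowedData::S(value) => format!("string: {}", value),
        AllowedData::I(value) => format!("integer: {}", value),
    }
}

fn main() {
    // every element has the type AllowedData, even though the payloads differ
    let mixed: Vec<AllowedData> = vec![
        AllowedData::I(8),
        AllowedData::S("test value".to_string()),
    ];
    for item in mixed.iter() {
        println!("{}", describe(item));
    }
}
```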
Verifying with traits
As we can see, enums can empower our structs so that they can handle multiple types. The same approach can be applied to any function or data structure. However, it can lead to a lot of repetition. Take, for instance, a user struct. Users have a core set of values, such as a username and password. However, they could also have extra functionality based on roles. With users, we have to check roles before firing certain processes.
We also want to add the same functionality to a number of different user types. We can do this with traits. In this sense, we're going to use traits like a mixin. Here, we will create three traits for a user struct: a trait for editing data, another for creating data, and a final one for deleting data:
trait CanEdit {
    fn edit(&self) {
        println!("user is editing");
    }
}

trait CanCreate {
    fn create(&self) {
        println!("user is creating");
    }
}

trait CanDelete {
    fn delete(&self) {
        println!("user is deleting");
    }
}
Here, if a struct implements a trait, then it can use and override the functions defined in the trait block. Next, we can define an admin user struct that implements all three traits:
struct AdminUser {
    name: String,
    password: String,
}

impl CanDelete for AdminUser {}
impl CanCreate for AdminUser {}
impl CanEdit for AdminUser {}
Now that our user struct has implemented all three traits, we can create a function that only accepts users that have the CanDelete trait implemented:
fn delete<T: CanDelete>(user: T) -> () {
    user.delete();
}
Similar to the lifetime annotation, we use angle brackets before the input definitions, this time to declare a generic type, T, that must implement the CanDelete trait. If we create a general user struct and we don't implement the CanDelete trait for it, Rust will fail to compile if we try to pass the general user through the delete function; it will complain, stating that the struct does not implement the CanDelete trait.
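The effect of the trait bound can be sketched in a small self-contained program. The GuestUser struct is a hypothetical stand-in for a user type without the CanDelete trait, and the trait function returns its message instead of printing it, which is an adaptation made here so that the result is easy to inspect:

```rust
trait CanDelete {
    // Adapted for this sketch: returns the message rather than printing it.
    fn delete(&self) -> String {
        String::from("user is deleting")
    }
}

struct AdminUser {
    name: String,
}

// Hypothetical user type that does NOT implement CanDelete.
struct GuestUser {
    name: String,
}

impl CanDelete for AdminUser {}

// Only types that implement CanDelete are accepted here.
fn delete<T: CanDelete>(user: T) -> String {
    user.delete()
}

fn main() {
    let admin = AdminUser { name: String::from("root") };
    let guest = GuestUser { name: String::from("visitor") };

    println!("{} calls delete", admin.name);
    println!("{}", delete(admin));

    // Uncommenting the next line stops the program from compiling,
    // because GuestUser does not satisfy the CanDelete bound.
    // delete(guest);
    println!("{} is not allowed to delete", guest.name);
}
```

The check happens entirely at compile time, so no role lookup is needed at runtime for this particular guarantee.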
Now, with what we know, we can develop a user struct that builds on a base user struct and has traits that allow us to use it in different functions. Rust does not directly support inheritance. However, we can combine structs with basic composition:
struct BaseUser {
    name: String,
    password: String
}

struct GeneralUser {
    super_struct: BaseUser,
    team: String
}

impl GeneralUser {
    fn new(name: String, password: String, team: String) -> GeneralUser {
        return GeneralUser{super_struct: BaseUser{name, password}, team: team}
    }
}

impl CanEdit for GeneralUser {}

impl CanCreate for GeneralUser {
    fn create(&self) -> () {
        println!("{} is creating under a {} team",
                 self.super_struct.name, self.team);
    }
}
Here, we defined what attributes are needed by a user in the base user struct. We then housed that under the super_struct attribute for the general user struct. Once we did this, we performed the composition in the constructor function, which is defined as new, and then we implemented two traits for this general user. In the CanCreate trait, we overrode the create function and utilized the team attribute that was given to the general user.
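To see the composed struct in use, the following sketch constructs a general user and calls both the default and the overridden trait functions. As an adaptation for this sketch, the trait functions return their messages instead of printing them, the constructor takes string slices, and the name function is a hypothetical helper that delegates to the inner base user:

```rust
trait CanEdit {
    fn edit(&self) -> String {
        String::from("user is editing")
    }
}

trait CanCreate {
    fn create(&self) -> String {
        String::from("user is creating")
    }
}

struct BaseUser {
    name: String,
    password: String,
}

struct GeneralUser {
    super_struct: BaseUser,
    team: String,
}

impl GeneralUser {
    fn new(name: &str, password: &str, team: &str) -> GeneralUser {
        GeneralUser {
            super_struct: BaseUser {
                name: name.to_string(),
                password: password.to_string(),
            },
            team: team.to_string(),
        }
    }

    // Hypothetical helper: delegates to the composed base user.
    fn name(&self) -> &str {
        &self.super_struct.name
    }
}

// Uses the default implementation of edit.
impl CanEdit for GeneralUser {}

// Overrides the default implementation of create.
impl CanCreate for GeneralUser {
    fn create(&self) -> String {
        format!("{} is creating under a {} team", self.name(), self.team)
    }
}

fn main() {
    let user = GeneralUser::new("sam", "secret", "finance");
    println!("{}", user.edit());
    println!("{}", user.create());
}
```

A small delegating function such as name keeps callers from reaching through super_struct directly, which is one common way to make composition feel closer to inheritance.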
As we can see, composing a struct from a base struct is fairly straightforward. Traits enable us to slot in functionality like mixins, and they go one step further by letting functions constrain which structs they accept. Traits get even more powerful than this, and it's advised that you read more about them to enhance your ability to solve problems in Rust.
With what we know about traits, we can reduce code complexity and repetition when solving problems. However, a deeper dive into traits at this point will have diminishing returns when it comes to developing web apps. Another widely used method for structs and processes is macros.
Metaprogramming with macros
Metaprogramming can generally be described as a way in which the program can manipulate itself based on certain instructions. Considering the strong typing Rust has, one of the simplest ways in which we can metaprogram is by using generics. A classic example of demonstrating generics is through coordinates:
struct Coordinate <T> {
    x: T,
    y: T
}

fn main() {
    let one = Coordinate{x: 50, y: 50};
    let two = Coordinate{x: 500, y: 500};
    let three = Coordinate{x: 5.6, y: 5.6};
}
Here, the compiler finds every place where the Coordinate struct is used and, at compile time, generates a concrete version of the struct for each type that it was used with. The main mechanism of metaprogramming in Rust, however, is macros. Macros enable us to abstract code. We've already been using macros in our print functions. The ! notation at the end of the function name denotes that a macro is being called. Defining our own macros is a blend of defining a function and writing a match statement, with each input labeled using a $name: kind notation that looks a little like a lifetime annotation. In order to demonstrate this, we will define a macro that capitalizes a string:
macro_rules! capitalize {
    ($a: expr) => {
        let mut v: Vec<char> = $a.chars().collect();
        v[0] = v[0].to_uppercase().nth(0).unwrap();
        $a = v.into_iter().collect();
    }
}

fn main() {
    let mut x = String::from("test");
    capitalize!(x);
    println!("{}", x);
}
Instead of using the term fn, we use the macro_rules! definition. We then say that $a is the expression that's passed into the macro. We get the expression, convert it into a vector of chars, uppercase the first char, and then convert it back into a string.
Note that we don't return anything in the capitalize macro and that when we call the macro, we don't assign its result to a variable. However, when we print the x variable at the end, we can see that it is capitalized. This does not behave like an ordinary function, because the macro's body is expanded in place and assigns to x directly. We also have to note that we didn't define a type. Instead, we just said it was an expression; the expanded code is still type-checked, including its trait requirements. Passing an integer into the macro results in the following error:
|     capitalize!(32);
|     ---------------- in this macro invocation
|
= help: the trait `std::iter::FromIterator<char>` is not implemented for `{integer}`
Lifetimes, blocks, literals, paths, meta, and more can also be passed instead of an expression. While it's important to have a brief understanding of what's under the hood of a basic macro for debugging and further reading, diving more into developing complex macros will not help us when it comes to developing web apps.
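As a brief illustration of passing something other than an expression, the following sketch uses the ident and literal fragment specifiers to generate a function. The constant_fn macro and the answer function it generates are hypothetical names invented for this example:

```rust
// Hypothetical macro: takes an identifier and an integer literal, and
// expands into a function with that name which returns the literal.
macro_rules! constant_fn {
    ($name: ident, $value: literal) => {
        fn $name() -> i32 {
            $value
        }
    };
}

// Expands to: fn answer() -> i32 { 42 }
constant_fn!(answer, 42);

fn main() {
    println!("{}", answer());
}
```

Here the macro is invoked outside of any function because it expands into a function definition rather than into statements.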
We must remember that macros are a last resort and should be used sparingly. Errors that are thrown in macros can be hard to debug. In web development, a lot of the macros are already defined in third-party packages. Because of this, we do not need to write macros ourselves to get a web app up and running. Instead, we will mainly be using derive macros out of the box.
Derive macros can be analogous to decorators in JavaScript and Python. They sit on top of a struct or enum and add functionality to it by generating trait implementations. A good way to demonstrate this in action is by revisiting our coordinate struct. Here, we will put it through a print function we define, and then try and print it again with the built-in print macro:
struct Coordinate {
    x: i8,
    y: i8
}

fn print(point: Coordinate) {
    println!("{} {}", point.x, point.y);
}

fn main() {
    let test = Coordinate{x: 1, y:2};
    print(test);
    println!("{}", test.x)
}
Unsurprisingly, we get the following error when compiling:
|     let test = Coordinate{x: 1, y:2};
|         ---- move occurs because `test` has type `Coordinate`, which does not implement the `Copy` trait
|     print(test);
|           ---- value moved here
|     println!("{}", test.x)
|                    ^^^^^^ value borrowed here after move
Here, the error tells us that the coordinate was moved into our function and was then borrowed afterward. We can solve this with the & notation. However, it's also worth noting the part of the error stating that our struct does not implement the Copy trait. Instead of trying to implement the Copy trait ourselves, we can use a derive macro to give it to our struct:
#[derive(Clone, Copy)]
struct Coordinate {
    x: i8,
    y: i8
}
Now, the code will run. Because the struct implements the Copy trait, the coordinate is copied rather than moved when we pass it into our print function. We can stack these traits. By merely adding the Debug trait to the derive macro, we can print out the whole struct using the :? format specifier in the print macro:
#[derive(Debug, Clone, Copy)]
struct Coordinate {
    x: i8,
    y: i8
}

fn main() {
    let test = Coordinate{x: 1, y:2};
    println!("{:?}", test)
}
This gives us a lot of powerful functionality in web development. For instance, we will be using them in JSON serialization using the serde crate:
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize)]
struct Coordinate {
    x: i8,
    y: i8
}
With this, we can pass the coordinate into the crate's functions to serialize into JSON, and then deserialize. We can create our own derive macros, but the code behind our own derive macros has to be packaged in its own crate. While we will go over cargo and file structure in the next chapter, we will not be building our own derive macros.
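Returning to the move error from earlier in this section, deriving Copy is not the only fix. The & notation mentioned there can be sketched as follows, with no derive macro involved. The describe function is a hypothetical replacement for the print function that returns the text instead of printing it:

```rust
struct Coordinate {
    x: i8,
    y: i8,
}

// Borrows the coordinate instead of taking ownership of it.
fn describe(point: &Coordinate) -> String {
    format!("{} {}", point.x, point.y)
}

fn main() {
    let test = Coordinate { x: 1, y: 2 };
    println!("{}", describe(&test));
    // `test` was only borrowed, so it can still be used here.
    println!("{}", test.x);
}
```

Borrowing avoids copying altogether, which tends to matter more for larger structs than for two i8 fields.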
Summary
When it comes to Rust, we saw that there are some traps if you're coming from a dynamic programming language. However, with a little bit of knowledge of referencing and basic memory management, we can avoid common pitfalls and write safe, performant code in a quick fashion that can handle errors. By utilizing structs, composition, and traits, we can build objects that are analogous to classes in standard dynamic programming languages. On top of this, these traits enabled us to build mixin-like functionality that not only enables us to slot in functionality when it's useful to us, but also perform checks on the structs through typing. This ensures that the container or function is processing structs with certain attributes belonging to the trait that we can utilize in the code.
With our fully functioning structs, we bolted on even more functionality with macros and looked under the hood of basic macros by building our own capitalize function, giving us guidance for further reading and debugging. We also got to see a brief demonstration of how powerful macros, when combined with structs, can be in web development with JSON serialization.
With this brief introduction to Rust, we can now move on to the next chapter and look into setting up a Rust environment on our own computers. This will allow us to structure files and code so that we can build programs that can solve real-world problems.
Questions
- What is the difference between str and String?
- Why can't string literals be passed through a function (string literal meaning str as opposed to &str)?
- How do we access the data belonging to a key in a hash map?
- When a function results in an error, can we handle other processes or will the error crash the program instantly?
- When borrowing, how does Rust ensure that there's no data race?
- When would we need to define two different lifetimes in a function?
- How can structs utilize inheritance?
- How can we slot in extra functionality and freedom into a struct?
- How do we allow a container or function to accept different data structures?
- What's the quickest way to add a trait, such as Copy, to a struct?
Further reading
- Hands-On Functional Programming in Rust (2018) by Andrew Johnson, Packt Publishing
- Mastering Rust (2019) by Rahul Sharma and Vesa Kaihlavirta, Packt Publishing
- The Rust Programming Language (2018): https://doc.rust-lang.org/stable/book/