When we speak about complexity, we mean that something is difficult to understand through simple reasoning. For example, it's hard to predict the result of a program, or of a piece of code that uses several levels of recursion, without doing a more detailed analysis of the process.
Often, this complexity is inherent to the problem we're trying to solve. We call this essential complexity, since we cannot dispense with it. A recurrent exercise in mathematics and physics, which has allowed us to make significant advances, is to simplify complex models by eliminating sources of complexity that aren't inherent to the phenomenon we're trying to describe. The problem in software development is that this complexity is an essential part of the problem that's being solved, so it's not so easy to simplify the problem.
Although this complexity is not unique to software, it can be harder to show. For example, if we're constructing a building, which is a structure with plenty of complexity, it's easier to point out a part of the structure that isn't correctly aligned or a dimension that doesn't comply with the blueprints, because errors and omissions are pretty obvious. The invisible nature of software means that we need to put in more effort to identify the cause of an error in a complex system.
One of the most critical advances in the software industry has been the development of high-level languages that simplify many of the possible causes of error due to complexity. This was achieved by encapsulating problems in abstract structures, hierarchies, and modules, thus allowing programmers to improve their productivity and the comprehension of their written code. This also enabled them to focus on solving the core of the problem and not deal with disk access, memory registers, and precise details surrounding the architecture or the hardware.
The models these languages have been built on carry some sources of complexity that Mark Marron identified in his work, and that Bosque aims to avoid from the design stage of the new programming language.
Now, let's understand each of these sources of complexity and how Bosque avoids them in its design.
Immutability
Mutability can seem more intuitive at a high level, but it is harder to comprehend in detail, and it makes analysis tools harder to build because any operation may affect the application's state. It also implies having to compute logical frames, that is, to work out which parts of the state an operation leaves unchanged. This can be an arduous task for developers, as they must keep track of the state changes for each entity in their applications.
Mutability is the trait of objects that allows them to be changed after they have been created. This concept can be intuitive at first glance because we are used to thinking about transforming things through processes, especially if we have spent a lot of time programming with OOP.
However, this adds an extra level of complexity since we must be aware of all the changes our objects will undergo during the entire execution process.
Additionally, if we consider that our code uses methods and functions provided by external libraries or language helpers, which can change the state of our objects or entities, then keeping track of all the events that modify an object's state becomes arduous. Consequently, we have less predictable programs.
Let's look at an example to understand this better.
Let's say we have the following code in JavaScript, which describes an object (Charles, in this case):
let person = {
  name: "Charles"
};
Now, let's create a function that assigns a new value to save this person's age:
function setAgeToPerson(person, age) {
  person.age = age;
}
Now we can add the age property to person by calling the setAgeToPerson() function:
setAgeToPerson(person, 21);
So, if we query the properties of person, we will have age and name.
However, let's imagine that a third-party library provides the setAgeToPerson() method. If we had previously assigned a value to age, the call would silently overwrite it, and we could incur an error that is difficult to identify since the same property is being changed from different places.
This can become a problem if we have many events throughout the code that alter the state or composition of an object, which is even more complex if we consider asynchronous executions or dynamic dispatching.
Due to this and the Bosque philosophy of eliminating sources of complexity, immutability is adopted for all the values throughout the language.
Contrary to mutability, immutability is the concept that objects cannot change state once they've been created. This concept may be familiar to you if you have had experience with functional languages.
Other than improving the predictability of the code and minimizing errors, immutability has many benefits when it's time to understand the code that's been written. The following are some other benefits of immutability:
- Easy-to-test code
- Thread-safe by design
- No invalid states
- Better encapsulation
In summary, immutability in Bosque allows our applications to be more predictable, secure, and stable, thus allowing us to clearly identify the state of our application through the code that's written. It doesn't let hidden code make unexpected changes.
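To make the contrast with the earlier setAgeToPerson() example concrete, here is a minimal sketch of how we could approximate immutability in JavaScript. It relies on the built-in Object.freeze(), and the withAge() and isMutationRejected() helpers are hypothetical names introduced only for this illustration. Bosque enforces immutability at the language level, so this is an analogy rather than a description of how Bosque works internally.

```javascript
// Freeze the object so that no property can be added, changed, or removed.
const person = Object.freeze({ name: "Charles" });

// Hypothetical helper: instead of mutating its argument, it returns a new
// frozen object that copies the existing properties and adds `age`.
function withAge(source, age) {
  return Object.freeze({ ...source, age });
}

// Hypothetical helper: tries to mutate `target` and reports whether the
// attempt was rejected. The function-level "use strict" directive makes
// writes to a frozen object throw a TypeError instead of failing silently.
function isMutationRejected(target) {
  "use strict";
  try {
    target.age = 99;
    return false;
  } catch (error) {
    return error instanceof TypeError;
  }
}

const personWithAge = withAge(person, 21);

console.log(person.age);                 // undefined: the original is untouched
console.log(personWithAge.age);          // 21
console.log(isMutationRejected(person)); // true
```

Because every change produces a new value, any code that still holds the original person can rely on it never changing underneath it.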
Loop-free
Loops are part of many junior programmers' training, so it is common to think about them when we're solving problems. However, using them can introduce errors that could otherwise be avoided.
As we know, when we use loops, we must be careful to use the correct comparison operator, for example, not writing <= where < is needed. This is a frequent mistake, almost like forgetting the semicolon at the end of the line. We also have to be careful not to create infinite loops.
On the other hand, loops usually involve an additional effort to understand their intent. This is because the programmer must go through a mental process, similar to reverse engineering, to get a better idea of the result that will be obtained when they execute a piece of code that contains some nested loops.
Often, a loop can be replaced by a higher-order function that expresses its primary intent more clearly. Let's look at an example of this. Let's take an array of numbers in JavaScript:
const arrayOfNumbers = [17, -4, 3.2, 8.9, -1.3, 0, Math.PI];
To obtain the sum of all the values of the array in JavaScript, we could write something like this:
let sum = 0;
arrayOfNumbers.forEach((number) => {
  sum += number;
});
console.log(sum);
However, we could use the reduce function to avoid both the loop and the mutable variable, which makes the final intention easier to comprehend:
// The initial value 0 keeps the result well defined for an empty array too.
const sum = arrayOfNumbers.reduce(
  (accumulator, number) => accumulator + number,
  0
);
console.log(sum);
In summary, loops can demand additional effort to read the written code, and they reduce its predictability.
Loop-free coding makes our code declarative instead of imperative, thus simplifying it so that we have a better understanding of its purpose. This also allows us to apply generalized reasoning and integrates better with immutable data.
Bosque encourages us to write loop-free code by providing a series of useful methods for working with collections and arrays through algebraic bulk operations.
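To give a feel for this style before we reach Bosque's own collection methods, here is a small sketch that uses JavaScript's built-in filter, map, and reduce array methods. The data and variable names are made up for the example (integers are used to keep the arithmetic exact), and JavaScript's methods are only an analogy for Bosque's bulk operations, not its API.

```javascript
const readings = [17, -4, 3, 8, -1, 0];

// Each step states what we want, not how to iterate, and returns a new array.
const positives = readings.filter((value) => value > 0); // [17, 3, 8]
const doubled = positives.map((value) => value * 2);     // [34, 6, 16]
const total = doubled.reduce(
  (accumulator, value) => accumulator + value,
  0
);

console.log(total);           // 56
console.log(readings.length); // 6: the original array is left unchanged
```

There is no index variable and no comparison operator to get wrong, and none of the steps modifies the array it receives.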
Indeterminate behaviors
Sometimes, a program compiles without any apparent problems. However, when we execute the code, the result is different from what we had expected. This is called indeterminate behavior.
Some algorithms are non-deterministic by design, which implies that the same input can produce different results. This can increase the complexity of controlling possible error scenarios.
Some examples of this include attempting to assign values in a matrix beyond the established limits, an unexpected result from a math operation due to type inference, or concurrent code execution.
Bosque proposes programs with unique results for the same incoming parameters in order to simplify how the written code is understood. It also helps us avoid errors due to unforeseen execution flows or unexpected behavior.
To achieve this goal, it is necessary to eliminate sources of indeterminate behavior such as support for uninitialized variables, mutable data, unstable enumeration order, and so on.
Additionally, Bosque proposes eliminating environmental sources of indeterminism such as I/O, random numbers, and UUID generation. It transfers this responsibility to the runtime host and decouples it from the language core. Due to this, the host will be responsible for managing interaction with the environment.
That is why Bosque programmers will never see intermittent production failures and will always have more stable and reliable code.
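The idea of handing environmental sources over to the host can be sketched in JavaScript. In the following example, createTicketFromEnvironment(), createTicket(), and the fixedHost object are all hypothetical names invented for this illustration. They do not represent Bosque's actual host interface, which is not described here.

```javascript
// Non-deterministic: the identifier and the timestamp come from hidden
// environmental sources, so two calls with the same input give different results.
function createTicketFromEnvironment(title) {
  return {
    id: Math.random().toString(36).slice(2),
    title,
    createdAt: Date.now(),
  };
}

// Deterministic core: every environmental value is supplied explicitly by a
// (hypothetical) host object, so the same arguments always give the same result.
function createTicket(title, host) {
  return { id: host.nextId(), title, createdAt: host.now() };
}

// A fixed host, such as one we might use in a test.
const fixedHost = {
  nextId: () => "ticket-1",
  now: () => 0,
};

const firstTicket = createTicket("Fix login", fixedHost);
const secondTicket = createTicket("Fix login", fixedHost);

console.log(JSON.stringify(firstTicket) === JSON.stringify(secondTicket)); // true
```

The core function no longer decides where randomness or time comes from. That decision belongs to whoever supplies the host, which is what makes the result reproducible.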
Data invariant violations
An invariant is a property or condition that is guaranteed to always hold, so assumptions can be made about it without incurring errors. This allows for design-by-contract implementation.
Within a loop, the invariant is the condition that holds before and after every iteration and that carries the precondition toward the fulfillment of the postcondition. For example, if we want the sum of the numbers in an array, the invariant is that the accumulator always holds the sum of the values visited so far. Most programming languages provide operators to access and update individual elements in arrays, tuples, and objects. These operators change the state of an object through an imperative, multi-step process, which makes it difficult to track all the changes and leaves room for a range of mistakes.
Bosque proposes that we use algebraic bulk data operators, which help us focus on the overall intent of the code through algebraic reasoning instead of using individual steps. Let's look at some examples of bulk operations that have been written in Bosque.
We can get the elements located at positions 0 and 2 with the following code:
(@[7, 8, 9])@[0, 2]; // @[7, 9]
Alternatively, we can update the value at position 0 to 5 and assign the value 1 to position 3. Consequently, position 2 will be assigned none:
(@[7, 8])<~(0=5, 3=1); // @[5, 8, none, 1]
Then, we can add the value 5 at the end:
(@[7, 8])<+(@[5]); // @[7, 8, 5]
Don't worry if you don't fully understand the previous code. We'll learn how to build bulk functions in more detail in Chapter 3, Bosque Key features.
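If it helps in the meantime, the three operations above can be loosely mirrored in JavaScript. The project(), bulkUpdate(), and append() functions below are hypothetical helpers written for this comparison, and null stands in for Bosque's none. This is only an analogy to show the intent of each operation, not a faithful model of Bosque's semantics.

```javascript
// Projection: build a new array from the values at the requested positions.
function project(values, indices) {
  return indices.map((index) =>
    index < values.length ? values[index] : null
  );
}

// Bulk update: apply every (position, value) pair in one step, returning a
// new array. Positions that were never given a value are filled with null.
function bulkUpdate(values, updates) {
  const length = Math.max(
    values.length,
    ...[...updates.keys()].map((index) => index + 1)
  );
  return Array.from({ length }, (_, index) => {
    if (updates.has(index)) return updates.get(index);
    return index < values.length ? values[index] : null;
  });
}

// Append: return a new array with the extra values added at the end.
function append(values, extra) {
  return [...values, ...extra];
}

const projected = project([7, 8, 9], [0, 2]);                     // [7, 9]
const updated = bulkUpdate([7, 8], new Map([[0, 5], [3, 1]]));    // [5, 8, null, 1]
const appended = append([7, 8], [5]);                             // [7, 8, 5]

console.log(projected, updated, appended);
```

In each case, the whole change is described in a single expression, and the input array is never modified.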
Aliasing
At execution time, the same memory position can typically be accessed through different symbolic names. So, when we modify this value through one of these names, the value is changed for all the other names. We call this situation aliasing, which can generate unexpected behaviors and errors that will be difficult to identify.
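A short JavaScript sketch makes this visible. The object and variable names are made up for the example. Two names that refer to the same object are aliases of each other, whereas a copy made with the spread syntax is a separate object.

```javascript
const original = { name: "Charles", age: 21 };

// `alias` is not a copy: both names refer to the same object in memory.
const alias = original;
alias.age = 22;

console.log(original.age);       // 22: changed through the other name
console.log(alias === original); // true

// A shallow copy is a different object, so changing it leaves `original` alone.
const copy = { ...original };
copy.age = 30;

console.log(original.age);      // 22
console.log(copy === original); // false
```

Nothing in the line alias.age = 22 mentions original, which is exactly why this kind of change can be hard to trace in a larger program.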
A practical situation occurs in a buffer overflow scenario. As we know, some programming languages allow manual memory management, so if the amount of data written to the reserved area (the buffer) is not adequately controlled, the additional data will be written to the adjacent area. This overwrites the original content, which will produce unexpected behavior or a segmentation fault.
The programming languages that we use today have implementations based on particular hardware architectures. As a result, details such as how information in memory is reached through different names, or how pointer aliases work, can differ from one language to another.
On the other hand, using aliases requires that we maintain a specific program execution order to obtain the expected behavior. Writes must be carried out in the same order in which they appear in the code. This poses a big challenge for optimizations that reorder instructions.
The immutable nature of values in Bosque means that we only need to use read-only pointers, so analyzing aliases is not necessary, thus simplifying our problem.