Hands-On Reactive Programming in Spring 5

5 (2 reviews total)
By Oleh Dokuka , Igor Lozynskyi
    What do you get with a Packt Subscription?

  • Instant access to this title and 7,500+ eBooks & Videos
  • Constantly updated with 100+ new titles each month
  • Breadth and depth in over 1,000+ technologies
  1. Free Chapter
    Why Reactive Spring?
About this book

These days, businesses need a new type of system that can remain responsive at all times. This is achievable with reactive programming; however, the development of these kinds of systems is a complex task, requiring a deep understanding of the domain. In order to develop highly responsive systems, the developers of the Spring Framework came up with Project Reactor.

Hands-On Reactive Programming in Spring 5 begins with the fundamentals of Spring Reactive programming. You’ll explore the endless possibilities of building efficient reactive systems with the Spring 5 Framework along with other tools such as WebFlux and Spring Boot. Further on, you’ll study reactive programming techniques and apply them to databases and cross-server communication. You will advance your skills in scaling up Spring Cloud Streams and run independent, high-performant reactive microservices.

By the end of the book, you will be able to put your skills to use and get on board with the reactive revolution in Spring 5.1!

Publication date:
October 2018


Why Reactive Spring?

In this chapter, we are going to explain the concept of reactivity, looking at why reactive approaches are better than traditional approaches. To do this, we will look at examples in which traditional approaches failed. In addition to this, we will explore the fundamental principles of building a robust system, which is mostly referred to as reactive systems. We will also take an overview of the conceptual reasons for building message-driven communication between distributed servers, covering business cases in which reactivity fits well. Then, we will expand the meaning of reactive programming to build a fine-grained reactive system. We will also discuss why the Spring Framework team decided to include a reactive approach as the core part of Spring Framework 5. Based on the content of this chapter, we will understand the importance of reactivity and why it is a good idea to move our projects to the reactive world.

In this chapter, we will cover the following topics:

  • Why we need reactivity
  • The fundamental principles of the reactive system
  • Business cases in which a reactive system design matches perfectly
  • Programming techniques that are more suitable for a reactive system
  • Reasons for moving Spring Framework to reactivity

Why reactive?

Nowadays, reactive is a buzzword—so exciting but so confusing. However, should we still care about reactivity even if it takes an honorable place in conferences around the world? If we google the word reactive, we will see that the most popular association is programming, in which it defines the meaning of a programming model. However, that is not the only meaning for reactivity. Behind that word, there are hidden fundamental design principles aimed at building a robust system. To understand the value of reactivity as an essential design principle, let's imagine that we are developing a small business.

Suppose our small business is a web store with a few cutting-edge products at an attractive price. As is the case with the majority of projects in this sector, we will hire software engineers to solve any problems that we encounter. We opted for the traditional approaches to development, and, during a few development interactions, we created our store.

Usually, our service is visited by about one thousand users per hour. To serve the usual demand, we bought a modern computer and ran the Tomcat web server as well as configuring Tomcat's thread pool with 500 allocated threads. The average response time for the majority of user requests is about 250 milliseconds. By doing a naive calculation of the capacity for that configuration, we can be sure that the system can handle about 2,000 user requests per second. According to statistics, the number of users previously mentioned produced around 1,000 requests per second on average. Consequently, the current system's capacity will be enough for the average load.

To summarize, we configured our application with the margin regarding capacity. Moreover, our web store had been working stably until the last Friday in November, which is Black Friday.

Black Friday is a valuable day for both customers and retailers. For the customer, it is a chance to buy goods at discounted prices. And for retailers, it is a way to earn money and popularize products. However, this day is characterized by an unusual influx of clients, and that may be a significant cause of failure in production.

And, of course, we failed! At some point in time, the load exceeded all expectations. There were no vacant threads in the thread pool to process user requests. In turn, the backup server was not able to handle such an unpredictable invasion, and, in the end, this caused a rise in the response time and periodic service outage. At this point, we started losing some user requests, and, finally, our clients became dissatisfied and preferred dealing with competitors.

In the end, a lot of potential customers and money were lost, and the store's rating decreased. This was all a result of the fact that we couldn't stay responsive under the increased workload.

But, don't worry, this is nothing new. At one point in time, giants such as Amazon and Walmart also faced this problem and have since found a solution. Nevertheless, we will follow the same roads as our predecessors, gaining an understanding of the central principles of designing robust systems and then providing a general definition for them.

To learn more about giants failures see:

Now, the central question that should remain in our minds is—How should we be responsive? As we might now understand from the example given previously, an application should react to changes. This should include changes in demand (load) and changes in the availability of external services. In other words, it should be reactive to any changes that may affect the system's ability to respond to user requests.

One of the first ways to achieve the primary goal is through elasticity. This describes the ability to stay responsive under a varying workload, meaning that the throughput of the system should increase automatically when more users start using it and it should decrease automatically when the demand goes down. From the application perspective, this feature enables system responsiveness because at any point in time the system can be expanded without affecting the average latency.

Note that latency is the essential characteristic of responsiveness. Without elasticity, growing demand will cause the growth of average latency, which directly affects the responsiveness of the system.

For example, by providing additional computation resources or additional instances, the throughput of our system might be increased. The responsiveness will then increase as a consequence. On the other hand, if demand is low, the system should shrink in terms of resource consumption, thereby reducing business expenses. We may achieve elasticity by employing scalability, which might either be horizontal or vertical. However, achieving scalability of the distributed system is a challenge that is typically limited by the introduction of bottlenecks or synchronization points within the system. From the theoretical and practical perspectives, such problems are explained by Amdahl's Law and Gunther's Universal Scalability Model. We will discuss these in Chapter 6, WebFlux Async Non-Blocking Communication.

Here, the term business expenses refers to the cost of additional cloud instances or extra power consumption in the case of physical machines.

However, building a scalable distributed system without the ability to stay responsive regardless of failures is a challenge. Let's think about a situation in which one part of our system is unavailable. Here, an external payment service goes down, and all user attempts to pay for the goods will fail. This is something that breaks the responsiveness of the system, which may be unacceptable in some cases. For example, if users cannot proceed with their purchases easily, they will probably go to a competitor's web store. To deliver a high-quality user experience, we must care about the system's responsiveness. The acceptance criteria for the system are the ability to stay responsive under failures, or, in other words, to be resilient. This may be achieved by applying isolation between functional components of the system, thereby isolating all internal failures and enabling independence. Let's switch back to the Amazon web store. Amazon has many different functional components such as the order list, payment service, advertising service, comment service, and many others. For example, in the case of a payment service outage, we may accept user orders and then schedule a request auto-retry, thereby protecting the user from undesired failures. Another example might be isolation from the comments service. If the comments service goes down, the purchasing and orders list services should not be affected and should work without any problems.

Another point to emphasize is that elasticity and resilience are tightly coupled, and we achieve a truly responsive system only by enabling both. With scalability, we can have multiple replicas of the component so that, if one fails, we can detect this, minimize its impact on the rest of the system, and switch to another replica.

Message-driven communication

The only question that is left unclear is how to connect components in the distributed system and preserve decoupling, isolation, and scalability at the same time. Let's consider communication between components over HTTP. The next code example, doing HTTP communication in Spring Framework 4, represents this concept:

@RequestMapping("/resource")                                       // (1)
public Object processRequest() {
RestTemplate template =
new RestTemplate(); // (2)

ExamplesCollection result = template.getForObject( // (3)
"http://example.com/api/resource2", //
ExamplesCollection.class //
); //

... // (4)

processResultFurther(result); // (5)

The previous code is explained as follows:

  1. The code at this point is a request handler mapping declaration that uses the  @RequestMapping annotation.
  2. The code declared in this block shows how we may create the RestTemplate instance. RestTemplate is the most popular web client for doing request-response communication between services in Spring Framework 4.
  3. This demonstrates the request's construction and execution. Here, using the RestTemplate API, we construct an HTTP request and execute it right after that. Note that the response will be automatically mapped to the Java object and returned as the result of the execution. The type of response body is defined by the second parameter of the getForObject method. Furthermore, the getXxxXxxxxx prefix means that the HTTP method, in that case, is GET.
  4. These are the additional actions that are skipped in the previous example.
  5. This is the execution of another processing stage.

In the preceding example, we defined the request handler which will be invoked on users' requests. In turn, each invocation of the handler produces an additional HTTP call to an external service and then subsequently executes another processing stage. Despite the fact that the preceding code may look familiar and transparent in terms of logic, it has some flaws. To understand what is wrong in this example, let's take an overview of the following request's timeline:

Diagram 1.1. Components interaction timeline

This diagram depicts the actual behavior of the corresponding code. As we may notice, only a small part of the processing time is allocated for effective CPU usage whereas the rest of the time thread is being blocked by the I/O and cannot be used for handling other requests.

In some languages, such as C#, Go, and Kotlin, the same code might be non-blocking when green threads are used. However, in pure Java, we do not have such features yet. Consequently, the actual thread will be blocked in such cases.

On the other hand, in the Java world, we have thread pools, which may allocate additional threads to increase parallel processing. However, under a high load, such a technique may be extremely inefficient to process the new I/O task simultaneously. We will revisit this problem again during this chapter and also analyze it thoroughly in Chapter 6, WebFlux Async Non-Blocking Communication.

Nonetheless, we can agree that to achieve better resource utilization in I/O cases, we should use an asynchronous and non-blocking interaction model. In real life, this kind of communication is messaging. When we get a message (SMS, or email), all our time is taken up by reading and responding. Moreover, we do not usually wait for the answer and work on other tasks in the meantime. Unmistakably, in that case, work is optimized and the rest of the time may be utilized efficiently. Take a look at the following diagram:

To learn more about terminology see the following links:
Diagram 1.2. Non-blocking message communication

In general, to achieve efficient resource utilization when communicating between services in a distributed system, we have to embrace the message-driven communication principle. The overall interaction between services may be described as follows—each element awaits the arrival of messages and reacts to them, otherwise lying dormant, and vice versa, a component should be able to send a message in the non-blocking fashion. Moreover, such an approach to communication improves system scalability by enabling location transparency. When we send an email to the recipient, we care about the correctness of the destination address. Then the mail server takes care of delivering that email to one of the available devices of the recipient. This frees us from concerns about the certain device, allowing recipients to use as many devices as they want. Furthermore, it improves failure tolerance since the failure of one of the devices does not prevent recipients from reading an email from another device.

One of the ways to achieve message-driven communication is by employing a message broker. In that case, by monitoring the message queue, the system is able to control the load management and elasticity. Moreover, the message communication gives clear flow control and simplifies the overall design. We will not get into specific details of this in this chapter, as we will cover the most popular techniques for achieving message-driven communication in Chapter 8, Scaling Up with Cloud Streams.

The phrase lying dormant was taken from the following original document, which aims to emphasize message-driven communication: https://www.reactivemanifesto.org/glossary#Message-Driven.

By embracing all of the previous statements, we will get the foundational principles of the reactive system. This is depicted in the following diagram:

Diagram 1.3. Reactive Manifesto

As we may notice from the diagram, the primary value for any business implemented with a distributed system is responsiveness. Achieving a responsive system means following fundamental techniques such as elasticity and resilience. Finally, one of the fundamental ways to attain a responsive, elastic, and resilient system is by employing message-driven communication. In addition, systems built following such principles are highly maintainable and extensible, since all components in the system are independent and properly isolated.

We will not go all notions defined in the Reactive Manifesto in depth, but it is highly recommended to revisit the glossary provided at the following link: https://www.reactivemanifesto.org/glossary.

All those notions are not new and have already been defined in the Reactive Manifesto, which is the glossary that describes the reactive system's concepts. This manifesto was created to ensure that businesses and developers have the same understanding of conventional notions. To emphasize, a reactive system and the Reactive Manifesto are concerned with architecture, and this may be applied to either large distributed applications or small one-node applications.

The importance of the Reactive Manifesto (https://www.reactivemanifesto.org) is explained by Jonas Bonér, the Founder and CTO of Lightbend, at the following link: https://www.lightbend.com/blog/why_do_we_need_a_reactive_manifesto%3F.

Reactivity use cases

In the previous section, we learned the importance of reactivity and the fundamental principles of the reactive system, and we have seen why message-driven communication is an essential constituent of the reactive ecosystem. Nonetheless, to reinforce what we have learned, it is necessary to touch on real-world examples of its application. First of all, the reactive system is about architecture, and it may be applied anywhere. It may be used in simple websites, in large enterprise solutions, or even in fast-streaming or big-data systems. But let's start with the simplest—consider the example of a web store that we have already seen in the previous section. In this section, we will cover possible improvement and changes in the design that may help in achieving a reactive system. The following diagram helps us get acquainted with the overall architecture of the proposed solution:

Diagram 1.4. Example of store application architecture

The preceding diagram expands a list of useful practices that allow the reactive system to be achieved. Here, we improved our small web store by applying modern microservice patterns. In that case, we use an API Gateway pattern for achieving location transparency. It provides the identification of a specific resource with no knowledge about particular services that are responsible for handling requests.

However, it means that the client should know the resource name at least. Once the API Gateway receives the service name as part of a request URI, then it can resolve a specific service address by asking the registry service.

In turn, the responsibility for keeping information about available services up to date is implemented using the service registry pattern and achieved with the support of the client-side discovery pattern. It should be noticed, that in the previous example, the service gateway and service registry are installed on the same machine, which may be useful in the case of a small distributed system. Additionally, the high responsiveness of the system is achieved by applying replication to the service. On the other hand, failure tolerance is attained by properly employed message-driven communication using Apache Kafka and the independent Payment Proxy Service (the point with Retry N times description in Diagram 1.4), which is responsible for redelivering payment in the case of unavailability of the external system. Also, we use database replication to stay resilient in the case of the outage of one of the replicas. To stay responsive, we return a response about an accepted order immediately and asynchronously process and send the user payment to the payments service. A final notification will be delivered later by one of the supported channels, for example, via email. Finally, that example depicts only one part of the system and in real deployments, the overall diagram may be broader and introduce much more specific techniques for achieving a reactive system.

Note, we will cover design principles and their pros and cons thoroughly in Chapter 8, Scaling Up with Cloud Streams.

To familiarize ourselves with API Gateway, Service Registry, and other patterns for constructing a distributed system, please click on the following link: http://microservices.io/patterns.

Along with the plain, small web store example that may seem really complex, let's consider another sophisticated area where a reactive system approach is appropriate. A more complex but exciting example is analytics. The term analytics means that the system that is able to handle a huge amount of data, process it in run-time, keep the user up to date with live statistics, and so on. Suppose we are designing a system for monitoring a telecommunication network based on cell site data. Due to the latest statistic report of the number of cell towers, in 2016 there were 308,334 active sites in the USA.

The statistic report with the number of cell sites in the USA  is available at the following link: https://www.statista.com/statistics/185854/monthly-number-of-cell-sites-in-the-united-states-since-june-1986/.

Unfortunately, we can just imagine the real load produced by that number of cell sites. However, we can agree that processing such a huge amount of data and providing real-time monitoring of the telecommunication network state, quality, and traffic is a challenge.

To design this system, we may follow one of the efficient architectural techniques called streaming. The following diagram depicts the abstract design of such a streaming system:

Diagram 1.5. Example of an analytics real-time system architecture

As may be noticed from this diagram, streaming architecture is about the construction of the flow of data processing and transformation. In general, such a system is characterized by low latency and high throughput. In turn, the ability to respond or simply deliver analyzed updates of the telecommunication network state is therefore crucial. Thus, to build such a highly-available system, we have to rely on fundamental principles, as mentioned in the Reactive Manifesto. For example, achieving resilience might be done by enabling backpressure support. Backpressure refers to a sophisticated mechanism of workload management between processing stages in such a way that ensures we do not overwhelm another. Efficient workload management may be achieved by using message-driven communication over a reliable message broker, which may persist messages internally and send messages on demand.

Note that other techniques for handling backpressure will be covered in Chapter 3, Reactive Streams - the New Streams' Standard.

Moreover, by properly scaling each component of the system, we will be able to elastically expand or reduce system throughput.

To learn more about the terminology, see the following link:
Backpressure: https://www.reactivemanifesto.org/glossary#Back-Pressure.

In a real-world scenario, the stream of the data may be persisted databases processed in a batch, or partially processed in real-time by applying windowing or machine-learning techniques. Nonetheless, all fundamental principles offered by the Reactive Manifesto are valid here, regardless of the overall domain or business idea. 

To summarize, there are a ton of different areas in which to apply the foundational principles of building a reactive system. The area of application of the reactive system is not limited to the previous examples and areas, since all of these principles may be applied to building almost any kind of distributed system oriented to giving users effective, interactive feedback.

Nonetheless, in the next section, we will cover the reasons for moving Spring Framework to reactivity.


Why Reactive Spring? 

In the previous section, we looked at a few interesting examples in which reactive system approaches shine. We have also expanded on the usage of fundamentals such as elasticity and resilience, and seen examples of microservice-based systems commonly used to attain a reactive system.

That gave us an understanding of the architectural perspective but nothing about the implementation. However, it is important to emphasize the complexity of the reactive system and the construction of such a system is a challenge. To create a reactive system with ease, we have to analyze frameworks capable of building such things first and then choose one of them. One of the most popular ways to choose a framework is by analyzing its available features, relevance, and community.

In the JVM world, the most commonly known frameworks for building a reactive system has been Akka and Vert.x ecosystems.

On the one hand, Akka is a popular framework with a huge list of features and a big community. However, at the very beginning, Akka was built as part of the Scala ecosystem and for a long time, it showed its power only within solutions written in Scala. Despite the fact that Scala is a JVM-based language, it is noticeably different from Java. A few years ago, Akka provided direct support for Java, but for some reason, it was not as popular in the Java world as it was in Scala.

On the other hand, there is the Vert.x framework which is also a powerful solution for building an efficient reactive system. Vert.x was designed as a non-blocking, event-driven alternative to Node.js that runs on the Java Virtual Machine. However, Vert.x started being competitive only a few years ago and during the last 15 years, the market for frameworks for flexible robust application development has been held by the Spring Framework. 

The Spring Framework provides wide possibilities for building a web application using a developer-friendly programming model. However, for a long time, it had some limitations in building a robust reactive system.

Reactivity on the service level

Fortunately, the growing demand for reactive systems initiated the creation of a new Spring Project called Spring Cloud. The Spring Cloud Framework is a foundation of projects that address particular problems and simplifies the construction of distributed systems. Consequently, the Spring Framework ecosystem may be relevant for us to build reactive systems.

To learn more about the essential functionality, components, and features of that project please click on the following link: http://projects.spring.io/spring-cloud/.

We will skip the details of Spring Cloud Framework functionality in this chapter and cover the most important parts that help in the development of the reactive system in Chapter 8, Scaling Up with Cloud Streams. Nonetheless, it should be noticed that such a solution building a robust, reactive microservices system with minimum effort.

However, the overall design is only one element of constructing the whole reactive system. As may be noticed from the excellent Reactive Manifesto:

"Large systems are composed of smaller ones and therefore depend on the Reactive properties of their constituents. This means that Reactive Systems apply design principles so these properties apply at all levels of scale, making them able to be composed".

Therefore, it is important to provide a reactive design and implementation on the component level as well. In that context, the term design principle refers to a relationship between components and, for example, programming techniques that are used to compound elements. The most popular traditional technique for writing code in Java is imperative programming.

To understand whether imperative programming follows reactive system design principles, let's consider the next diagram:

Diagram 1.6. UML Schema of component relationship

Here, we have two components within the web store application. In that case, OrdersService calls ShoppingCardService while processing the user request. Suppose that under the hood ShoppingCardService executes a long-running I/O operation, for example, an HTTP request or database query. To understand the disadvantages of imperative programming let's consider the following example of the most common implementation of the aforementioned interaction between components:

interface ShoppingCardService {                                    // (1)
Output calculate(Input value); //

} //

class OrdersService { // (2)
private final ShoppingCardService scService; //
void process() { //
Input input = ...; //
Output output = scService.calculate(input); // (2.1)
... // (2.2)
} //
} //

The aforementioned code is explained as follows:

  1. This is the ShoppingCardService interface declaration. This corresponds to the aforementioned class diagram and has only one calculate method, which accepts one argument and returns a response after its processing.
  2. This is the OrderService declaration. Here, at point (2.1) we synchronously call  ShoppingCardService and receive a result right after its execution. Point (2.2) hides the rest of the code responsible for result processing.
  3. In turn, in that case our services are tightly coupled in time, or simply the execution of OrderService is tightly coupled to the execution of ShoppingCardService. Unfortunately, with such a technique, we cannot proceed with any other actions while ShoppingCardService is in the processing phase.

As we can understand from the preceding code, in Java world, the execution of scService.calculate(input) blocks the Thread on which the processing of the OrdersService logic takes place. Thus, to run a separate independent processing in OrderService we have to allocate an additional Thread. As we will see in this chapter, the allocation of an additional Thread might be wasteful. Consequently, from the reactive system perspective, such system behavior is unacceptable.

Blocking communications directly contradicts the message-driven principle, which explicitly offers us non-blocking communication. See the following for more information on this: https://www.reactivemanifesto.org/#message-driven.

Nonetheless, in Java, that problem may be solved by applying a callback technique for the purpose of  cross-component communication:

interface ShoppingCardService {                                    // (1)
void calculate(Input value, Consumer<Output> c); //
} //

class OrdersService { // (2)
private final ShoppingCardService scService; //
void process() { //
Input input = ...; //
scService.calculate(input, output -> { // (2.1)
... // (2.2)
}); //
} //
} //

Each point in the preceding code is explained in the following numbered list:

  1. The preceding code is the ShoppingCardService interface declaration. In that case, the calculate method accepts two parameters and returns a void. It means that from the design perspective, the caller may be immediately released from waiting and the result will be sent to the given Consumer<> callback later.
  2. This is the OrderService declaration. Here, at point (2.1) we asynchronously call  ShoppingCardService and continue processing. In turn, when the ShoppingCardService executes the callback function we will be able to proceed with the actual result processing (2.2).

Now, OrdersService passes the function-callback to react at the end of the operation. This embraces the fact that OrdersService is now decoupled from ShoppingCardService and the first one may be notified via the functional callback where the implementation of the ShoppingCardService#calculate  method, which calls the given function, may either be synchronous or asynchronous:

class SyncShoppingCardService implements ShoppingCardService {     // (1)
public void calculate(Input value, Consumer<Output> c) { //
Output result = new Output(); //
c.accept(result); // (1.1)
} //
} //

class AsyncShoppingCardService implements ShoppingCardService { // (2)
public void calculate(Input value, Consumer<Output> c) { //
new Thread(() -> { // (2.1)
Output result = template.getForObject(...); // (2.2)
... //
c.accept(result); // (2.3)
}).start(); // (2.4)
} //
} //

Each point in the preceding code is explained in the following numbered list:

  1. This point is the SyncShoppingCardService class declaration. This implementation assumes the absence of blocking operations. Since we do not have an I/O execution, the result may be returned immediately by passing it to the callback function (1.1).
  2. This point in the preceding code is the AsyncShoppingCardService class declaration. In the case, when we have blocking I/O as depicted in point (2.2), we may wrap it in the separate Thread (2.1) (2.4). After retrieving the result,  it will be processed and passed to the callback function.

In that example, we have the sync implementation of ShoppingCardService, which keeps synchronous bounds and offers no benefits from the API perspective. In the async case, we achieve asynchronous bounds, and a request will be executed in the separate ThreadOrdersService is decoupled from the execution process and will be notified of the completion by the callback execution.

The advantage of that technique is that components are decoupled in time by the callback function. This means that after calling the scService.calculate method, we will be able to proceed with other operations immediately without waiting for the response in the blocking fashion from ShoppingCardService.

The disadvantage is that callback requires the developer to have a good understanding of multi-threading to avoid the traps of shared data modifications and callback hell.

Actually, the phrase callback hell is mentioned in relation to JavaScript: http://callbackhell.com, but it is also applicable to Java as well.

Fortunately, the callback technique is not the only option. Another one is  java.util.concurrent.Futurewhich, to some degree, hides the executional behavior and decouples components as well:

interface ShoppingCardService {                                    // (1)
Future<Output> calculate(Input value); //
} //

class OrdersService { // (2)
private final ShoppingCardService scService; //
process() { //
Input input = ...; //
Future<Output> future = scService.calculate(input); // (2.1)
... //
Output output = future.get(); // (2.2)
... //
} //
} //

The numbered points are described in the following:

  1. At this point is the ShoppingCardService interface declaration. Here, the calculate method accepts one parameter and returns Future. Future is a class wrapper which allows us to check whether there is an available result or blocking to get it.
  2. This is the OrderService declaration. Here, in point (2.1), we asynchronously call ShoppingCardService and receive the Future instance. In turn, we are able to continue processing while the result is being processed asynchronously. After some execution, which may be done independently from ShoppingCardService#calculation, we get the result. This result may end up waiting in the blocking fashion or it may immediately return the result (2.2).

As we may notice from the previous code, with the Future class, we achieve deferred retrieval of the result. With the support of the Future class, we avoid callback hell and hide multi-threading complexity behind a specific Future implementation. Anyway, to get the result we need, we must potentially block the current Thread and synchronize with the external execution that noticeably decreases scalability.

As an improvement, Java 8 offers CompletionStage and CompletableFuture as a direct implementation for CompletionStage. In turn, those classes provide promise-like APIs and make it possible to build code such as the following:

To learn more about futures and promises, please see the following link: https://en.wikipedia.org/wiki/Futures_and_promises.
interface ShoppingCardService {                                    // (1)
CompletionStage<Output> calculate(Input value); //
} //

OrdersService { // (2)
private final ComponentB componentB; //
void process() { //
Input input = ...; //
componentB.calculate(input) // (2.1)
.thenApply(out1 -> { ... }) // (2.2)
.thenCombine(out2 -> { ... }) //
.thenAccept(out3 -> { ... }) //
} //
} //

The aforementioned code is described in the following:

  1. At this point, we have the ShoppingCardService interface declaration. In this case, the calculate method accepts one parameter and returns CompletionStage. CompletionStage is a class wrapper that is similar to Future but allows processing the returned result in the functional declarative fashion.
  2. This is an OrderService declaration. Here, at point (2.1) we asynchronously call  ShoppingCardService and receive the CompletionStage immediately as the result of the execution. The overall behavior of the CompletionStage is similar to Future, but CompletionStage provides a fluent API which makes it possible to write methods such as thenAccept and thenCombine. These define transformational operations on the result and thenAccept, which defines the final consumers, to handle the transformed result.

With the support of CompletionStage, we can write code in the functional and declarative style, which looks clean and processes the result asynchronously. Furthermore, we may omit the awaiting results and provide a function to handle the result when it becomes available. Moreover, all of the previous techniques are valued by Spring teams and have already been implemented within most of the projects within the framework. Even though the CompletionStage gives better possibilities for writing efficient and readable code, unfortunately, there are some missing points there. For example, Spring 4 MVC did not support CompletionStage for a long time and for that purpose, it provided its own ListenableFuture. This happened because Spring 4 aimed to become compatible with older Java versions. Let's take an overview of AsyncRestTemplate usage to get an understanding of how to work with Spring's ListenableFuture. The following code shows how we may use ListenableFuture with AsyncRestTemplate:

AsyncRestTemplate template = new AsyncRestTemplate(); 
SuccessCallback onSuccess = r -> { ... };
FailureCallback onFailure = e -> { ... };
ListenableFuture<?> response = template
response.addCallback(onSuccess, onFailure);

The preceding code shows the callback style for handling an asynchronous call. Essentially, this method of communication is a dirty hack, and Spring Framework wraps blocking network calls in a separate thread under-the-hood. Furthermore, Spring MVC relies on Servlet API, which obligates all implementations to use the thread-per-request model.

Many things have changed with the release of Spring Framework 5 and the new Reactive WebClient, so with the support of WebClient, all cross-service communication is non-blocking anymore. Also, Servlet 3.0 introduced asynchronous client-server communication, Servlet 3.1 allowed non-blocking writing to I/O, and in general new asynchronous non-blocking features of the Servlet 3 API are well integrated into Spring MVC. However, the only problem was that Spring MVC did not provide an out of the box asynchronous non-blocking client that negates all benefits from improved servlets.

This model is quite non-optimal. To understand why this technique is inefficient, we have to revisit the costs of multi-threading. On the one hand, multi-threading is a complex technique by nature. When we work with multi-threading, we have to think about many things, such as access to shared memory from the different threads, synchronization, error handling, and so on. In turn, the design of multi-threading in Java supposes that a few threads may share a single CPU to run their tasks simultaneously. The fact that CPU time will be shared between several threads introduces the notion of context switching. This means that to resume a thread later, it is required to save and load registers, memory maps, and other related elements which in general are computationally-intensive operations. Consequently, its application with a high number of active threads, and few CPUs, will be inefficient.

To learn more about the cost of context switching, please visit the following link: https://en.wikipedia.org/wiki/Context_switch#Cost.

In turn, a typical Java thread has its overhead in memory consumption. A typical stack size for a thread on a 64-bit Java VM is 1,024 KB. On the one hand, an attempt to handle ~6,4000 simultaneous requests in a thread per connection model may result in about 64 GB of used memory. This might be costly from the business perspective or critical from the application standpoint. On the other hand, by switching to traditional thread pools with a limited size and a pre-configured queue for requests, the client waits too long for a response, which is less reliable, increases the average response timeout, and finally may cause unresponsiveness of the application.

For that purpose, the Reactive Manifesto recommends using a non-blocking operation, and this is an omission in the Spring ecosystem. On the other hand, there is no good integration with reactive servers such as Netty, which solves the problem of context switching.

To get source information about the average amount of connections, see the following link: https://stackoverflow.com/questions/2332741/what-is-the-theoretical-maximum-number-of-open-tcp-connections-that-a-modern-lin/2332756#2332756.

The term thread refers to allocated memory for the thread object and allocated memory for the thread stack. See the next link for more information:

It is important to note that asynchronous processing is not limited to a plain request-response pattern, and sometimes we have to deal with handling infinitive streams of data, processing it in the manner of an aligned transformation flow with backpressure support:

Diagram 1.7. Reactive pipeline example

One of the ways for handling such cases is through reactive programming, which embraces the techniques of asynchronous event processing through chaining transformational stages. Consequently, reactive programming is a good technique which fits the design requirements for a reactive system. We will cover the value of applying reactive programming for building a reactive system in the next chapters.

Unfortunately, the reactive programming technique was not well integrated inside Spring Framework. That put another limitation on building modern applications and decreased the competitiveness of the framework. As a consequence, all the mentioned gaps in the growing hype around reactive systems and reactive programming simply increased the need for dramatic improvements within the framework. Finally, that drastically stimulated the improvement of Spring Framework by adding the support for Reactivity on all levels and providing developers with a powerful tool for reactive system development. Its pivotal developers decided to implement new modules that reveal the whole power of Spring Framework as a reactive system foundation.



In this chapter, we highlighted the requirements for cost-efficient IT solutions that often arise nowadays. We described why and how big companies such as Amazon failed to force old architectural patterns to work smoothly in current cloud-based distributed environments.

We also established the need for new architectural patterns and programming techniques to fulfill the ever-growing demand for convenient, efficient, and intelligent digital services. With the Reactive Manifesto, we deconstructed and comprehended the term reactivity and also described why and how elasticity, resilience, and message-driven approaches help to achieve responsiveness, probably the primary non-functional system requirement in the digital era. Of course, we gave examples in which the reactive system shines and easily allows businesses to achieve their goals.

In this chapter, we have highlighted a clear distinction between a reactive system as an architectural pattern and reactive programming as a programming technique. We described how and why these two types of reactivity play well together and enable us to create highly efficient die-hard IT solutions.

To go deeper into Reactive Spring 5, we need to gain a solid understanding of the reactive programming basement, learning essential concepts and patterns that determine the technique. Therefore, in the next chapter, we will learn the essentials of reactive programming, its history, and the state of the reactive landscape in the Java world.

About the Authors
  • Oleh Dokuka

    Oleh Dokuka is an experienced software engineer, Pivotal Champion, and one of the top contributors to Project Reactor and Spring Framework. He knows the internals of both frameworks very well and advocates reactive programming with Project Reactor on a daily basis. Along with that, the author applies Spring Framework and Project Reactor in software development, so he knows how to build reactive systems using these technologies.

    Browse publications by this author
  • Igor Lozynskyi

    Igor Lozynskyi is a senior Java developer who primarily focuses on developing reliable, scalable, and blazingly fast systems. He has over seven years of experience with the Java platform. He is passionate about interesting and dynamic projects both in life and in software development.

    Browse publications by this author
Latest Reviews (2 reviews total)
Excellent Book. Recommend to Everyone!
El libro trata desde los fundamentos reactivos, e introduce muy bien en el porque se ocupan. Había tomado un curso rápido, y enseñaban una receta, pero no explicaban el porque de la cosas. Con este libro siento que estoy aprendiendo realmente.
Recommended For You
Hands-On Reactive Programming in Spring 5
Unlock this book and the full library FREE for 7 days
Start now