Redis Stack for Application Modernization

Introducing Redis Stack

Redis has achieved several important milestones since its inception in 2009, from taking the lead as the most popular key-value data store, according to the ranking published every month by the website DB-Engines (and the sixth among all database systems), up to establishing the record as the most downloaded container image on Docker. Not to mention that Redis has been the most loved database for five years in a row, according to the Developer Survey published by Stack Overflow in the years 2016-2021. And, for sure, you, or a friend of yours, have used it for some reason, work, or hobby.

If you are reading this book, chances are you have programmed an application using a Redis server, or at least you know what it is and what it is used for. In this chapter, we’ll recap what made Redis the most famous caching system in the world and we’ll share some anecdotes about the development undertaken by its creator, Salvatore Sanfilippo. We won’t stay long on the story of Redis, though, because this book is about application modernization. As you read through, you will discover how the original database, designed for speed and simplicity, has evolved to resolve many of the new challenges of this age, without compromising on the ease of adoption, flexibility, and, above all, speed.

Redis Stack is an extension of Redis presented in 2022, which introduces JSON, vector, and time series data modeling capabilities, all supporting real-time queries and searches. Redis Stack represents a new approach to providing a rich data modeling experience all within the same database server. It introduces features such as vector similarity search to query structured and unstructured data (for example, text, images, or audio files) and delivers probabilistic Bloom filters to efficiently resolve recurrent big data problems. Redis Stack is also a data platform that supports event-driven programming and introduces stream processing features. By the end of this chapter, you will understand what Redis Stack is and how it enhances the Redis server with many new capabilities. Above all, you will learn the motivation behind Redis Stack and why multi-model databases can increase the speed of technological innovation for organizations of all sizes. In this chapter, we are going to cover the following topics:

Exploring the history of Redis
The open source project
From key-value to multi-model real-time databases
Redis Stack deployment types

Exploring the history of Redis

Redis was conceived and designed in 2009 by the Italian software engineer Salvatore Sanfilippo as a solution to scaling LLOOGG, an online analytics server co-founded with Fabio Pitrola that empowered web admins to track user activities. Challenged by the scalability limitations of MySQL, Salvatore decided to rethink the concept of key-value storage and design something that would (admittedly) be different from Memcached, while preserving its simplicity and speed. The first beta release was shared on Google Code on February 25, 2009. A few months later, in September 2009, the first stable release, Redis 1.0, was published as a tar package of less than 200 KB.

Redis has been designed to offer an alternative for problems where relational databases (RDBMSs) are not a good fit because there is something wrong if we use an RDBMS for all kinds of work. However, in comparison to other data storage options that became popular when the NoSQL wave shook the world of databases (Memcached, the key-value data store released in 2003, or MongoDB, the document store released in 2009, and many more), Redis has its roots in computer science and makes a rich variety of data structures available. This is one of the distinguishing features of Redis and the likely reason that fostered its adoption by software engineers and developers – presenting data structures such as hashes, lists, sets, bitmaps, and so on that are familiar to software engineers so they could transfer the programming logic to data modeling without any lengthy and computationally expensive data transformation. Viewed in this light, we could say that Redis is about persisting the data structures of a programming language. An example of the simplicity of storing a Python dictionary in a Redis hash data structure follows:

user = {"name":"John",
        "surname":"Smith",
        "company":"Redis",
        "department":"Sales"}
r.hset("user:{}".format(str(2345)), mapping=user)

In the same way, adding elements to a Redis Set can be done using Python lists:

languages = ['Python', 'C++', 'JavaScript']
r.sadd("coding", *languages)

In these examples, the user dictionary and the languages list are stored without transformations, and this is one of the advantages that Redis data structures offer to developers: simplifying data modeling and reducing the transformational overhead required to convert the data in a format that can be mapped to the data store (thus reducing the so-called impedance mismatch).

There was a short gap between the first release and its adoption by Instagram and GitHub. If we try to dig into the reasons that made Redis so popular, we can mention a few, among which we count the speed and simplicity of deployment. Beyond the user experience, Redis is an act of dedication and passion, and as we read in Redis’s own manifesto, code is like poetry; it’s not just something we write to reach some practical result. People love beautiful stories and simplicity and everybody should fight against complexity.

What is surely true is that Redis is an idea to solve problems where relational databases, still tied to rigid paradigms, wouldn’t fit the purpose. It is the product of creativity, inspiration, and love for things done manually, where good design and craftsmanship intertwine to accomplish something that simply works. An intimate artwork. And we like to recall Salvatore’s words about the creative approach when writing Redis:

My wife claims I wrote it mostly while sitting on the WC for the first years, on a MacBook Air 11. Would be nice to tell her she is wrong, but she happens to be perfectly right about the matter.

From the most-used thinking room in Sicily to becoming the most-loved and used key-value database in the world, this is the story we have decided to tell in this book, and we are sure you will find the journey through the pages an exciting adventure.

One of the guiding principles behind Redis is being open source and driven by a community of enthusiast contributors. We’ll explore that in the next section.

The open source project

The success of a technical project is always measurable in terms of the innovation of the proposal, simplicity of use, exhaustive documentation, high performance, low footprint, and stability, among other aspects. However, and this is true for many things, at the end of the day what matters is the capacity to resolve a problem and the impact of the solution. Organizations that decide to add new technology to their stack face several challenges to understand, prototype, validate, and set up a plan to deploy test environments together with a release strategy, a maintenance plan, and, finally, a plan to develop competence. Success stories require careful planning. From these many perspectives, Redis is considered first-in-class, and in this book, we will expose many of the reasons that made Redis the de-facto standard among the in-memory data stores in the world. But even before digging into the features of Redis Stack, Redis, as an open source project, has undoubtedly added value to many businesses:

A variety of options exist to get it running close to the application. It is available as a managed service in every public cloud provider, it can be installed from the source code or as a binary file, and Docker images are available for all the versions and flavors.
It has good documentation and a command reference, together with examples (from the https://redis.io/ website).
It is straightforward to set up and test. The source code is self-contained and does not depend on external libraries.
Client libraries for the most popular programming languages are available and supported (Java, JavaScript, Python, Go, and C#/.NET).
The well-known and permissive BSD license grants the freedom to use, modify, and distribute Redis, among other advantages. Users can test and run Redis in production without any concerns.

These reasons, together with the fact that it’s very easy to learn Redis, make it an attractive option to set up and use. On a computer configured to build C projects, pulling the source code from the GitHub repository, compiling it, and running the server can be done in less than a minute:

git clone https://github.com/redis/redis.git
cd redis/
make
./src/redis-server &
./src/redis-cli PING
PONG

The open source project delivers the core Redis server plus additional utilities, such as these:

redis-cli, the command-line interface to administer the server and manage the data. This utility assists also in configuring scalable deployments with Redis Cluster and high availability with replication and Sentinel. Among other features, it includes auto-completion, online help for single commands (for example, HELP HSET), or by group of commands (for example, HELP @hash, to learn about the commands that can be used with the Hash data structure). Just type HELP to understand how to make use of the online help.
redis-benchmark, a simple benchmarking utility to perform batches of tests for different data structures. Useful to evaluate how well the server performs on determined hardware.
redis-sentinel, the agent that automates the management of replicated topologies and provides clients with a discovery service.
create-cluster, a utility useful to set up a Redis Cluster environment for testing.
redis-check-rdb and redis-check-aof, utilities to health check AOF and RDB persistence files.

Now that we have reviewed the basic principles behind Redis and its utilities, we are ready to dive into the world of data modeling. This journey will take us from relational databases to Redis core data structures, and we will see how the multi-model capabilities of Redis Stack simplify many data modeling problems.

From key-value to multi-model real-time databases

The core data structures that are available out of the box in the Redis server solve a variety of problems when it comes to mapping entities and relationships. To start with concrete examples of modeling using Redis, the usual option to store an object is the Hash data structure, while collections can be stored using Sets, Sorted Sets, or Lists (among other options because a collection can be modeled in several other ways). In this section, we will introduce the multi-model features of Redis Stack using a comprehensive approach, which may be useful for those who are used to storing data using the relational paradigm, which implies organizing the data in rows and columns of a table.

Consider the requirement to model a list of cities. Using the relational data model, we can define a table using the SQL data definition language (DDL) instruction CREATE TABLE as follows:

CREATE TABLE `city` (
  `ID` int NOT NULL AUTO_INCREMENT,
  `Name` char(35) NOT NULL DEFAULT '',
  `CountryCode` char(3) NOT NULL DEFAULT '',
  `District` char(20) NOT NULL DEFAULT '',
  `Population` int NOT NULL DEFAULT '0',
  PRIMARY KEY (`ID`),
  KEY `CountryCode` (`CountryCode`)
)

This table definition defines attributes for the city entity and specifies a primary key on an integer identifier (a surrogate key, in this case, provided the uniqueness of the attributes is not guaranteed for the city entity). The DDL command also defines an index on the CountryCode attribute. Data encoding, collation, and the specific technology adopted as the storage engine are not relevant in this context. We are focused on understanding the model and the ability that we have to query it.

Primary key lookup

Primary key lookup is the most efficient way to access data in a relational table. Filtering the table on the primary key attribute is as easy as executing the SQL SELECT statement:

SELECT * FROM city WHERE ID=653;
+-----+--------+-------------+----------+------------+
| ID  | Name   | CountryCode | District | Population |
+-----+--------+-------------+----------+------------+
| 653 | Madrid | ESP         | Madrid   |    2879052 |
+-----+--------+-------------+----------+------------+
1 row in set (0.00 sec)

Modeling a city using one of the Redis core data structures leads to mapping the data in the SQL table to Hashes, so we can store the attributes as field-value pairs, with the key name including the primary key:

127.0.0.1:6379> HSET city:653 Name "Madrid" CountryCode "ESP" District "Madrid" Population 2879052

The HGETALL command can be used to retrieve the entire hash with minimal overhead (HGETALL has direct access to the value in the Redis keyspace):

HGETALL city:653
1) "Name"
2) "Madrid"
3) "CountryCode"
4) "ESP"
5) "District"
6) "Madrid"
7) "Population"
8) "2879052"

In addition, we can limit the bandwidth usage caused by the entire row transfer to the client and select only specific attributes. The SQL syntax is as follows:

SELECT Name, Population FROM city WHERE ID=653;
+--------+------------+
| Name   | Population |
+--------+------------+
| Madrid |    2879052 |
+--------+------------+
1 row in set (0.00 sec)

In this analogy between the relational model and Redis, the command is HGET (or HMGET for multiple values):

127.0.0.1:6379> HMGET city:653 Name Population
1) "Madrid"
2) "2879052"

While we need to extract data based on the primary key identifier, the solution is at hand in both the relational database and in Redis. Things get more complicated if we want to perform lookup and search queries on the dataset. In the next examples, we’ll see how the complexity and performance of such operations may vary substantially.

Secondary key lookup

Primary key lookups are efficient: after all, the primary key is an index, and it guarantees direct access to the table row. But what if we want to search for cities by filtering on an attribute? Let’s try an indexed search against our relational database over the CountryCode column, which has a secondary index:

mysql> SELECT Name FROM city WHERE CountryCode = "ESP";
+--------------------------------+
| Name                           |
+--------------------------------+
| Madrid                         |
| Barcelona                      |
| [...]                          |
+--------------------------------+
59 rows in set (0.02 sec)

This is an efficient search because the table defines an index on the CountryCode column. To continue the comparison of the relational database versus Redis, we will need to execute the same query against the stored Hashes. For this demonstration, we will assume that we have migrated the city table to Hashes in the Redis server. By design, Redis has no secondary indexing feature for any of the core data structures, which means that we should scan all the Hashes prefixed by the “city:” namespace, then read the city name from every Hash and check whether it matches our search term. The following example performs a non-blocking scan of the keyspace, filtering on the key name (“city:*”) in batches of configurable size (three, in the example):

127.0.0.1:6379> SCAN 0 MATCH city:* COUNT 3
1) "512"
2) 1) "city:4019"
   2) "city:9"
   3) "city:103"

The client should now extract the CountryCode value from every city, compare it to the search term, and repeat until the scan is concluded. This is obviously a time-consuming and expensive approach. There are ways to improve the efficiency of such batched operations. We will explore three standard options and then show how to resolve the problem using the Redis Stack capabilities:

Pipelining
Using functions
Using indexes
Redis Stack capabilities

We will look at these in detail next.

Pipelining

The first approach to reducing the overhead of the search operation is to use pipelining, which is supported by all major client libraries. Pipelining collects a batch of commands, delivers them to the server, and collects the outputs from the server immediately before returning the result to the client. This option dramatically reduces the latency of the overall operation, as it saves on the roundtrip time to the server (an analogy that works is going to the supermarket once to purchase 30 items rather than going 30 times and purchasing one item on every visit). The pros and cons of pipelining are as follows:

Pros: Saves on roundtrip time and does not block the server, as the server executes a batch of commands and returns the results to the client. Therefore, it increases overall system throughput. Pipelining is especially useful when batching operations.
Cons: The complexity of the operation is proportional to the number and complexity of the operations in the pipeline that are executed by the server. This may increase the memory usage on the server, as it keeps the intermediate results in memory until all commands in the pipeline are processed. The client manages multiple responses, which adds complexity to its business logic, especially when it has to deal with errors of some operations in the pipeline.

Using functions

Lua scripting and functions (functions were introduced in Redis 7.0 and represent an evolution of Lua scripting for remote server execution) help to offload the client and remove network latency. The search is local to the server and close to the data (equivalent to the concept of stored procedures). The following function is an example of local search:

#!lua name=mylib
local function city_by_cc(keys, args)
   local match, cursor = {}, "0";
   repeat
      local ret = redis.call("SCAN", cursor, "MATCH", "city:*", "COUNT", 100);
      local cities = ret[2];
        for i = 1, #cities do
         local keyname = cities[i];
         local ccode = redis.call('HMGET',keyname,'Name','CountryCode')
         if ccode[2] == args[1] then
            match[#match + 1] = ccode[1];
         end;
        end;
        cursor = ret[1];
      until cursor == "0";
   return match;
end
redis.register_function('city_by_cc', city_by_cc)

In this function, we do the following:

We perform a scan of the entire keyspace, filtering by the “city:*” prefix, which means that we will iterate through all the keys in the Redis server database.
For every key returned by the SCAN command, we retrieve the name and CountryCode of the city using the HMGET command.
If CountryCode matches our search filter, we add the city to an output array.
When the scan is completed, we return the array to the client.

Type the code into the mylib.lua file and import the library as follows:

cat mylib.lua | redis-cli -x FUNCTION LOAD

The function can be invoked using the following command:

127.0.0.1:6379> FCALL city_by_cc 0 "ESP"
 1) "A Coru\xf1a (La Coru\xf1a)"
 2) "Almer\xeda"
[...]
59) "Barakaldo"

The pros and cons of using functions are as follows:

Pros: The operation is executed on the server, and the client does not experiment with any overhead.
Cons: The complexity of the operation is linear, and the function (like any other Lua script or function) blocks the server. Any other concurrent operation must wait until the execution of the function is completed. Long scans make the server appear stuck to other clients.

Using indexes

Data scans, wherever they are executed (client or server side), are slow and ineffective in satisfying real-time requirements. This is especially true when the keyspace stores millions of keys or more. An alternative approach for search operations using the Redis core data structures is to create a secondary index. There are many options to do this using Redis collections. As an example, we can create an index of Spanish cities using a Set as follows:

SADD city:esp "Sevilla" "Madrid" "Barcelona" "Valencia" "Bilbao" "Las Palmas de Gran Canaria"

This data structure has interesting properties for our needs. We can retrieve all the Spanish cities in a single command:

127.0.0.1:6379> SMEMBERS city:esp
1) "Madrid"
2) "Sevilla"
3) "Valencia"
4) "Barcelona"
5) "Bilbao"
6) "Las Palmas de Gran Canaria"

Or we can check whether a specific city is in Spain using SISMEMBER, a constant time-complexity command:

127.0.0.1:6379> SISMEMBER city:esp "Madrid"
(integer) 1

And we can even search the index for cities having a name that matches a pattern:

127.0.0.1:6379> SSCAN city:esp 0 MATCH B*
1) "0"
2) 1) "Barcelona"
   2) "Bilbao"

We can refine our search requirements and design an index that considers the population. In such a case we could use a Sorted Set and Set the population as the score:

127.0.0.1:6379> ZADD city:esp 2879052 "Madrid" 701927 "Sevilla" 1503451 "Barcelona" 739412 "Valencia" 357589 "Bilbao" 354757 "Las Palmas de Gran Canaria"
(integer) 6

The main feature of the Sorted Set data structure is that its members are stored in an ordered tree-like structure (Redis uses a skiplist data structure), and with that, it is possible to execute low-complexity range searches. As an example, let’s retrieve Spanish cities with more than 2 million inhabitants:

127.0.0.1:6379> ZRANGE city:esp 2000000 +inf BYSCORE
1) "Madrid"

We can also check whether a city belongs to the index of Spanish cities:

127.0.0.1:6379> ZRANK city:esp Madrid
(integer) 5

In the former example, the ZRANK command informs us that the city Madrid belongs to the index and is fifth highest in the ranking. This solution resolves the overhead caused by having to scan the entire keyspace looking for matches.

The drawback of such a manual approach to indexing the data is that indexes need to reflect the data at any time. Considering scenarios where we want to add or remove a city from our database, we need to perform the two operations of removing the city Hash and updating the index, atomically. We can use a Redis transaction to perform atomic changes on both the data and the index:

127.0.0.1:6379> MULTI
OK
127.0.0.1:6379(TX)> DEL city:653
QUEUED
127.0.0.1:6379(TX)> ZREM city:esp "Madrid"
QUEUED
127.0.0.1:6379(TX)> EXEC
1) (integer) 1
2) (integer) 1

Custom secondary indexes come at a price, though, because complex searches become hard to manage using multiple data structures. Indexes must be maintained, and the complexity of such solutions may get out of hand, putting the consistency of search operations at risk. The pros and cons of using indexing are as follows:

Pros: Simple and fast search operations are possible using Redis core data structures to create a secondary index
Cons: The secondary index needs to be maintained, and search operations on multiple fields (what is called a composite index in relational databases) are not immediate and need thoughtful planning, implementation, and maintenance

Next, we will examine the capabilities of Redis Stack.

Redis Stack capabilities

Caching is one of the frequent use cases for which Redis shines as the best-in-class storage solution. This is because it stores data in memory, and offers real-time performance. It is also lightweight, as data structures are optimized to consume little memory. Redis does not need any complex configuration or maintenance and it is open source, so there is no reason not to give it a try. As a real-time data storage, it seems plausible that complex search operations may not be the primary use case users are interested in when using Redis. After all, fast retrieval of data by key is what made Redis so versatile as a cache or as a session store.

However, if in addition to the ability to use core data structures to store the data, we ensure that fast searches can be performed (besides primary key lookup), it is possible to think beyond the basic caching use case and start looking at Redis as a full-fledged database, capable of high-speed searches.

So far, we have presented simple and common search problems and both solutions using the traditional SQL approach and possible data modeling strategies using Redis core data structures. In the following sections, we will show how Redis Stack resolves query and search use cases and extends the core features of Redis with an integrated modeling and developing experience. We will introduce the following capabilities:

Querying, indexing, and searching documents
Time series data modeling
Probabilistic data structures
Programmability

Let’s discuss each of these capabilities in detail.

Querying, indexing, and searching documents

Redis Stack complements Redis with the ability to create secondary indexes on Hashes or JSON documents, the two document types supported by Redis Stack. The search examples seen so far can be resolved with the indexing features. To perform an indexed search, we create an index against the hashes modeling the cities using the following syntax:

FT.CREATE city_idx
ON HASH
PREFIX 1 city:
SCHEMA Name AS name TEXT
CountryCode AS countrycode TAG SORTABLE
Population AS population NUMERIC SORTABLE

The FT.CREATE command instructs the server to perform the following operations:

Create an index for the desired values of the Hash document.
Scan the keyspace and retrieve the documents prefixed by the “hash:” string.
Create the index corresponding to the desired data structure and, as specified by the FT.CREATE command, the Hash in this case. The indexes defined in this example are of the following types:
- TEXT, which enables full-text search on the Name field
- TAG SORTABLE, which enables an exact-match search against the CountryCode field and enables high-performance sorting by the value of the attribute
- NUMERIC SORTABLE, which enables range queries against the Population field and enables high-performance sorting by the value of the attribute

As soon as the indexing operation against the relevant data – all the keys prefixed by “hash:”– is completed, we can execute the queries and searches seen so far, and more. The syntax in the following example executes a search of all the cities with the value “ESP” in the TAG field type and returns only the name of the cities, sorted in lexicographical order. Finally, the first three results are returned using the LIMIT option. Note that this query is executed against the new city_idx index, and not directly against the data:

127.0.0.1:6379> FT.SEARCH city_idx '@countrycode:{ESP}' RETURN 1 name SORTBY name LIMIT 0 3
1) (integer) 59
2) "city:670"
3) 1) "name"
   2) "A Coru\xc3\xb1a (La Coru\xc3\xb1a)"
4) "city:690"
5) 1) "name"
   2) "Albacete"
6) "city:687"
7) 1) "name"
   2) "Alcal\xc3\xa1 de Henares"

It is possible to combine several textual queries/filters in the same index. Using exact-match and full-text search, we can verify whether Madrid is a Spanish city:

127.0.0.1:6379> FT.SEARCH city_idx '@name:Madrid @countrycode:{ESP}' RETURN 1 name
1) (integer) 1
2) "city:653"
3) 1) "name"
   2) "Madrid"

In a previous example, the range search was executed using the ZRANGE data structure. Using the indexing capability of Redis Stack, we can execute range searches using the NUMERIC field type. So, if we want to retrieve the Spanish cities with more than 2 million inhabitants, we will write the following search query:

127.0.0.1:6379> FT.SEARCH city_idx '@countrycode:{ESP}' FILTER population 2000000 +inf RETURN 1 name
1) (integer) 1
2) "city:653"
3) 1) "name"
   2) "Madrid"

Redis Stack offers flexibility and concise syntax to combine several field types, of which we have seen only a limited but representative number of examples. Once the index is created, the user can go ahead and use it, and add new documents or update existing ones. The database maintains the indexes updated synchronously as soon as documents are created or changed.

Besides full-text, exact-match, and range searches, we can also perform data aggregation (as we would in a relational database using the GROUP BY statement). If we would like to retrieve the three most populated countries, sorted in descending order, we would solve the problem in SQL as follows:

SELECT CountryCode,
SUM(Population) AS sum
FROM city
GROUP BY CountryCode
ORDER BY sum DESC
LIMIT 3;
+-------------+-----------+
| CountryCode | sum       |
+-------------+-----------+
| CHN         | 175953614 |
| IND         | 123298526 |
| BRA         |  85876862 |
+-------------+-----------+
3 rows in set (0.01 sec)

We can perform complex aggregations with the FT.AGGREGATE command. Using the following command, we can perform a real-time search and aggregation to compute the total population of the top three countries by summing up the inhabitants of the cities per country:

127.0.0.1:6379> FT.AGGREGATE city_idx * GROUPBY 1 @countrycode REDUCE SUM 1 @population AS sum SORTBY 2 @sum DESC LIMIT 0 3
1) (integer) 232
2) 1) "countrycode"
   2) "chn"
   3) "sum"
   4) "175953614"
3) 1) "countrycode"
   2) "ind"
   3) "sum"
   4) "123298526"
4) 1) "countrycode"
   2) "bra"
   3) "sum"
   4) "85876862"

To summarize this brief introduction where we addressed the search and aggregation capabilities, it is worth mentioning that there are multiple types of searches, such as phonetic matching, auto-completion suggestions, geo searches, or a spellchecker to help design great applications. We will cover them in depth in Chapter 5, Redis Stack as a Document Store, where we showcase Redis Stack as a document store.

Besides modeling objects as Hash, it is possible to store, update, and retrieve JSON documents. The JSON format needs no introduction, as it permeates data pipelines including heterogeneous subsystems, protocols, databases, and so on. Redis Stack delivers this capability out of the box and manages JSON documents in a similar way to Hashes, which means that it is possible to store, index, and search JSON objects and work with them using JSONPath syntax:

To illustrate the syntax to store, search, and retrieve JSON data along the lines of the previous examples, let’s store city objects formatted as JSON:

JSON.SET city:653 $ '{"Name":"Madrid", "CountryCode":"ESP", "District":"Madrid", "Population":2879052}'
JSON.SET city:5 $ '{"Name":"Amsterdam", "CountryCode":"NLD", "District":"Noord-Holland", "Population":731200}'
JSON.SET city:1451 $ '{"Name":"Tel Aviv-Jaffa", "CountryCode":"ISR", "District":"Tel Aviv", "Population":348100}'

We don’t need anything else to start working with the JSON documents stored in Redis Stack. We can then perform basic retrieval operations on entire documents:
```
127.0.0.1:6379> JSON.GET city:653
"{\"Name\":\"Madrid\",\"CountryCode\":\"ESP\",\"District\":\"Madrid\",\"Population\":2879052}"
```
We can also retrieve the desired property (or multiple properties at once) stored on a certain path, with fast access guaranteed, because the document is stored in a tree structure:
```
127.0.0.1:6379> JSON.GET city:653 $.Name
"[\"Madrid\"]"
127.0.0.1:6379> JSON.GET city:653 $.Name $.CountryCode
"{\"$.Name\":[\"Madrid\"],\"$.CountryCode\":[\"ESP\"]}"
```
As we have seen for Hash documents, we can index JSON documents using a similar syntax and perform search operations. The following command creates an index for all the JSON documents with the city: prefix in the database:
```
FT.CREATE city_idx ON JSON PREFIX 1 city: SCHEMA $.Name AS name TEXT $.CountryCode AS countrycode TAG SORTABLE $.Population AS population NUMERIC SORTABLE
```

And using the FT.SEARCH command with an identical syntax as seen for the Hash documents, we can perform search operations:

127.0.0.1:6379> FT.SEARCH city_idx '@countrycode:{ESP}' FILTER population 2000000 +inf RETURN 1 name
1) (integer) 1
2) "city:653"
3) 1) "name"
   2) "Madrid"

Unlike Hash documents, the JSON supports nested levels (up to 128) and can store properties, objects, arrays, and geographical locations at any level in a tree-like structure, so the JSON format opens up a variety of use cases using a compact and flexible data structure.

Time series data modeling

Time series databases do not need any long introduction: they are data structures that can store data points happening at a certain time, indicated by a Unix timestamp expressed in milliseconds, with an associated numeric data value, typically with double precision. This data structure applies to many use cases, such as monitoring entities over time or tracking user activities for a determined service. Redis Stack has an integrated time series database that offers many useful features to manage the data points, for querying and searching, and provides convenient formatting commands for data processing and visualization. Beginning with time series modeling is straightforward:

We can create a time series from the command-line interface (or from any of the client libraries that support time series):
```
TS.CREATE "app:monitor:temp"
```

Storing samples into the time series can be done with the TS.ADD command. If we would like to store the temperature measured by the sensor of a meteorological station captured every few seconds, the commands would be as follows:

127.0.0.1:6379> "TS.ADD" "app:monitor:temp" "*" "20"
(integer) 1675632813307
127.0.0.1:6379> "TS.ADD" "app:monitor:temp" "*" "20"
(integer) 1675632818179
127.0.0.1:6379> "TS.ADD" "app:monitor:temp" "*" "20"
(integer) 1675632824174
127.0.0.1:6379> "TS.ADD" "app:monitor:temp" "*" "20.1"
(integer) 1675632829519
127.0.0.1:6379> "TS.ADD" "app:monitor:temp" "*" "20"
(integer) 1675632835052

We are instructing the database to insert the sample at the current time, so we specify the * argument. We can finally retrieve the samples stored in the time series for the desired interval:

127.0.0.1:6379> "TS.RANGE" "app:monitor:temp" "1675632818179" "1675632829519"
1) 1) (integer) 1675632818179
   2) 20
2) 1) (integer) 1675632824174
   2) 20
3) 1) (integer) 1675632829519
   2) 20.1

We have just scratched the surface of using time series with Redis Stack, because data may be aggregated, down-sampled, and indexed to address many different uses.

Probabilistic data structures

Deterministic data structures – all those structures that store and return the same data that was stored (such as Strings, Sets, Hashes, and the rest of Redis structures) – are a good solution for standard amounts of data, but they may become inadequate due to the constantly growing volumes of data that systems must handle. Redis offers several options to store and present data to extract different types of insights. Strings are an example because they can be encoded as integers and used as counters:

127.0.0.1:6379> INCR cnt
(integer) 1
127.0.0.1:6379> INCRBY cnt 3
(integer) 4

Strings can also be managed down to the bit level to store multiple integer counters of variable length and stored at different offsets of a single string to reduce storage overheads using the bitfield data structure:

127.0.0.1:6379> BITFIELD cnt INCRBY i5 0 5
1) (integer) 5
127.0.0.1:6379> BITFIELD cnt INCRBY i5 0 5
1) (integer) 10
127.0.0.1:6379> BITFIELD cnt GET i5 0
1) (integer) 10

Regular counters, sets, and hash tables perform well for any amount of data but handling large amounts of data represents a challenge to scale the resources of the machine where Redis Stack is running, because of its memory requirements.

Deterministic data structures have given way to probabilistic data structures because of the need to scale up to large quantities of data and give a reasonably approximated answer to questions such as the following:

How many different pages has the user visited so far?
What are the top players with the highest score?
Has the user already seen this ad?
How many unique values have appeared so far in the data stream?
How many values in the data stream are smaller than a given value?

In the attempt to give an answer to the first question in the list, we could calculate the hash of the URL of the visited page and store it in a Redis collection, such as a Set, and then retrieve the cardinality of the structure using the SCARD command. While this solution works very well (and is deterministically exact), scaling it to many users and many visited pages represents a cost.

Let’s consider an example with a probabilistic data structure. HyperLogLog estimates the cardinality of a set with minimal memory usage and computational overhead without compromising the accuracy of the results, while consuming only a fraction of memory and CPU, so you would count the visited pages and get an estimation as follows:

127.0.0.1:6379> PFADD pages "https://redis.com/" "https://redis.io/docs/stack/bloom/" "https://redis.io/docs/data-types/hyperloglogs/"
(integer) 1
127.0.0.1:6379> PFCOUNT pages
(integer) 3

Redis reports the following memory usage for HyperLogLog:

127.0.0.1:6379> MEMORY USAGE pages
(integer) 96

Attempting to resolve the same problem using a Set and storing the hashes for these URLs would be done as follows:

127.0.0.1:6379> SADD hashpages "522195171ed14f78e1f33f84a98f0de6" "f5518a82f8be40e2994fdca7f71e090d" "c4e78b8c136f6e1baf454b7192e89cd1"
(integer) 3
127.0.0.1:6379> MEMORY USAGE hashpages
(integer) 336

Probabilistic data structures trade accuracy for time and space efficiency and give an answer to this and other questions by addressing several data analysis problems against big amounts of data and, most relevantly, efficiently.

Programmability

Redis Stack embeds a serverless engine for event-driven data processing allowing users to write and run their own functions on data stored in Redis. The functions are implemented in JavaScript and executed by the engine upon user invocation or in response to events such as changes to data, execution of commands, or when events are added to a Redis Stream data structure. It is also possible to configure timed executions, so periodical maintenance operations can be scheduled.

Redis Stack minimizes the execution time by running the functions as close as possible to the data, improving data locality, minimizing network congestion, and increasing the overall throughput of the system.

With this capability, it is possible to implement event-driven data flows, thus opening the doors to many use cases, such as the following:

A basic library including a function can be implemented in text files, as in the following snippet:

#!js api_version=1.0 name=lib
redis.registerFunction('hello', function(){
    return 'Hello Gears!';
});

The lib.js file containing this function can then be imported into Redis Stack:
```
redis-cli -x TFUNCTION LOAD < ./lib.js
```
It can then be executed on demand from the command-line interface:
```
127.0.0.1:6379>  TFCALL lib.hello 0
"Hello Gears!"
```
Things become more interesting if we subscribe to data changes as follows:
```
redis.registerKeySpaceTrigger("key_logger", "user:", function(client, data){
    if (data.event == 'del'){
        client.call("INCR", "removed");
        redis.log(JSON.stringify(data));
        redis.log("A user has been removed");
    }
});
```
In this function, we do the following:
- We are subscribing to events against the keys prefixed by the “user:” namespace
- We check the command that triggered the event, and if it is a deletion, we act and specify what’s going to happen next
- The triggered action will be the increment of a counter, and it will also write a message into the server’s log

To test this function, we proceed to create and delete a user profile:

127.0.0.1:6379> HSET user:123 name "John" last "Smith"
(integer) 2
127.0.0.1:6379> DEL user:123
(integer) 1

A quick check of the server’s log verifies that the condition has been met, and the information logged:

299:M 05 Feb 2023 19:13:09.004 * <redisgears_2> {"event":"del","key":"user:123","key_raw":{}}
299:M 05 Feb 2023 19:13:09.005 * <redisgears_2> A user has been removed

And the counter has increased:

127.0.0.1:6379> GET removed
"1"

Through this book, we will come to understand the differences between Lua scripts, Redis functions, and JavaScript functions, and we will explore the many possible programmability features along with proposals to resolve challenging problems with simple solutions.

So, what is Redis Stack?

Redis Stack combines the speed and stability of the Redis server with a set of well-established capabilities and integrates them into a compact solution that is easy to install and manage – Redis Stack Server. The RedisInsight desktop application is a visualization tool and data manager that complements Redis Stack Server with a set of functionalities useful for visualizing data stored by different models as well as providing interactive tutorials with popular examples, and more.

To complete the picture, the Redis Stack Client SDK includes the most popular client libraries to develop against Redis Stack in the Java, Python, and JavaScript programming languages.

Figure 1.1 – The Redis Stack logo

Redis Stack empowers users with the liberty to use it for free in development and production environments and merges the open source BSD-licensed Redis with search and query capabilities, JSON support, time series handling, and probabilistic data structures. It is available under a dual license, specifically the Redis Source Available License (RSALv2) and the Server Side Public License (SSPL).

So, in a few examples, we have introduced new possibilities to modernize applications, and now we owe you an answer to the original question, “What is Redis Stack?”

Key-value storage

To define what Redis Stack is, we need to go back for a moment to its origins, because Redis is the spinal cord of Redis Stack. Redis was born as in-memory storage to accelerate massive amounts of queries and achieve sub-millisecond latency while optimizing memory usage and maximizing the ease of adoption and administration. It appeared at the same time as other solutions taking part in the NoSQL wave and deviating from relational modeling. While the key-value Memcached store was an already established solution, Redis became popular too as a type of key-value storage. So, we can surely say that Redis Stack can be used as a key-value store.

Data structure server

However, considering Redis Stack as a simple key-value data store is reductive. Redis is best known for its flexibility in storing collections such as Hashes, Sets, Sorted Sets or Lists, Bitmaps and Bitfields, Streams, HyperLogLog probabilistic data structures, and geo indexes. And, together with data structures, its efficient low-complexity algorithms make storing and searching data a joy for developers. We can certainly say that Redis Stack is also a data structure store.

Multi-model database

The features introduced so far are integrated into Redis Stack Server and extend the Redis server, turning the data structure server into a multi-model database. This provides a rich data modeling experience where multiple heterogeneous data structures such as documents, vectors, and time series coexist in the same database. Software architects will appreciate the variety of possibilities for designing new solutions without multiple specialized databases and software developers will be empowered with a rich set of client libraries that improve the ease of software design. Database administrators will discover how shallow the learning curve is to learn to administer a single database rather than installing, configuring, and maintaining several data stores.

Data platform

The characteristics discussed so far, together with stream processing and the possibility to execute JavaScript functions for event-driven development, push Redis Stack beyond the boundaries of the multi-model database definition. Combining Redis, the key-value data store that is popular as a cache, with advanced data structures and multi-model design, and with the capability of a message broker with event-driven programming features, turns Redis Stack into a powerful data platform.

We have completed the Redis Stack walk-through, and to conclude this chapter, we will briefly discuss how to install it using different methods.

Gabriel Cerioni Mar 12, 2024

Redis Stack for Application Modernization is an exceptional guide that stands out in the tech publication sphere. This book not only demystifies the advanced features of Redis Stack but does so with an eloquent, hands-on approach that empowers developers and architects alike to build scalable, multi-model applications with ease.Right from the start, the book dives into the multi-model capabilities of Redis Stack, breaking down complex topics like vector databases and probabilistic data structures into digestible, easily understandable sections. The key feature discussions are a highlight, presenting modern solutions such as vector similarity searches and document hybrid searches, which are game-changers for application development.What I found particularly beneficial were the real-world examples that tied theory to practice, making application modernization a tangible goal rather than a daunting task. The book's clear instructions on configuring a scalable and secure server using RedisInsight give it a practical edge that is often missing in other tech books.Additionally, the inclusive approach of integrating Redis Stack with popular programming languages like Java, Python, C#, Golang, and Node.js ensures that no reader is left behind. Whether you're looking to sharpen your existing database skills or venture into new territories with microservices and stream processing features, this book is a comprehensive guide that covers it all.By the end, I was not only educated on the administrative best practices for managing a Redis server but was also equipped with actionable insights on ensuring scalability and high availability. It's evident that this book is crafted for those who aim to stay at the forefront of database technology and application development.In conclusion, Redis Stack for Application Modernization is a must-read for any professional looking to modernize applications at scale. It’s a well-rounded resource that earns a solid five stars for its breadth of knowledge, clarity of writing, and practical applicability.

Amazon Verified review

Owen Taylor Mar 21, 2024

The style is human-friendly, with many examples and good reasoning for identifying why a particular strategy is applicable or not.Redis has grown to be much much more than a simple key/value store, this book shines light on all that is now possible.

Redis Stack for Application Modernization: Build real-time multi-model applications at any scale with Redis

What do you get with Print?

Redis Stack for Application Modernization

Introducing Redis Stack

Technical requirements

Exploring the history of Redis

The open source project

From key-value to multi-model real-time databases

Primary key lookup

Secondary key lookup

Pipelining

Using functions

Using indexes

Redis Stack capabilities

Querying, indexing, and searching documents

Time series data modeling

Probabilistic data structures

Programmability

So, what is Redis Stack?

Key-value storage

Data structure server

Multi-model database

Data platform

Redis Stack deployment types

Summary

Page 1 of 7

Key benefits

Description

Who is this book for?

What you will learn

Product Details

What do you get with Print?

Product Details

Frequently bought together

Table of Contents

Recommendations for you

Customer reviews

People who bought this also bought

About the authors

FAQs

Redis Stack for Application Modernization: Build real-time multi-model applications at any scale with Redis

What do you get with Print?

Contact Details

Shipping Address

Billing Address

Key benefits

Description

Who is this book for?

What you will learn

Product Details

What do you get with Print?

Contact Details

Shipping Address

Billing Address

Product Details

Packt Subscriptions

Frequently bought together

Table of Contents

Recommendations for you

Customer reviews

People who bought this also bought

About the authors

FAQs

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access