In this book, we will concentrate on two specific subprojects that offer support for Java Persistence API 2.0 and the Redis key-value store. But before we get to the point, we need to get a brief introduction to both the technologies. We need to do this for two reasons:
First, if we want to truly understand the benefits of Spring Data JPA, we need to have an idea of how database queries are created when the standard API is used. Once we compare these code samples with query creation code that uses Spring Data JPA, its benefits become clear.
Second, basic knowledge of the Redis key-value store will help us understand the second part of this book, which describes how we can use it in our applications. After all, we should be familiar with any technology that we use in our applications. Right?
In this chapter, we will cover the following topics:
The motivation behind the Java Persistence API
The main components of the Java Persistence API
How we can create database queries with the Java Persistence API
The data types supported by the Redis key-value store
The main features of the Redis key-value store
Before the Java Persistence API (JPA) was introduced, we had the following three alternative technologies that we could use to implement our persistence layer:
The persistence mechanism of EJB 2.x
The JDBC API
Third party object-relational mapping (ORM) frameworks
This gave us some freedom when selecting the best tool for the job but, as always, none of these options was problem free.
The problem with EJB 2.x was that it was too heavyweight and complicated. Its configuration relied on complicated XML documents and its programming model required a lot of boilerplate code. Also, EJB required that the application be deployed to a Java EE application server.
Programming against the JDBC API was rather simple and we could deploy our application in any servlet container. However, we had to write a lot of boilerplate code that was needed when we were transforming the information of our domain model to queries or building domain model objects from query results.
Third party ORM frameworks were often a good choice because they freed us from writing the unnecessary code that was used to build queries or to construct domain objects from query results. This freedom came with a price tag: objects and relational data are not compatible creatures, and even though ORM frameworks can solve most of the problems caused by the object-relational mismatch, the problems that they cannot solve efficiently are the ones that cause us the most pain.
The Java Persistence API provides a standard mechanism for implementing a persistence layer that uses relational databases. Its main motivation was to replace the persistence mechanism of EJB 2.x and to provide a standardized approach for object-relational mapping. Many of its features were originally introduced by the third party ORM frameworks, which later became implementations of the Java Persistence API. The following section introduces its key concepts and describes how we can create queries with it.
An entity is a persistent domain object. Each entity class generally represents a single database table, and an instance of such a class contains the data of a single table row. Each entity instance always has a unique object identifier, which is to an entity what a primary key is to a table row.
An entity manager factory creates entity manager instances. All entity manager instances created by the same entity manager factory will use the same configuration and database. If you need to access multiple databases, you must configure one entity manager factory for each database. The methods of the entity manager factory are specified by the EntityManagerFactory interface.
The entity manager manages the entities of the application. The entity manager can be used to perform CRUD (Create, Read, Update, and Delete) operations on entities and run complex queries against a database. The methods of an entity manager are declared by the EntityManager interface.
A persistence unit specifies all entity classes that are managed by the entity managers of the application. Each persistence unit contains all classes representing the data stored in a single database.
A persistence context contains entity instances. Inside a persistence context, there must be only one entity instance for each object identifier. Each persistence context is associated with a specific entity manager that manages the lifecycle of the entity instances contained by the persistence context.
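The rule that a persistence context holds only one instance per object identifier can be illustrated with a small identity map. The following is a minimal plain-Java sketch, not the JPA implementation: the classes `ContactRecord` and `SimplePersistenceContext` and the `loader` function are hypothetical names introduced only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical entity used only for this illustration.
class ContactRecord {
    final long id;
    String firstName;

    ContactRecord(long id, String firstName) {
        this.id = id;
        this.firstName = firstName;
    }
}

// Hypothetical, simplified stand-in for a persistence context: it hands out
// at most one ContactRecord instance per identifier.
class SimplePersistenceContext {
    private final Map<Long, ContactRecord> managed = new HashMap<>();
    private final LongFunction<ContactRecord> loader;
    private int loads = 0;

    // The loader stands in for reading a row from the database.
    SimplePersistenceContext(LongFunction<ContactRecord> loader) {
        this.loader = loader;
    }

    // Returns the managed instance, loading it only on the first request.
    ContactRecord find(long id) {
        ContactRecord existing = managed.get(id);
        if (existing != null) {
            return existing;
        }
        loads++;
        ContactRecord loaded = loader.apply(id);
        managed.put(id, loaded);
        return loaded;
    }

    // Number of times the loader was actually invoked.
    int loadCount() {
        return loads;
    }
}
```

Asking this sketch for the same identifier twice returns the same object reference, which is the property the persistence context guarantees for managed entities.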
The Java Persistence API introduced two new methods for creating database queries: Java Persistence Query Language (JPQL) and the Criteria API. The queries written by using these technologies do not deal directly with database tables. Instead, queries are written over the entities of the application and their persistent state. This ensures, in theory, that the created queries are portable and not tied to a specific database schema or database provider.
It is also possible to use SQL queries, but this ties the application to a specific database schema. If database provider specific extensions are used, our application is tied to the database provider as well.
Next we will take a look at how we can use the Java Persistence API to build database queries by using SQL, JPQL, and the Criteria API. Our example query will fetch all contacts whose first name is "John" from the database. This example uses a simple entity class called Contact that represents the data stored in the contacts table. The following table maps the entity's properties to the columns of the database:
Contact | contacts
---|---
firstName | first_name
SQL is a standardized query language that is designed to manage data that is stored in relational databases. The following code example describes how we can implement the specified query by using SQL:
    // Obtain an instance of the entity manager
    EntityManager em = ...
    // Build the SQL query string with a query parameter
    String getByFirstName = "SELECT * FROM contacts c WHERE c.first_name = ?1";
    // Create the Query instance
    Query query = em.createNativeQuery(getByFirstName, Contact.class);
    // Set the value of the query parameter
    query.setParameter(1, "John");
    // Get the list of results
    List contacts = query.getResultList();
This example teaches us three things:
We don't have to learn a new query language in order to build queries with JPA.
The created query is not type safe and we must cast the results before we can use them.
We have to run the application before we can verify our query for spelling or syntactical errors. This increases the length of the developer feedback loop and decreases productivity.
Because SQL queries are tied to a specific database schema (or to the database provider), we should use them only when it is absolutely necessary. Often the reason for using SQL queries is performance, but we might also have other reasons for using them. For example, we might be migrating a legacy application to JPA and not have time to convert every query at the beginning.
JPQL is a string-based query language with a syntax resembling that of SQL. Thus, learning JPQL is fairly easy as long as you have some experience with SQL. The code example that executes the specified query is as follows:
    // Obtain an instance of the entity manager
    EntityManager em = ...
    // Build the JPQL query string with a named parameter
    String getByFirstName = "SELECT c FROM Contact c WHERE c.firstName = :firstName";
    // Create the Query instance
    TypedQuery<Contact> query = em.createQuery(getByFirstName, Contact.class);
    // Set the value of the named parameter
    query.setParameter("firstName", "John");
    // Get the list of results
    List<Contact> contacts = query.getResultList();
This example tells us three things:
The created query is type safe and we don't have to cast the query results.
The JPQL query strings are very readable and easy to interpret.
The created query strings cannot be verified during compilation. The only way to verify our query strings for spelling or syntactical errors is to run our application. Unfortunately, this means that the length of the developer feedback loop is increased, which decreases productivity.
JPQL is a good choice for static queries. In other words, if the number of query parameters is always the same, JPQL should be our weapon of choice. But implementing dynamic queries with JPQL is often cumbersome as we have to build the query string manually.
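To see why dynamic queries are cumbersome with JPQL, consider a search where each filter is optional. The following is a minimal sketch of the manual string building involved. The class `ContactJpqlBuilder` is a hypothetical helper, and the `lastName` property is an assumption added for illustration; only `firstName` appears in the example entity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing why dynamic JPQL is tedious: the query string
// has to be assembled by hand depending on which filters are present.
class ContactJpqlBuilder {

    // firstName comes from the running example; lastName is an assumed property.
    // A null argument means "do not filter by this property".
    static String build(String firstName, String lastName) {
        List<String> conditions = new ArrayList<>();
        if (firstName != null) {
            conditions.add("c.firstName = :firstName");
        }
        if (lastName != null) {
            conditions.add("c.lastName = :lastName");
        }

        StringBuilder jpql = new StringBuilder("SELECT c FROM Contact c");
        if (!conditions.isEmpty()) {
            // Join the conditions so that WHERE and AND appear only when needed.
            jpql.append(" WHERE ").append(String.join(" AND ", conditions));
        }
        return jpql.toString();
    }
}
```

Even in this small case the caller must repeat the same null checks when setting the named parameters, because a parameter can be set only if it is present in the generated string.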
The Criteria API was introduced to address the problems found while using JPQL and to standardize the criteria efforts of third party ORM frameworks. It is used to construct query definition objects, which are transformed to the executed SQL query. The next code example demonstrates how we can implement our query by using the Criteria API:
    // Obtain an instance of the entity manager
    EntityManager em = ...
    // Get the criteria builder
    CriteriaBuilder cb = em.getCriteriaBuilder();
    // Create the criteria query
    CriteriaQuery<Contact> query = cb.createQuery(Contact.class);
    // Create the query root
    Root<Contact> root = query.from(Contact.class);
    // Create the condition for the first name by using the static
    // metamodel. You can also use "firstName" here.
    Predicate firstNameIs = cb.equal(root.get(Contact_.firstName), "John");
    // Specify the where condition of the query
    query.where(firstNameIs);
    // Create the typed query and get the results
    TypedQuery<Contact> q = em.createQuery(query);
    List<Contact> contacts = q.getResultList();
We can see three things from this example:
The created query is type safe and results can be obtained without casting
The code is not as readable as the corresponding code that uses SQL or JPQL
Since we are dealing with a Java API, the Java compiler ensures that it is not possible to create syntactically incorrect queries
The Criteria API is a great tool if we have to create dynamic queries. The creation of dynamic queries is easier because we can deal with objects instead of building query strings manually. Unfortunately, when the complexity of the created query grows, the creation of the query definition object can be troublesome and the code becomes harder to understand.
Redis is an in-memory data store that keeps its entire data set in memory and uses disk space only as secondary persistent storage. Therefore, Redis can provide very fast read and write operations. The catch is that the size of the Redis data set cannot exceed the amount of available memory. The other features of Redis include:
Support for complex data types
Multiple persistence mechanisms
Master-slave replication
Implementation of the publish/subscribe messaging pattern
These features are described in the following subsections.
Each value stored by Redis has a key. Both keys and values are binary safe, which means that the key or the stored value can be either a string or the content of a binary file. However, Redis is more than just a simple key-value store. It supports multiple binary safe data types, which should be familiar to every programmer. These data types are as follows:
String: This is a data type where one key always refers to a single value.
List: This is a data type where one key refers to multiple string values, which are sorted in insertion order.
Set: This is a collection of unordered strings that cannot contain the same value more than once.
Sorted set: This is similar to a set but each of its values has a score which is used to order the values of a sorted set from the lowest score to the highest. The same score can be assigned to multiple values.
Hash: This is a data type where a single hash key always refers to a specific map of string keys and values.
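As a rough analogy for Java programmers, a Redis list behaves like a `List<String>`, a set like a `Set<String>`, and a hash like a `Map<String, String>`. The sorted set has no direct counterpart in the standard collections, so the following minimal sketch models it in plain Java. It is an in-memory illustration, not a Redis client; the class `SortedSetSketch` and its methods are hypothetical names, and breaking ties by comparing the members as strings is a simplification chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a sorted set; it is not a Redis client.
class SortedSetSketch {
    // Each member appears at most once and carries exactly one score.
    private final Map<String, Double> scores = new HashMap<>();

    // Adds a member, or updates its score if the member already exists.
    void add(String member, double score) {
        scores.put(member, score);
    }

    // Returns the members ordered from the lowest score to the highest.
    // Members with the same score are ordered by comparing them as strings.
    List<String> membersByScore() {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scores.entrySet());
        entries.sort(Map.Entry.<String, Double>comparingByValue()
                .thenComparing(Map.Entry.<String, Double>comparingByKey()));

        List<String> members = new ArrayList<>();
        for (Map.Entry<String, Double> entry : entries) {
            members.add(entry.getKey());
        }
        return members;
    }
}
```

Adding the same member twice with different scores keeps a single entry with the latest score, while two different members may share a score.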
Redis supports two persistence mechanisms that can be used to store the data set on disk. They are as follows:
RDB is the simplest persistence mechanism of Redis. It takes snapshots from the in-memory data sets at configured intervals, and stores the snapshot on disk. When a server is started, it will read the data set back to the memory from the snapshot file. This is the default persistence mechanism of Redis.
RDB maximizes the performance of your Redis server, and its file format is really compact, which makes it a very useful tool for disaster recovery. Also, if you want to use master-slave replication, you have to use RDB because the RDB snapshots are used when the data is synchronized between the master and the slaves.
However, if you have to minimize the chance of data loss in all situations, RDB is not the right solution for you. Because RDB persists the data at configured intervals, you can always lose the data that was written to your Redis instance after the last snapshot was saved to disk.
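The data loss window of snapshot-based persistence can be demonstrated with a small model. The following is a minimal plain-Java sketch under the assumption that a snapshot is simply a copy of the in-memory map; it does not reflect how Redis is implemented, and `SnapshotStoreSketch` and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of snapshot-style persistence; not how Redis is implemented.
class SnapshotStoreSketch {
    private Map<String, String> memory = new HashMap<>();
    private Map<String, String> snapshotOnDisk = new HashMap<>();

    void set(String key, String value) {
        memory.put(key, value);
    }

    String get(String key) {
        return memory.get(key);
    }

    // Stands in for a snapshot taken at a configured interval.
    void saveSnapshot() {
        snapshotOnDisk = new HashMap<>(memory);
    }

    // Stands in for a crash followed by a restart: the in-memory data set
    // is rebuilt from the last snapshot only.
    void crashAndRestart() {
        memory = new HashMap<>(snapshotOnDisk);
    }
}
```

Any value written between the last call to `saveSnapshot()` and the call to `crashAndRestart()` is gone after the restart, which mirrors the trade-off described above.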
Append Only File (AOF) is a persistence model that logs each operation changing the state of the in-memory data set to a specific log file. When a Redis instance is started, it will reconstruct the data set by executing all operations found in the log file.
The advantage of AOF is that it minimizes the chance of data loss in all situations. Also, since the log file is an append log, it cannot be irreversibly corrupted. On the other hand, AOF log files are usually larger than RDB files for the same data, and AOF can be slower than RDB if the server is experiencing a huge write load.
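The idea of rebuilding state by replaying a log can be sketched in a few lines. The following is a minimal plain-Java illustration, not the actual AOF file format or the Redis implementation; the class `AppendOnlyStoreSketch`, its methods, and the in-memory list that stands in for the log file are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of append-only persistence; not the real AOF format.
class AppendOnlyStoreSketch {
    // Stands in for the log file: every state-changing operation is appended.
    private final List<String[]> log = new ArrayList<>();
    private Map<String, String> memory = new HashMap<>();

    void set(String key, String value) {
        log.add(new String[] {"SET", key, value});
        memory.put(key, value);
    }

    void delete(String key) {
        log.add(new String[] {"DEL", key});
        memory.remove(key);
    }

    String get(String key) {
        return memory.get(key);
    }

    int logSize() {
        return log.size();
    }

    // Stands in for a crash followed by a restart: the data set is rebuilt
    // by executing every logged operation in order.
    void crashAndRestart() {
        memory = new HashMap<>();
        for (String[] operation : log) {
            if (operation[0].equals("SET")) {
                memory.put(operation[1], operation[2]);
            } else {
                memory.remove(operation[1]);
            }
        }
    }
}
```

The sketch also hints at why such a log tends to be larger than a snapshot: four logged operations can end up describing a data set that contains a single key.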
You can also enable both persistence mechanisms and get the best of both worlds. You can use RDB for creating backups of your data set and still ensure that your data is safe. In this case, Redis will use the AOF log file for building the data set on a server startup because it is most likely that it contains the latest data.
If you are using Redis as temporary data storage and do not need persistence, you can disable both persistence mechanisms. This means that the data sets will be destroyed when the server is shut down.
Redis supports master-slave replication where a single master can have one or multiple slaves. Each slave is an exact copy of its master, and it can connect to both master and other slaves. In other words, a slave can be a master of other slaves. Since Redis 2.6, each slave is read-only by default, and all write operations to a slave are rejected. If we need to store temporary information to a slave, we have to configure that slave to allow write operations.
Replication is non-blocking on both sides. It will not block the queries made to the master even when a slave or slaves are synchronizing their data for the very first time. Slaves can be configured to serve the old data when they are synchronizing their data with the master. However, incoming connections to a slave will be blocked for a short period of time when the old data is replaced with the new data.
If a slave loses connection to the master, it will either continue serving the old data or return an error to the clients, depending on its configuration. When a connection between master and a slave is lost, the slave will automatically reopen the connection and send a synchronization request to the master.
The publish/subscribe messaging pattern is a messaging pattern where the message sender (publisher) does not send messages directly to the receiver (subscriber). Instead, an additional element called a channel is used to transport messages from the publisher to the subscriber. Publishers can send a message to one or more channels. Subscribers choose the channels they are interested in and receive the messages sent to them by subscribing to those channels.
Let's think of a situation where a single publisher is publishing messages to two channels, Channel 1 and Channel 2. Channel 1 has two subscribers: Subscriber 1 and Subscriber 2. Channel 2 also has two subscribers: Subscriber 2 and Subscriber 3. This situation is illustrated in the following figure:

The publish/subscribe pattern ensures that the publishers are not aware of the subscribers and vice versa. This gives us the possibility to divide our application into smaller modules, which have loose coupling between them. This makes the modules easier to maintain and replace if needed.
However, the greatest advantage of the publish/subscribe pattern is also its greatest weakness. Firstly, our application cannot rely on the fact that a specific component has subscribed to a specific channel. Secondly, there is no clean way for us to verify if this is the case. In fact, our application cannot assume that anyone is listening.
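The decoupling described above, including the case where nobody is listening, can be modeled with a small in-memory broker. The following is a minimal plain-Java sketch and not a Redis client; the class `ChannelBrokerSketch` and its methods are hypothetical names introduced for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory broker; publishers and subscribers know only
// the channel name, never each other.
class ChannelBrokerSketch {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    // Registers a subscriber callback for the given channel.
    void subscribe(String channel, Consumer<String> subscriber) {
        subscribers.computeIfAbsent(channel, name -> new ArrayList<>()).add(subscriber);
    }

    // Delivers the message to every subscriber of the channel and returns
    // how many subscribers received it (0 when nobody is listening).
    int publish(String channel, String message) {
        List<Consumer<String>> receivers = subscribers.get(channel);
        if (receivers == null) {
            return 0;
        }
        for (Consumer<String> receiver : receivers) {
            receiver.accept(message);
        }
        return receivers.size();
    }
}
```

With two subscribers on each of two channels, one of them shared, the shared subscriber receives the messages of both channels, while publishing to a channel without subscribers silently delivers nothing.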
Redis offers solid support for the publish/subscribe pattern. The main features of its publish/subscribe implementation are:
Publishers can publish messages to one or more channels at the same time
Subscribers can subscribe to the channels they are interested in by using the name of the channel or a pattern containing a wildcard
Unsubscribing from channels also supports both name and pattern matching
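Subscribing by pattern means that a channel name is compared against a pattern containing a wildcard. The following minimal plain-Java sketch shows one way such matching could work. It handles only the `*` wildcard (any sequence of characters) and is not the matching code used by Redis; the class `ChannelPatternSketch` and the channel names in the usage note are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical matcher that supports only the '*' wildcard.
class ChannelPatternSketch {

    // Returns true if the channel name matches the pattern, where '*'
    // stands for any sequence of characters (including none).
    static boolean matches(String pattern, String channel) {
        // Split on '*' and keep empty parts so leading and trailing
        // wildcards are preserved.
        String[] parts = pattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            if (!parts[i].isEmpty()) {
                // Quote the literal text so characters such as '.' are not
                // treated as regular expression syntax.
                regex.append(Pattern.quote(parts[i]));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL)
                .matcher(channel)
                .matches();
    }
}
```

For example, the pattern `news.*` matches a channel called `news.sports` but not one called `weather.today`.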
In this chapter, we have learned that:
The Java Persistence API was introduced to address the concerns related to EJB 2.x and to provide a standard approach for object-relational mapping. Its features were selected from the features of the most popular third party persistence frameworks.
Redis is an in-memory data store, which keeps its entire data set in memory, supports complex data types, can use disk as a persistent storage, and supports master-slave replication. It also has an implementation of the publish/subscribe messaging pattern.
In the next chapter we will learn how we can set up a web application project that uses Spring Data JPA and use it to implement a simple contact manager application.