Mastering Hibernate

Chapter 1. Entity and Session

In this chapter, we will take an in-depth look at sessions and entities and their lifecycles. It is important to understand the concepts of session and entity when we talk about design concepts, such as session-per-request, session-per-conversation, stateless sessions, and contextual sessions, and discuss the various states of an entity. After all, the only way to master anything is by paying attention to the details. We also explore entities beyond their JPA characteristics and look at Hibernate entities to see the benefits of one over the other. Furthermore, we discuss proxy objects and how they are used.

In this chapter, we will cover the following topics:

Why this book?
Quick Hibernate
Hibernate session:
- Session internals
- Contextual sessions
- Sessions per request, per conversation, and per operation
- Stateless sessions
Hibernate entities:
- Entity lifecycle
- Types of entities
- Identity crisis
- Beyond JPA
Proxy objects
Batch processing:
- Manual batch management
- Setting the size of the batch
- Using a stateless session

Why this book?

Java developers solve problems using object-oriented concepts. We design applications using classes to model business entities. Furthermore, we use inheritance to express that a class is another kind of class, use composition to build a class from primitive fields and other classes, and visualize the application data as objects or object graphs. However, we also have a persistence problem.

Traditionally, the storage unit is implemented using structured records (tuples), which are stored in tables (relations) that may or may not be associated with each other. This concept is supported by a declarative language, which is limited in scope and is primarily for data creation and data manipulation. Tuples and objects have a lot in common; they both have attributes (columns and fields), and the attributes have data types (int, char, and so on), but the persistence problem becomes evident when you look at the differences between tuples and objects, such as identity, equality, or inheritance.

Object-Relational Mapping is a hard problem. Luckily, Hibernate makes this easy. You probably discovered this by reading the first few chapters of the Hibernate online documentation or another book; and because you have to meet tight deadlines, you solve your problems reactively when they surface by swiftly paging through a book, or by searching or posting on Stack Overflow and other online forums or blogs. You spend half a day trying to find your answer and then move on until the next problem surfaces. I have done it, you have done it; we all do it.

However, what if you knew about the internals of Hibernate and how it works? You wouldn't need to know everything about Hibernate, but you would know exactly where to look to find your answer quickly, as you would with a dictionary.

This book was written to explore the fundamental concepts of Hibernate and discuss them in detail, so that the next time you run into a problem, you can identify the issue and quickly find the answer that you want. For example, you will be able to tell whether a problem is a mapping problem or just improper use of an annotation. Furthermore, you will design better software once you understand the internals of any framework that you decide to use.

The main objectives of this book are to help you understand Hibernate beyond the basics, make you appreciate the ORM problem, and show you why Hibernate is one of the best solutions that exists today. We focus more on the Hibernate API and occasionally explore the JPA counterpart. This book assumes that you have a basic understanding of Hibernate and have used it in the past, or you are currently using it. If this is not the case for you, please visit the Hibernate documentation online, as it offers guides to get started with Hibernate and more.

Quick Hibernate

In this section, we take a glance at a typical Hibernate application and its components. Hibernate is designed to work in a standalone application as well as a Java Enterprise application, such as a Web or EJB application. All topics discussed here are covered in detail throughout this book.

The standalone version of a Hibernate application consists of the components that are shown in the following figure:

We, the application developers, create the components that are depicted by the white boxes, namely the data access classes, entity classes, and configuration and mapping. Everything else is provided in the form of a JAR or a runtime environment.

The Hibernate session factory is responsible for creating a session when your application requests it. The factory is configured using the configuration files that you provide. Some of the configuration settings are used for JDBC parameters, such as database username, password, and connection URL. Other parameters are used to modify the behavior of a session and the factory.

You may already be familiar with the configuration file that looks like the following:

<hibernate-configuration>
    <session-factory>
        <property name="connection.driver_class">
            org.postgresql.Driver
        </property>
        <property name="connection.url">
            jdbc:postgresql://localhost:5432/packtdb
        </property>
        <property name="connection.username">user</property>
        <property name="connection.password">pass</property>
        <property name="connection.pool_size">1</property>
        <property name="dialect">
           org.hibernate.dialect.PostgreSQLDialect
        </property>
        <property name="current_session_context_class">
            thread
        </property>
        <property name="show_sql">true</property>
        <property name="format_sql">true</property>        
    </session-factory>
</hibernate-configuration>

The initialization of the session factory has changed slightly in the newer versions of Hibernate. You now have to use a service registry, as shown here:

private static SessionFactory buildSessionFactory() {
    try {
        // Create the SessionFactory from hibernate.cfg.xml
        // in the resources directory
        Configuration configuration = new Configuration()
            .configure()
            .addAnnotatedClass(Person.class);
        StandardServiceRegistryBuilder builder =
            new StandardServiceRegistryBuilder()
                .applySettings(configuration.getProperties());
        // serviceRegistry is assumed to be a static field
        // declared in the enclosing class
        serviceRegistry = builder.build();
        return configuration
            .buildSessionFactory(serviceRegistry);
    }
    catch (Throwable ex) {
        // do something with the exception
        throw new ExceptionInInitializerError(ex);
    }
}

The data objects are represented by Hibernate entities. A simple Hibernate entity, which uses annotation, is shown here:

@Entity
public class Person {
  @Id
  @GeneratedValue
  private long id;
  private String firstname;
  private String lastname;
  private String ssn;
  private Date birthdate;
  // getters and setters
}

The last component that we have to create is the data access class, which is the service that provides the Create, Read, Update, Delete (CRUD) operations for entities. An example of storing an entity is shown here:

Session session = HibernateUtil.getSessionFactory()
  .getCurrentSession();
Transaction transaction = session.beginTransaction();

try {
  Person person = new Person();
  person.setFirstname("John");
  person.setLastname("Williams");
  person.setBirthdate(randomBirthdate());
  person.setSsn(randomSsn());
  session.save(person);
  transaction.commit();
} catch (Exception e) {
  transaction.rollback();
  e.printStackTrace();
} finally {
  if (session.isOpen())
    session.close();
}

That's it!

The Java Enterprise application doesn't look very different from the standalone version. The difference is mainly in the application stack and where each component resides, as shown in the following diagram:

This example provides a context of what we will discuss in detail throughout this book. Let's begin by taking a closer look at a Hibernate session.

Working with a session

The core operations of Hibernate occur in the session. This is where a connection to the database is obtained, Structured Query Language (SQL) statements are composed, type conversions are made, and transactions and the persistence context are managed. Let's start by getting to know the internals of a Hibernate session.

Session internals

If you have written at least one Hibernate application in the past, you already know what an entity and a session are. However, most of the time, developers don't think about the session, the entity, and their lifecycles, or about what occurs inside a session when an entity is saved or updated. There are fundamental concepts that one must understand in order to utilize Hibernate effectively.

The Hibernate session is where persistence work is performed for each thread of execution, and it manages the persistence context. Therefore, it's not thread-safe; this means that multiple threads should not access or use the same session at the same time. As you may know, sessions are created by calling the session factory, and there is only one factory per storage unit, although you can have multiple session factories pointing to different databases.

Furthermore, the session is designed to be a short-lived object. This is an important constraint that is typically imposed by the database and the application server, because there is always a timeout setting on connections and transactions. (There is even a timeout setting at the Java Database Connectivity (JDBC) level. Furthermore, you have to worry about the TCP socket timeout.) Even though these settings are set to some number of seconds, you should still avoid long-running logic while the session is open, because you may create contention in the database and degrade the performance of the whole system. Hibernate tries to protect you by not allocating any resources, such as database connections, unless they are absolutely needed. However, you still have to be mindful of the work that you do within the unit of persistence work. As long as you limit the code to persistence-related tasks while you have an open session, you will be fine.

When you create a session factory, Hibernate reads the configuration file and first loads the SQL dialect class. This class includes a mapping for the database types and operations of the specific Relational Database Management System (RDBMS) server that you work with. Hibernate then adds all these types, as well as the user-defined types, to a type resolver. (You will learn about creating custom types in Chapter 2, Advanced Mapping.)

Additionally, when a session factory is created, Hibernate loads all the entities that are added to the configuration. You can add annotated entities to the configuration in the code, or create an entity mapping file and add the entity map to the configuration file. For each entity, Hibernate precomposes the JDBC-prepared SQL statements for select, insert, update, and delete. (It also creates one to support entity versioning; refer to the Envers section in Chapter 6, Events, Interceptors, and Envers.) This is called static SQL. Some entity types require dynamic SQL, for example, dynamic entities.
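To make the idea of static SQL concrete, here is a minimal sketch in plain Java. It is not Hibernate code: the `StaticSqlSketch` class and the `person` table are hypothetical, and the statements are far simpler than what a real entity persister generates. It only illustrates that the four statements can be composed once, up front, from the table and column names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not a Hibernate class: composes the CRUD
// statements for one entity once, so they can be reused later.
final class StaticSqlSketch {
    final String insert;
    final String select;
    final String update;
    final String delete;

    StaticSqlSketch(String table, String idColumn, List<String> columns) {
        String columnList = String.join(", ", columns);
        String placeholders =
            String.join(", ", Collections.nCopies(columns.size(), "?"));
        String assignments = columns.stream()
            .map(column -> column + " = ?")
            .collect(Collectors.joining(", "));

        insert = "insert into " + table + " (" + columnList
            + ") values (" + placeholders + ")";
        select = "select " + idColumn + ", " + columnList
            + " from " + table + " where " + idColumn + " = ?";
        update = "update " + table + " set " + assignments
            + " where " + idColumn + " = ?";
        delete = "delete from " + table + " where " + idColumn + " = ?";
    }

    public static void main(String[] args) {
        StaticSqlSketch sql = new StaticSqlSketch(
            "person", "id", Arrays.asList("firstname", "lastname"));
        System.out.println(sql.insert);
        System.out.println(sql.select);
        System.out.println(sql.update);
        System.out.println(sql.delete);
    }
}
```

Because the statements depend only on the mapping and not on the values of a particular instance, they can be prepared once per entity type. Dynamic SQL, by contrast, has to be composed per operation.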

Hibernate also creates internal maps for each entity property and its association with other entities and collections. This is because Hibernate uses reflection to access properties or to invoke methods on your entity. In addition to the entity properties, Hibernate also keeps track of all the annotations that are associated with properties and methods so that it can perform certain operations when needed, such as cascade operations.
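The following sketch shows, in plain Java, the kind of reflective access this paragraph refers to. The `ReflectionSketch` class and its nested `Person` class are hypothetical and stand in for an entity; Hibernate's own property accessors are considerably more elaborate.

```java
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

final class ReflectionSketch {
    // Hypothetical entity-like class with private state and no accessors.
    static final class Person {
        private long id = 42L;
        private String firstname = "John";
    }

    // Reads every declared field of the target, including private ones,
    // and returns the values keyed by field name.
    static Map<String, Object> readFields(Object target) {
        Map<String, Object> values = new LinkedHashMap<>();
        for (Field field : target.getClass().getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler- or tool-generated fields
            }
            try {
                field.setAccessible(true); // bypass the private modifier
                values.put(field.getName(), field.get(target));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(readFields(new Person()));
    }
}
```

Looking up fields reflectively on every call is costly, which is one reason to build maps of the properties once, at startup, as described above.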

When you obtain the current session from the factory, the factory delegates to an implementation of the CurrentSessionContext interface, and which implementation is used depends on the session context class. There are three session contexts that are natively supported: Java Transaction API (JTA), thread, and managed contexts. This is set using the current_session_context_class configuration parameter, and Hibernate has reserved the shortcuts, jta, managed, and thread, to expand to the corresponding internal classes. However, you can replace this with any class that implements org.hibernate.context.spi.CurrentSessionContext.

Note

Starting with version 4, Hibernate has repackaged the classes to follow the OSGi model for more modular, pluggable, and extensible applications and frameworks. Many of the named resources, such as dialect and connection provider, are now managed through services. Services have a lifecycle (initialize, start, and stop), a rich callback API, and provide support for JMX and CDI, among others. There are three main packages in Hibernate, the API packages, the SPI packages, and the internal packages. The classes in the API packages are the ones that we utilize to use Hibernate. The classes in the Service Provider Interface (SPI) are pluggable modules that are typically replaced or provided by vendors who want to implement or override certain components of Hibernate. Finally, the classes in the internal packages are used internally by Hibernate. We will come back to this in Chapter 6, Events, Interceptors, and Envers, when we discuss events.

Transaction management means different things for the internal session contexts. This is an important architectural discussion, which will be covered in detail in Chapter 8, Addressing Architecture. However, in the next section, we will discuss contextual sessions, and for this, we need to define session scope and transaction boundaries.

The persistence unit of work begins when you start a new session. Hibernate will not allow modification to the persistence context without an active transaction. You either begin a local transaction (in the thread session context, where the transaction comes straight from JDBC), or one is started by JTA or the managed context. The unit of persistence work ends when you commit or roll back a transaction. This also closes the session automatically, assuming that the default is not overridden. If you start a local transaction and don't commit or roll it back, Hibernate quietly clears the persistence context when you close the session. If you don't close the session after your work is done, you will most definitely have a connection leak. (This behavior varies for different session contexts.)

When you call various methods on the session object, these methods are translated into corresponding events. For example, the session.save() method is translated into an instance of the SaveOrUpdateEvent class, and the actual operations are managed by event listeners. Each session has a list of event listeners, which perform certain operations for each event that is fired off in the execution path. As another example, when you check whether the session is dirty by calling session.isDirty(), a DirtyCheckEvent is fired off to check the action queue to see whether any actions are queued up, and if so, it marks the session as dirty.

So what is the action queue? Most Hibernate events correspond to one or more actions. Each session has an instance of the ActionQueue class that holds a list of various actions. These actions are simple insert, delete, and update actions on entities and collections. While you are working within a session and updating the persistence context, actions get queued up as various events are fired. Finally, at the end of the transaction, on commit, these actions are translated into Data Manipulation Language (DML) statements by the corresponding entity persister classes (for example, SingleTableEntityPersister), which are then executed in the database. (The composition of the SQL statements is managed by classes in the org.hibernate.sql package, which use the dialect classes to form a syntactically correct SQL statement.)
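The queue-then-execute behavior described above can be modeled in a few lines of plain Java. This is a deliberately simplified sketch, not Hibernate's ActionQueue: the `ActionQueueSketch` class is hypothetical, actions are plain strings, and the dirty check only looks at the queue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of delayed DML (write-behind); not Hibernate's ActionQueue.
final class ActionQueueSketch {
    private final List<String> pending = new ArrayList<>();
    private final List<String> executed = new ArrayList<>();

    // Session operations only queue an action; nothing is executed yet.
    void save(String entity)   { pending.add("insert " + entity); }
    void delete(String entity) { pending.add("delete " + entity); }

    // Mirrors the idea behind the dirty check: is anything queued up?
    boolean isDirty() { return !pending.isEmpty(); }

    // On flush, the queued actions are executed in order and the queue is drained.
    void flush() {
        executed.addAll(pending);
        pending.clear();
    }

    List<String> executedStatements() { return executed; }

    public static void main(String[] args) {
        ActionQueueSketch queue = new ActionQueueSketch();
        queue.save("Person#1");
        queue.delete("Person#2");
        System.out.println(queue.isDirty());            // true
        System.out.println(queue.executedStatements()); // []
        queue.flush();
        System.out.println(queue.isDirty());            // false
        System.out.println(queue.executedStatements().equals(
            Arrays.asList("insert Person#1", "delete Person#2"))); // true
    }
}
```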

This is basically what happens inside a Hibernate session from the time it is created until it is closed. Next, we will discuss various session contexts and how they differ.

Note

What is the difference between a session and an entity manager?

Session is the Hibernate API to manage the persistence context. Entity manager is its counterpart in the JPA world. Although new versions of Hibernate implement the JPA specifications, you still have a choice to use Hibernate or JPA APIs. Your code is fully portable if you choose the JPA APIs, regardless of the implementation. On the other hand, you will have more control if you choose the Hibernate API.

Isn't the session the same as the persistence context?

No. Besides doing a lot of other things, the session also manages the persistence context, which happens to be its main job. Think of the persistence context as your copy of the database rows in memory, managed by the session. (Obviously, these are only the rows that you are working with.)

Contextual session

The session behaves differently in various contexts. This behavior is defined in terms of session scope, transaction boundaries, and the cleanup work. As mentioned earlier, there are three types of contextual sessions that are natively supported by Hibernate. These are as follows:

JTASessionContext
ThreadLocalSessionContext
ManagedSessionContext

All of these implement the CurrentSessionContext interface. Simply put, the context defines the scope of the current session.

The scope of JTA session context is defined by the transaction that is being managed by JTA. In this case, the current session is bound to the JTA transaction and, therefore, the cleanup is triggered by JTA when lifecycle events are fired off. Once the transaction is committed, the session is flushed, cleared, and then closed. If your application runs in an environment where a transaction manager is deployed, you should always use this context.

The scope of the thread-local session context is defined by the current thread. This context is best suited for unit tests or standalone applications, as it is not meant for use in an enterprise application. In this case, the current session is bound to the current thread and the transaction that you start comes straight from JDBC. If you use the Hibernate transaction API (that is, Transaction transaction = session.beginTransaction();), and you should, it will perform the cleanup for you.

The scope of the managed session context is somewhat defined by the current thread, but the scope can expand over multiple threads. In this case, the session outlives the thread that created it and may be flushed and closed by a subsequent thread. In other words, you are defining the scope of the session, and you have to manually handle cleanup work. You are managing the session.

Note

What does flush do?

When you modify the persistence context by adding new entities or updating existing ones, the database and the persistence context are not synchronized until the end of the persistence work. Only after flush() is called are changes to Hibernate entities propagated to the corresponding tables and rows. Hibernate offers very powerful capabilities to manage flush behavior, that is, to control when this synchronization actually occurs. You will see more on this later.

In other words, when you call getCurrentSession() on the Hibernate session factory, the behavior is as follows:

thread: This session factory API returns the current session that is associated with the current thread. (If one doesn't exist, this will create one and associate it with the current thread.)
jta: This session factory API returns the current session that is associated with the current global transaction. In case of none, one is created and associated with the current global transaction through JTA.
managed: You'll have to use ManagedSessionContext to obtain the correct session for the current thread. This is useful when you want to call multiple data access classes and don't want to pass the session object to each class. Refer to the following discussion on session per operation.
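The thread case above can be sketched with a plain ThreadLocal. This is only a model of the lookup-or-create behavior, under the assumption that one session is bound per thread; `ThreadBoundSessionSketch` and `FakeSession` are hypothetical names and do not correspond to Hibernate classes.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ThreadBoundSessionSketch {
    // Hypothetical stand-in for a session; it only carries an id.
    static final class FakeSession {
        private static final AtomicInteger COUNTER = new AtomicInteger();
        final int id = COUNTER.incrementAndGet();
    }

    private static final ThreadLocal<FakeSession> CURRENT = new ThreadLocal<>();

    // Returns the session bound to the calling thread, creating and
    // binding a new one if none exists yet.
    static FakeSession getCurrentSession() {
        FakeSession session = CURRENT.get();
        if (session == null) {
            session = new FakeSession();
            CURRENT.set(session);
        }
        return session;
    }

    // Cleanup step: removes the binding for the calling thread.
    static void unbind() {
        CURRENT.remove();
    }

    // Helper for the demonstration: asks for the current session on a new thread.
    static FakeSession sessionOnNewThread() {
        FakeSession[] holder = new FakeSession[1];
        Thread worker = new Thread(() -> holder[0] = getCurrentSession());
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return holder[0];
    }

    public static void main(String[] args) {
        FakeSession first = getCurrentSession();
        System.out.println(first == getCurrentSession());    // true: same thread
        System.out.println(first == sessionOnNewThread());   // false: other thread
        unbind();
    }
}
```

The sketch also shows why cleanup matters in this context: a binding that is never removed stays attached to the thread, which is a problem when threads are pooled and reused.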

The JTA and thread-local session contexts might be what you are used to and are easier to understand. The managed session context is best for long-running conversations that represent a business unit of work, which spans multiple requests. If this sounds a bit cryptic, do not worry; we will come back to this later on.

Session per request

In this design pattern, all the persistence work for each client request is accomplished within one session. If all the business transactions within a unit of work can be encapsulated in one Data Access Object (DAO) implementation, then you can start a session and a transaction at the beginning of your method, and commit at the end. (Don't forget proper exception handling!)

If your business unit of work spans multiple DAO classes, then you have to make sure that they all use the same session. Hibernate makes this easy for you by providing the sessionFactory.getCurrentSession() API. This will allow you to access the session object from anywhere within the same thread.

However, here's the catch. You need to make sure that somebody has started the transaction and that its commit and rollback are also delegated appropriately. This can be orchestrated in a service method, where you can start a Hibernate session, begin a transaction, and either store the session object in a static ThreadLocal variable or pass the session object to each DAO instance, as a constructor argument or directly to each method. Once the orchestration is completed, you can commit the transaction.
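The orchestration described above might look like the following sketch, which uses the constructor-argument variant. Everything here is hypothetical and written in plain Java: `UnitOfWork` stands in for a session with a local transaction, and the two DAO classes only record what they were asked to do.

```java
import java.util.ArrayList;
import java.util.List;

final class SessionPerRequestSketch {
    // Hypothetical stand-in for a session with a local transaction.
    static final class UnitOfWork {
        final List<String> log = new ArrayList<>();
        void begin()    { log.add("begin"); }
        void commit()   { log.add("commit"); }
        void rollback() { log.add("rollback"); }
        void close()    { log.add("close"); }
    }

    // Each DAO receives the shared unit of work instead of opening its own.
    static final class PersonDao {
        private final UnitOfWork work;
        PersonDao(UnitOfWork work) { this.work = work; }
        void save(String name) { work.log.add("save person " + name); }
    }

    static final class AddressDao {
        private final UnitOfWork work;
        AddressDao(UnitOfWork work) { this.work = work; }
        void save(String street) {
            if (street.isEmpty()) {
                throw new IllegalArgumentException("street is required");
            }
            work.log.add("save address " + street);
        }
    }

    // The service method owns the transaction boundaries for the whole request.
    static List<String> handleRequest(String name, String street) {
        UnitOfWork work = new UnitOfWork();
        work.begin();
        try {
            new PersonDao(work).save(name);
            new AddressDao(work).save(street);
            work.commit();
        } catch (RuntimeException e) {
            work.rollback();
        } finally {
            work.close();
        }
        return work.log;
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("John", "1 Main St"));
        System.out.println(handleRequest("John", ""));
    }
}
```

The point of the design is that neither DAO begins or commits anything; a failure in the second DAO rolls back the work of the first one as well.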

If you use EJBs, you are in luck! You simply wire EntityManager and declare a DAO method transactional using the @TransactionAttribute annotation, and the EJB container will take care of the rest for you. We will demonstrate this, and another elegant solution using Spring, in Chapter 9, EJB and Spring Context.

Session per conversation

This pattern is used when the business transaction spans over multiple units of persistence work, and the business data is exchanged over multiple consecutive requests with allowed think time in between.

In a sense, the DAO orchestration that we discussed earlier implements this pattern. However, in that case, everything occurred in one client request (one thread of execution): the session was opened, the transaction started, DAO methods were called, the session was flushed and cleared, the transaction was committed, the response was sent to the client, and the thread ended. This is not considered a long-running conversation.

When implementing session per conversation, as the name indicates, the session scope goes beyond a single thread and a single database transaction. This is why a managed session context is best for this pattern. You can control flush behavior so that synchronization doesn't occur until you are ready to perform it.

In order to understand how this works, we need an in-depth understanding of entity lifecycle and transactions. There are various ways of implementing this pattern, and we will cover these later in this book.

Session per operation

This is considered an anti-pattern. It's true that the instantiation of the session object is not expensive, but managing the persistence context and allocating or obtaining connection and transaction resources is expensive. If your business transaction consists of multiple persistence operations, you need to ensure that they all happen within the scope of one session. Try not to call multiple DAO methods, each of which creates its own session and starts and commits its own transaction. If you are being forced to do this, perhaps it's time to refactor.

Stateless session

There is another type of session that is supported by Hibernate: the stateless session. It is called stateless because there is no persistence context and all entities are considered detached. Additionally, the following applies:

No automatic dirty checking. This means that you have to call session.update() before closing the session; otherwise, no update statement will be executed.
No delayed DML (also known as write-behind). Every save or update will be executed right away. (Refer to the earlier discussion on action queues.)
No cascade operation. You have to handle the associated entities.
No proxy object; hence, no lazy fetching.
No event notification or interceptors.

You should think of stateless sessions as direct calls to JDBC because this is essentially what occurs behind the scenes.

One good reason to use stateless sessions is to perform bulk operations. The memory footprint will be far less and, in some cases, it performs better.

Entity

A Hibernate entity is, typically, a sophisticated Plain Old Java Object (POJO). It is sophisticated because it represents a business model whose data is assumed to be persistent. It is usually decorated with various annotations, which enable additional characteristics, among other things; alternatively, it is configured using an hbm Hibernate mapping XML file. When an entity contains other entities, or a collection of other entities, this implies a database association for which you have to declare the proper mapping configuration to define the relationship type.

An entity can also embed other POJOs that are not entities. In such cases, the embedded objects are considered value objects. They have no identity and have little business significance on their own. (We will discuss this further when we talk about the @Embedded and @Embeddable annotations in Chapter 2, Advanced Mapping.)

Entity lifecycle

You should already be familiar with entity lifecycle. However, here is a different perspective of the different phases of the lifecycle.

Before discussing the lifecycle of an entity, it is important to not think of an entity as a POJO. Instead, if you keep reminding yourself that an entity is the persistent model of business data, you will easily understand the lifecycle.

The lifecycle begins when you instantiate an entity class. At this point, the entity has no presence in the persistence context; therefore, no data has been inserted in the database and no unique ID is assigned to the new entity. At this phase of the lifecycle, the entity is said to be in the Transient state.

Once you save your new entity by calling session.save(), your entity is now in the Persistent state, because at this point the session is managing it.

What happens to the entities after the session is closed? In this case, your entity has no presence in the persistence context, but it has a presence in the database. This state is called Detached.

There is another state, which is rarely mentioned, and this is the Deleted state. When you call session.delete() on an entity, Hibernate fires off a Delete event and internally sets the entity state to DELETED. As long as the session is open, you can still undelete the entity by calling session.persist().
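The four states and the transitions mentioned so far can be summarized as a small state machine. This is a simplified model in plain Java, assuming only the transitions described in this section; the `EntityLifecycleSketch` class is hypothetical, and reattaching a detached entity is not covered.

```java
final class EntityLifecycleSketch {
    enum State { TRANSIENT, PERSISTENT, DETACHED, DELETED }

    // A freshly instantiated entity is transient.
    private State state = State.TRANSIENT;

    State state() { return state; }

    // Models session.save(): the session starts managing the entity.
    void save() {
        requireState(State.TRANSIENT);
        state = State.PERSISTENT;
    }

    // Models session.delete(): the entity is scheduled for deletion.
    void delete() {
        requireState(State.PERSISTENT);
        state = State.DELETED;
    }

    // Models session.persist(): makes a transient entity persistent,
    // or undeletes a deleted one while the session is still open.
    void persist() {
        if (state != State.TRANSIENT && state != State.DELETED) {
            throw new IllegalStateException("cannot persist from " + state);
        }
        state = State.PERSISTENT;
    }

    // Models closing the session: managed entities become detached.
    void sessionClosed() {
        if (state == State.PERSISTENT) {
            state = State.DETACHED;
        }
    }

    private void requireState(State expected) {
        if (state != expected) {
            throw new IllegalStateException(
                "expected " + expected + " but was " + state);
        }
    }

    public static void main(String[] args) {
        EntityLifecycleSketch entity = new EntityLifecycleSketch();
        entity.save();
        entity.delete();
        entity.persist();       // undelete
        entity.sessionClosed();
        System.out.println(entity.state()); // DETACHED
    }
}
```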

There are certain lifecycle events that change the entity state, and those are well documented.

Types of entities

As mentioned earlier, you can declare a POJO class as your persistent class. There is another type of entity in Hibernate that is rarely used and perhaps not widely known: the map. This is known as a dynamic entity. You can use any implementation of the java.util.Map interface as a Hibernate entity. This is useful to implement a dynamic business model, which is great for the creation of a quick prototype. Ultimately, you are best off with POJO entities. If you need to implement a dynamic entity, first you can set the default entity mode to MAP:

Configuration configuration = new Configuration()
  .configure()
  .setProperty(Environment.DEFAULT_ENTITY_MODE,
    EntityMode.MAP.toString());

Then, add a new mapping configuration. Hibernate uses the property name as a map key to get the value, for example, <property name="firstname" …/>. So, if your map contains other entries that are not included in the mapping, they will be ignored by Hibernate:

<hibernate-mapping>
  <class entity-name="DynamicEntity">
    <id name="id" type="long" column="MAP_ID">
      <generator class="sequence" />
    </id>
    <property name="firstname" type="string" column="FIRSTNAME" />
    <property name="lastname" type="string" column="LASTNAME" />
  </class>
</hibernate-mapping>

Make sure that you add the new map to your Hibernate configuration. Now, you can use this as an entity. Note that when you call session.save(), you are passing the name of the entity as the first argument:

    // Values are typed as Object because the map also holds
    // non-string values, such as the generated id
    Map<String, Object> myMap = new HashMap<String, Object>();
    myMap.put("firstname", "John");
    myMap.put("lastname", "Smith");

    Session session = HibernateUtil
      .getSessionFactory()
      .getCurrentSession();
    Transaction transaction = session.beginTransaction();

    try {
      session.save("DynamicEntity", myMap); // notice entity name
      transaction.commit();
    }
    catch (Exception e) {
      transaction.rollback();
      // log error
    }
    finally {
      if (session.isOpen())
        session.close();
    }

Note

This used to be different in version 3.6. You didn't need to set the default entity mode on the configuration. The Session interface provided an API, which would return another session that supported dynamic entity. This was session.getSession(EntityMode.MAP), and this returned a new session, which inherited the JDBC connection and the transaction. However, this was removed in Hibernate 4.

Identity crisis

Each entity class has a property that is marked as the unique identifier of that entity. This could be a primitive data type or another Java class, which would represent a composite ID. (You'll see this in Chapter 2, Advanced Mapping, when we talk about mapping.) For now, having an ID is still optional, but in future releases of Hibernate, this will no longer be the case and every entity class will be required to have an ID attribute.

Furthermore, as Hibernate is responsible for generating and setting the ID, you should always protect it by making sure that the setter method for the ID is private so that you don't accidentally set the ID in your code. Hibernate can access private fields using reflection. Hibernate also requires entities to have a no-arg constructor.

In some cases, you have to override the equals() and hashCode() methods, especially if you are keeping the detached objects around and want to reattach them or need to add them to a set. This is because outside of the persistence context the Java equality may fail even though you are comparing two entity instances that represent the same row. The default implementation of the equals() method only checks whether the two instances are the same reference.

If you are sure that both objects have an ID assigned, then, in addition to reference equality check, you can compare their identifiers for equality. If you can't rely on the ID property but an entity can be uniquely identified by a business key, such as user ID or social security number, then you can compare the business keys.
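As a minimal sketch of the business-key approach, the following plain Java class compares instances by a social security number only. The `BusinessKeyPerson` class is hypothetical and carries no mapping annotations; it assumes that the key is unique and never changes after construction.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical class; equality is based on the business key only.
class BusinessKeyPerson {
    private final String ssn;   // business key, assumed unique and immutable
    private String lastname;    // ordinary property, ignored by equals()

    BusinessKeyPerson(String ssn, String lastname) {
        this.ssn = ssn;
        this.lastname = lastname;
    }

    String getLastname() { return lastname; }
    void setLastname(String lastname) { this.lastname = lastname; }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true; // reference equality check first
        }
        if (!(other instanceof BusinessKeyPerson)) {
            return false;
        }
        return Objects.equals(ssn, ((BusinessKeyPerson) other).ssn);
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals(): derived from the same key.
        return Objects.hashCode(ssn);
    }

    public static void main(String[] args) {
        BusinessKeyPerson first = new BusinessKeyPerson("123-45-6789", "Williams");
        BusinessKeyPerson second = new BusinessKeyPerson("123-45-6789", "Williams");
        second.setLastname("Smith"); // modifying a copy does not break equality

        Set<BusinessKeyPerson> people = new HashSet<>();
        people.add(first);
        people.add(second);
        System.out.println(first.equals(second)); // true
        System.out.println(people.size());        // 1
    }
}
```

Because both methods depend only on a value that does not change, an instance keeps its place in a hash-based set even after its other properties are modified.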

It's not a good idea to compare all properties for equality check. There are several reasons, as follows:

First, if you keep a detached object around and then retrieve it again in another session, Hibernate can't tell that they are the same entities if you modify one of them.
Second, you may have a long list of properties and your code will look messy, and if you add a new property, you may forget to modify the equals() method.
Finally, this approach will lead to equal entities if you have multiple database rows with the same values, and they should be treated as different entities because they represent different rows. (For example, if two people live at the same address and one person moves, you may accidentally change the address for both.)

Beyond JPA

When you decorate your class with annotations, you are empowering your objects with additional features. There are certain features that are provided by Hibernate, which are not available in JPA.

Most of these features are provided through Hibernate annotations, which are packaged separately. Some annotations affect the behavior of your entity, and some are there to make mapping easier and more powerful.

The behavior modifying annotations that are worth noting here are as follows:

@Immutable: This makes an entity immutable.
@SelectBeforeUpdate: This is great for reducing unnecessary updates to the database and reducing contention. However, it does make an extra call through JDBC.
@BatchSize: This can be used to control how many lazily loaded collections or entity proxies are initialized in a single fetch.
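@DynamicInsert and @DynamicUpdate: These make Hibernate generate the SQL statement at runtime. With @DynamicInsert, null properties are left out of the insert statement; with @DynamicUpdate, only the modified properties are included in the update statement.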
@OptimisticLocking: This is used to define the type of optimistic lock.
@Fetch: This can be used to set the FetchMode. You can instruct Hibernate to use JOIN, SELECT, or SUBSELECT. JOIN uses an outer join to load the related entities, SELECT issues individual SQL select statements, and SUBSELECT loads the related collections with a single additional query that uses a subselect. (This is different from JPA's FetchType, which is used to decide whether to perform a lazy fetch or not.)
@Filter: This is used to limit the entities that are returned.

It's worth exploring both the org.hibernate.annotations JavaDocs and the JPA annotations. We will cover Hibernate annotations (beyond JPA) that modify mappings and associations in the next chapter, when we discuss mappings. We will also return to @Fetch and @Filter when we discuss Fetching in Chapter 4, Advanced Fetching.

Proxy objects

If you are familiar with Java reflection, you have heard of the java.lang.reflect.Proxy class. In a nutshell, you can wrap any object with a proxy and intercept calls to methods of that object using an invocation handler. Many Java frameworks use proxy objects, or manipulate bytecode (also called instrumentation) to modify the behavior of an object. Hibernate uses both ways for different purposes. More importantly, Hibernate implements its own set of wrapper classes for collection types. (Refer to classes in org.hibernate.collection.internal.)
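As a quick refresher, the following minimal sketch wraps an object with a JDK dynamic proxy and records every method call made through it. The Greeter interface and the PlainGreeter and ProxyDemo classes are hypothetical names used only for this illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// JDK dynamic proxies can only implement interfaces.
interface Greeter {
  String greet(String name);
}

class PlainGreeter implements Greeter {
  public String greet(String name) {
    return "Hello, " + name;
  }
}

class ProxyDemo {
  // Wraps the target and records the name of every method called on it.
  static Greeter wrap(Greeter target, List<String> callLog) {
    InvocationHandler handler = (proxy, method, methodArgs) -> {
      callLog.add(method.getName());            // intercept the call
      return method.invoke(target, methodArgs); // then delegate to the target
    };
    return (Greeter) Proxy.newProxyInstance(
        Greeter.class.getClassLoader(),
        new Class<?>[] { Greeter.class },
        handler);
  }

  public static void main(String[] args) {
    List<String> callLog = new ArrayList<>();
    Greeter greeter = wrap(new PlainGreeter(), callLog);
    System.out.println(greeter.greet("Hibernate"));
    System.out.println(callLog);
  }
}
```

The caller only sees the Greeter interface and cannot tell that an invocation handler sits between it and the real object.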

If you fetch an entity, Hibernate doesn't fetch the associated collections if they are marked as lazy fetch. Instead, it waits until you actually try to access the associated collection. As soon as you access an entity in the associated collection, Hibernate will fetch the associated entities from the persistence store and will then populate the wrapped collection for you to access. Hibernate accomplishes this using the internal collection wrappers. You can actually examine this yourself by writing a simple check, as follows:

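Parent parent = (Parent) session.get(Parent.class, 1L);
Set<Child> children = parent.getChildren();
if (children instanceof PersistentSet) {
  // The runtime type is Hibernate's wrapper, which implements java.util.Set
  System.out.println("**** Hibernate PersistentSet, not java.util.HashSet");
}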

// PersistentSet is located in org.hibernate.collection.internal

Hibernate uses byte code manipulation techniques to initialize an object that is uninitialized. This usually occurs when your entity has an associated entity, for example, a Person entity that is associated with an Address entity. When the root entity, in this case Person, is loaded from the database, the Address object is not initialized in case of LAZY loading. In such cases, Hibernate returns a manipulated version of the associated entity, and as soon as you try to access any of the attributes of the associated entity, for example, address.getStreet(), Hibernate will hit the database to fetch the values for the associated entity and initialize it.

Hibernate also returns a proxy object when you ask for an entity using the load method instead of the get method of the Session class.

Note

The byte code manipulation is achieved using the Javassist library.
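The idea of deferring the database hit until the first access can be sketched with a JDK dynamic proxy. Keep in mind that this only illustrates the concept: Hibernate generates its entity proxies through bytecode manipulation rather than java.lang.reflect.Proxy, and the Address interface, the LoadedAddress class, the LazyAddressDemo class, and the loader shown here are all hypothetical. The sketch is also not thread-safe.

```java
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

interface Address {
  String getStreet();
}

class LoadedAddress implements Address {
  private final String street;

  LoadedAddress(String street) {
    this.street = street;
  }

  public String getStreet() {
    return street;
  }
}

class LazyAddressDemo {
  // Returns a stand-in that calls the loader only on the first method call.
  static Address lazy(Supplier<Address> loader) {
    Address[] holder = new Address[1];
    return (Address) Proxy.newProxyInstance(
        Address.class.getClassLoader(),
        new Class<?>[] { Address.class },
        (proxy, method, methodArgs) -> {
          if (holder[0] == null) {
            holder[0] = loader.get(); // stands in for the database hit
          }
          return method.invoke(holder[0], methodArgs);
        });
  }

  public static void main(String[] args) {
    AtomicInteger loads = new AtomicInteger();
    Address address = lazy(() -> {
      loads.incrementAndGet();
      return new LoadedAddress("1 Main Street");
    });
    System.out.println("Loads before access: " + loads.get());
    System.out.println(address.getStreet());
    System.out.println("Loads after access: " + loads.get());
  }
}
```

Creating the stand-in costs nothing; the loader runs once, on the first call to getStreet(), and later calls reuse the loaded instance.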

When working with Hibernate, it is important that you keep in mind how Hibernate uses proxy objects.

Batch processing

When you interact with the session by saving entities or fetching them from the database, Hibernate keeps them around in the persistence context until the session is closed, or until you evict the object or clear the session. This is Hibernate's first-level cache.

Care must be taken when executing queries that load many objects or when trying to save a large set of entities. If you don't perform some cleanup work, your JVM will run out of memory in the middle of the work unit.

There are certain things you can do to avoid such situations. Some of them are manual work, and others are managed by Hibernate if you provide enough hints or if you use the right session type.

Note

How does Hibernate know whether it should call JDBC executeBatch? This decision is made in the entity persister, which is responsible for persisting an entity via JDBC. Hibernate keeps track of all the DML statements for each entity type, and when the statement count is more than 1 for a particular entity type, it will use batch execution.

Manual batch management

One way to ensure that batch processing is under control is by manually managing the population of the first-level cache, that is, the persistence context. If you are saving or updating a batch of entities, you can occasionally flush and clear the session. Flushing the session will execute all the pending SQL statements, and when you clear the session, all the entities are evicted from the persistence context.

You can do this by forcing a flush and clear, as follows:

  public void saveStudents(List<Map<String, String>> students) {
    final int batchSize = 15;
    Session session = HibernateUtil
        .getSessionFactory()
        .getCurrentSession();
    Transaction transaction = session.beginTransaction();
    try {
      int i = 0;
      for (Map<String, String> studentMap:students) {
        i++;
        Student student = new Student();
        student.setFirstname(studentMap.get("firstname"));
        student.setLastname(studentMap.get("lastname"));
        session.save(student);
        
        if (i % batchSize == 0) {
          session.flush();
          session.clear();
        }
      }
      transaction.commit();
    }
    catch (Exception e) {
      transaction.rollback();
      // log stack trace
    }
    finally {
      if (session.isOpen())
        session.close();
    }
  }

You should use the same mechanism when you are fetching entities. There is a slight performance hit when you flush and clear the session. However, this is not significant. Your JDBC connection is still open, and the transaction is still active, and these are the expensive resources whose lifecycle you need to be concerned with in your design. (Refer to the earlier discussion on contextual session.)

Setting batch size

When you flush the session, you are essentially submitting the appropriate SQL statements to the JDBC layer. In the JDBC world, you can either execute a single statement, or you can batch statements and when ready, execute the batch (refer to the java.sql.Statement.addBatch(…) and executeBatch() methods).

There is no batch size in JDBC, but Hibernate uses the hibernate.jdbc.batch_size property to control how many statements go into a batch. Setting this property doesn't mean that you no longer have to worry about memory exhaustion; you still have to manually manage the persistence context for a large batch. It only determines how many times Hibernate calls addBatch(…) before calling executeBatch() when it decides that it can batch DML statements.
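At the JDBC level, the addBatch and executeBatch pattern can be sketched as follows. This is not Hibernate's internal code; the JdbcBatchSketch class and its insertNames method are hypothetical, the batchSize parameter plays the role of the batch size setting, and the caller is assumed to supply a PreparedStatement for an insert with a single string parameter.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class JdbcBatchSketch {
  // Adds one row per name and executes the batch every batchSize rows.
  // Returns how many times executeBatch() was called.
  static int insertNames(PreparedStatement statement, List<String> names,
      int batchSize) throws SQLException {
    int executions = 0;
    int pending = 0;
    for (String name : names) {
      statement.setString(1, name);
      statement.addBatch(); // queue the statement, nothing is sent yet
      pending++;
      if (pending == batchSize) {
        statement.executeBatch(); // send the queued statements together
        executions++;
        pending = 0;
      }
    }
    if (pending > 0) {
      statement.executeBatch(); // send the final, partially filled batch
      executions++;
    }
    return executions;
  }
}
```

With seven names and a batch size of three, addBatch() is called seven times and executeBatch() three times, which is the kind of grouping the batch size controls.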

There is another batch size setting, which comes in the form of annotation, and this is @BatchSize, which is used to decorate an entity class. This setting is not for batch inserts or updates; this is used at fetch time for collections and entities when they are loaded lazily.

Using stateless session

The stateless session was introduced earlier. As there is no persistence context in a stateless session, you don't need to flush or clear the session. All the changes to your entities are reflected immediately in the database as there is no delayed write. Remember, there is no cascade operation on associated entities, and the associated collections are ignored. So, you have to manually manage all the entities.