Search This Blog

Saturday, 2 June 2012

Hibernate And Batch Operations

There are often scenarios in our application wherein we need to perform batch updates. For example if there was a need to update the zip code of all records in Address table a batch update would be a very good way to achieve the same. A single update SQL query would ensure that the job is done at the database level.
However there arise complications in big applications wherein the update operation would be much more complex then a singe update query or the code complexity better handled in java then in a stored procedure.
The Hibernate equivalent would involve
  1. opening a session, 
  2. executing a select query to get the records to update.
  3. On each object, run the update logic. 
  4. Then schedule the affected object for a save. 
  5. Once all entities are modified we commit our transaction thus flushing all modifications to the database.
The problem with the approach is if the objects run into hundreds and thousands and even bigger values, loading such object graphs into the session might at some point result in an Out Of Memory Error.
Hibernate provides an alternative to handle this situation ( or else I wouldn't be writing this post :p ) Here too we execute the query to identify the affected objects. However instead of loading all of them into the session, Hibernate provides us with a reference to a database cursor. Instead of retrieving all objects in the result-set we retrieve one at a time. Consider the below code.
static void testBatchUpdate() {
    final int BATCH_SIZE = 5;
    final String query = "from Entity where id < :id";
    Session session = sessionFactory.openSession();
    Transaction transaction = null;
    ScrollableResults allEntities = null;
    try {
        transaction = session.beginTransaction();
        allEntities = session.createQuery(query).setLong("id", 7).scroll();
        int updateCount = 0;
        while ( {//Advance to the next result
            final Entity entity = (Entity) allEntities.get(0);
            System.out.println("Updating entity with id  - "
                    + entity.getId());
            entity.setName("Updated new Name " + entity.getId());
            if (++updateCount % BATCH_SIZE == 0) {
                System.out.println(" Total records to be flushed as yet : "
                        + updateCount);
                System.out.println("performing db session flush and clear");
        System.out.println("Total queries fired : " + updateCount);
    } catch (Exception e) {
The hero of the story is the ScrollableResults class. The class internally is holding the reference to an open cursor in the database. The HQL query was executed using the Query.scroll() method. Unlike the list() method which fetches the entire result the scroll method returns a ScrollableResults object. 
Every time we call the get method Hibernate retrieves a single entity and adds it to the session (also the persistence cache) making it available to us. To ensure that no memory issues occur we periodically flush and clear the session (in this case after every 5 reads) The logs indicate the story
2750 [main] DEBUG org.hibernate.SQL  - 
        entity0_.ID as ID0_,
        entity0_.NAME as NAME0_ 
        ENTITY entity0_ 
2890 [main] DEBUG org.hibernate.loader.Loader  - Initializing object from Result
Set: [com.batch.scrollable.Entity#1]
2922 [main] DEBUG org.hibernate.loader.Loader  - Initializing object from Result
Set: [com.batch.scrollable.Entity#2]
2953 [main] DEBUG org.hibernate.loader.Loader  - Initializing object from Result
Set: [com.batch.scrollable.Entity#3]
2953 [main] DEBUG org.hibernate.loader.Loader  - Initializing object from Result
Set: [com.batch.scrollable.Entity#4]
2953 [main] DEBUG org.hibernate.loader.Loader  - Initializing object from Result
Set: [com.batch.scrollable.Entity#5]
 Total records to be flushed as yet : 5
performing db session flush and clear
2953 [main] DEBUG org.hibernate.event.def.AbstractFlushingEventListener  - flush
ing session
2968 [main] DEBUG org.hibernate.event.def.AbstractFlushingEventListener  - Flush
ed: 0 insertions, 5 updates, 0 deletions to 5 objects
2984 [main] DEBUG org.hibernate.SQL  - 
3000 [main] DEBUG org.hibernate.event.def.DefaultFlushEntityEventListener  - Upd
ating entity: [com.batch.scrollable.Entity#6]
An alternative to using ScrollableResults would be to go for a Stateless session.

1 comment: