Search This Blog

Saturday, 28 July 2012

The N +1 Select Problem in Hibernate

Consider our simple Book and Shelf relation. Every shelf has a collection of Books. As we have not applied any optimization strategies, when we request for a Shelf object from the database, a Shelf proxy is loaded. Then when we access a scalar property (like code) the Shelf scalar properties get loaded using a select query. The Book set is represented by a proxy. Now if we try to access the collection a second fetch is needed.
Consider the below code
public static void testFetchSelect() {        
    Session session = sessionFactory.openSession();
    Transaction transaction = session.beginTransaction();
    
    Shelf shelf1 = (Shelf) session.load(Shelf.class, 1);
    System.out.println("The Shelf class is " + shelf1.getClass());
    System.out.println("The shelf1 code is " + shelf1.getCode());
    System.out.println("The Collection class is " + shelf1.getAllBooks().getClass());
    System.out.println("The number of books in shelf1 is " + shelf1.getAllBooks().size());
    transaction.commit();        
    session.close();
}
The above code loads a Shelf entity and displays the number of books on the shelf.
The output logs are as below:
The Shelf class is com.collection.smart.Shelf$$EnhancerByCGLIB$$c94b6717
Hibernate: 
    /* load com.collection.smart.Shelf */ 
    select
        shelf0_.ID as ID1_0_,
        shelf0_.CODE as CODE1_0_ 
    from
        SHELF shelf0_ 
    where
        shelf0_.ID=?
The shelf1 code is SH001
The Collection class is org.hibernate.collection.PersistentSet
Hibernate: 
    /* load one-to-many com.collection.smart.Shelf.allBooks */ 
    select
        allbooks0_.SHELF_ID as SHELF3_1_,
        allbooks0_.ID as ID1_,
        allbooks0_.ID as ID0_0_,
        allbooks0_.Name as Name0_0_,
        allbooks0_.shelf_id as shelf3_0_0_ 
    from
        BOOK allbooks0_ 
    where
        allbooks0_.SHELF_ID=?
The number of books in shelf1 is 2
As can be seen initially a Book proxy was loaded. On trying to display the code via a getter, the first select query was fired.
The Collection is not loaded yet. it is represented in memory by a Hibernate class PersistentSet. On trying to read the book count, a second select query if fired which now loads all the books in the collection thereby initializing it.
Thus a total of 2 queries were fired. One to load the Entity and the second to load the collection.
Now consider the below code.
public static void testNPlus1Select() {
    Session session = sessionFactory.openSession();
    Transaction transaction = session.beginTransaction();
    @SuppressWarnings("unchecked")
    List<Shelf> shelfs = session.createQuery("from Shelf shelf where shelf.id = 1 
              or shelf.id = 2").list();
        
    for (Shelf shelf : shelfs) {
        Set<Book> books = shelf.getAllBooks();
        System.out.println("shelf class is " + shelf.getClass());
        System.out.println("Total books is " + books.size());
    }
    transaction.commit();
    session.close();
}
The code executes a query that will return a list of shelves. Two shelves at most. Each Shelf internally has a Collection proxy. If we access the collection, than a second query will be fired to return the set of books associated with the shelf entity. The logs indicate the same:
Testing Query select
Hibernate: 
    /* 
from
    Shelf shelf 
where
    shelf.id = 1 
    or shelf.id = 2 */ 
    select
        shelf0_.ID as ID1_,
        shelf0_.CODE as CODE1_ 
    from
        SHELF shelf0_ 
    where
        shelf0_.ID=1 
        or shelf0_.ID=2
shelf class is class com.collection.smart.Shelf
Hibernate: 
    /* load one-to-many com.collection.smart.Shelf.allBooks */ 
    select
        allbooks0_.SHELF_ID as SHELF3_1_,
        allbooks0_.ID as ID1_,
        allbooks0_.ID as ID0_0_,
        allbooks0_.Name as Name0_0_,
        allbooks0_.shelf_id as shelf3_0_0_ 
    from
        BOOK allbooks0_ 
    where
        allbooks0_.SHELF_ID=?
Total books is 2
shelf class is class com.collection.smart.Shelf
Hibernate: 
    /* load one-to-many com.collection.smart.Shelf.allBooks */ 
    select
        allbooks0_.SHELF_ID as SHELF3_1_,
        allbooks0_.ID as ID1_,
        allbooks0_.ID as ID0_0_,
        allbooks0_.Name as Name0_0_,
        allbooks0_.shelf_id as shelf3_0_0_ 
    from
        BOOK allbooks0_ 
    where
        allbooks0_.SHELF_ID=?
Total books is 2
As can be seen a total of 3 queries were fired. The first query loaded 2 shelves. Than for each shelf one additional select query was fired. Thus for n shelves, n select book queries would be fired. These n queries + the initial selection query is known as the N+1 query problem.
In upcoming posts we shall look at the alternatives available in entity configuration to avoid this problem.

1 comment:

  1. Hello there, good post.
    Is there a follow on post to this as mentioned in this article ?

    ReplyDelete