In our previous example we saw how a default fetch strategy(fetch = "select") could result in N+1 selection problem for Hibernate Associations. One way to solve this problem would be to use batch-size property
In this case Hibernate works on initializing Book sets for multiple shelves, when the set for a single Shelf is to fetched. Thus in a single book select, multiple collections are initialized. The hbm files for Shelf is now as below:
A total of 4 shelves were loaded by the HQL query. When we tried to access the set for the first book, Hibernate pre-fetched the collection for the first three shelves. It then executed only one more select clause to load the last collection. Thus the number of secondary selects fired has been reduced by the batch-size. Instead of N+1 select queries we now have only (N/batch-size) +1 queries.
If the batch-size was 5 then the books select query for the above example would be
Query 1 : 5 records (for shelfs with id 1-5)
Query 2 : 5 records (for shelfs with id 6-10)
Query 3 : 5 records (for shelfs with id 11-15)
Query 4 : 3 records (for shelfs with id 16-18).
4 secondary queries instead of 18 secondary queries
This does not imply that Hibernate can create any number of loaders. E.g for a batch size of 20 and 119 records the assumption would be
Batch fetching functionality is not just limited to collections. You can also use it for associations. Consider that you fire a query to load a list of books. Now if were to access the Shelf reference for each book, it wold result in n queries for n books. or the N+1 Problem.
However if you add the following attribute in your Shelf.hbm file:
In the next post we shall attempt a two query solution for the whole N+1 problem.
In this case Hibernate works on initializing Book sets for multiple shelves, when the set for a single Shelf is to fetched. Thus in a single book select, multiple collections are initialized. The hbm files for Shelf is now as below:
<?xml version="1.0"?> <!DOCTYPE hibernate-mapping PUBLIC "-//Hibernate/Hibernate Mapping DTD 3.0//EN" "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd"> <hibernate-mapping package="com.collection.smart"> <class name="Shelf" table="SHELF"> <id name="id" type="integer"> <column name="ID" /> <generator class="native" /> </id> <property name="code" type="string"> <column name="CODE" length="50" not-null="true" /> </property> <set name="allBooks" cascade="all-delete-orphan"
inverse="true" batch-size="3" > <key column="SHELF_ID" not-null="true" /> <one-to-many class="Book" /> </set> </class> </hibernate-mapping>Observe the set element. The fetching property is still unchanged. It is the default select value. However the batch-size has been set to three here. Now if were to execute a select query against The Shelf class:
public static void testBatchedLoad() { Session session = sessionFactory.openSession(); Transaction transaction = session.beginTransaction(); System.out.println("Testing Query select"); session = sessionFactory.openSession(); transaction = session.beginTransaction(); @SuppressWarnings("unchecked") List<Shelf> shelfs = session.createQuery("from Shelf shelf
where shelf.id <= 5 ").list(); //will return four books from database for (Shelf shelf : shelfs) { System.out.println("shelf class is " + shelf.getClass()); Set<Book> books = shelf.getAllBooks(); System.out.println("The Collection class is " + shelf.getAllBooks().getClass()); System.out.println("Total books is " + books.size()); } transaction.commit(); session.close(); }The HQL query loads a list of Shelf entities and then displays the number of books on each shelf.
Testing Query select Hibernate: /* from Shelf shelf where shelf.id <= 5 */
select
shelf0_.ID as ID1_,
shelf0_.CODE as CODE1_
from
SHELF shelf0_
where
shelf0_.ID<=5
shelf class is class com.collection.smart.Shelf
The Collection class is class org.hibernate.collection.PersistentSet
Hibernate:
/* load one-to-many com.collection.smart.Shelf.allBooks */
select allbooks0_.SHELF_ID as SHELF3_1_, allbooks0_.ID as ID1_, allbooks0_.ID as ID0_0_, allbooks0_.Name as Name0_0_, allbooks0_.shelf_id as shelf3_0_0_ from BOOK allbooks0_ where allbooks0_.SHELF_ID in ( ?, ?, ? ) Total books is 2 shelf class is class com.collection.smart.Shelf The Collection class is class org.hibernate.collection.PersistentSet Total books is 1 shelf class is class com.collection.smart.Shelf The Collection class is class org.hibernate.collection.PersistentSet Hibernate: /* load one-to-many com.collection.smart.Shelf.allBooks */
select
allbooks0_.SHELF_ID as SHELF3_1_,
allbooks0_.ID as ID1_,
allbooks0_.ID as ID0_0_,
allbooks0_.Name as Name0_0_,
allbooks0_.shelf_id as shelf3_0_0_
from
BOOK allbooks0_
where
allbooks0_.SHELF_ID=?
Total books is 1
shelf class is class com.collection.smart.Shelf
The Collection class is class org.hibernate.collection.PersistentSet
Total books is 1
As can be seen above the select query executed to load the Book collection object is an IN query. It is designed to take three ids as we have a batch-size of 3. A total of 4 shelves were loaded by the HQL query. When we tried to access the set for the first book, Hibernate pre-fetched the collection for the first three shelves. It then executed only one more select clause to load the last collection. Thus the number of secondary selects fired has been reduced by the batch-size. Instead of N+1 select queries we now have only (N/batch-size) +1 queries.
If the batch-size was 5 then the books select query for the above example would be
Testing Query select /* load one-to-many com.collection.smart.Shelf.allBooks */
select allbooks0_.SHELF_ID as SHELF3_1_, allbooks0_.ID as ID1_, allbooks0_.ID as ID0_0_, allbooks0_.Name as Name0_0_, allbooks0_.shelf_id as shelf3_0_0_ from BOOK allbooks0_ where allbooks0_.SHELF_ID in ( ?, ?, ?, ? )As can be seen Hibernate will fire only one query replacing the four question marks with the ids of the four shelf objects.Thus if we had 18 Shelf records in the result-set then the collections would be pre-fetched as below: (batch-size = 5)
Query 1 : 5 records (for shelfs with id 1-5)
Query 2 : 5 records (for shelfs with id 6-10)
Query 3 : 5 records (for shelfs with id 11-15)
Query 4 : 3 records (for shelfs with id 16-18).
4 secondary queries instead of 18 secondary queries
This does not imply that Hibernate can create any number of loaders. E.g for a batch size of 20 and 119 records the assumption would be
- 5 queries for fetching 20 records each ( 1 to 100)
- 1 query for fetching 19 records ( 101 to 119)
- 5 queries for fetching 20 records each ( 1 to 100)
- 1 query for fetching 10 records (101 to 110)
- 1 query for fetching last 9 records (110 to 119)
Batch fetching functionality is not just limited to collections. You can also use it for associations. Consider that you fire a query to load a list of books. Now if were to access the Shelf reference for each book, it wold result in n queries for n books. or the N+1 Problem.
However if you add the following attribute in your Shelf.hbm file:
<class name="Shelf" table="SHELF" batch-size="5">Now when we iterate over the set the Shelf records will be loaded by Hibernate in batches of 6.
In the next post we shall attempt a two query solution for the whole N+1 problem.
Not too bad, hey.
ReplyDelete