Search This Blog

Loading...

Sunday, 25 November 2012

Replicating Records Across Tables

There could be a scenario where we need that records from one database to be copied to another database.
Maybe we have an application and we would like to copy its data into another database for analysis. It could be that running export scripts is not an option. Hibernate in this case provides us with the ability to read the data from the table in one database and insert it into another table in the second database. Or to put things better, Hibernate allows us to replicate the objects.
I have two databases - srcDb and targetDb. I need to copy all Entity records from srcDb to targetDb.
The first step was to create two sessionFactories - so that I could access both databases.
hibernate-src-db.cfg.xml 
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE hibernate-configuration PUBLIC
            "-//Hibernate/Hibernate Configuration DTD 3.0//EN"
            "http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">
<hibernate-configuration>
    <session-factory>
        <property name="connection.driver_class">com.mysql.jdbc.Driver</property>
        <property name="dialect">org.hibernate.dialect.MySQLDialect</property>
        <property name="hibernate.connection.username">root</property>
        <property name="hibernate.connection.password">root</property>
        <property name="hibernate.connection.url">jdbc:mysql://localhost:3306/srcDb</property>
        <property name="hibernate.use_identifier_rollback">true</property>
        <property name="hibernate.format_sql">true</property>
        <property name="show_sql">true</property>

        <mapping resource="com/model/Entity.hbm.xml" />
    </session-factory>
</hibernate-configuration>
hibernate-target-db.cfg.xml
The contents of the second configuration file is also same except for the database name:
<property name="hibernate.connection.url">jdbc:mysql://localhost:3306/targetDb</property>
I used the same entity class in both the cfg files.
Now to write the code for replicating Entity records.
static SessionFactory srcSessionFactory, targetSessionFactory;
    
public static void main(String[] args) {
    Configuration configuration = new Configuration().configure("hibernate-src-db.cfg.xml");
    srcSessionFactory = configuration.buildSessionFactory();
    configuration = new Configuration().configure("hibernate-target-db.cfg.xml");
    targetSessionFactory = configuration.buildSessionFactory();
    replicate();
    srcSessionFactory.close();
    targetSessionFactory.close();
}
    
public static void replicate() {
    Session srcSession = srcSessionFactory.openSession();
    Entity entity = (Entity) srcSession.get(Entity.class, 1);
    Session targetSession = targetSessionFactory.openSession();
    Transaction transaction = targetSession.beginTransaction();
    targetSession.replicate(entity, ReplicationMode.OVERWRITE);
    targetSession.flush();
    transaction.commit();
    targetSession.close();
    srcSession.close();
}
The replicate method is where all the action is :
  1. I opened two different sessions to the two different databases.
  2. Using srcSession I read an entity object from srcDb .
  3. I then replicated the same object to the targetDb using the targetSession.
The logs generated is as below:
2421 [main] DEBUG org.hibernate.SQL  - 
    select
        entity0_.id as id0_0_,
        entity0_.DATA as DATA0_0_ 
    from
        Entity entity0_ 
    where
        entity0_.id=?
...
2500 [main] DEBUG org.hibernate.SQL  - 
    select
        id 
    from
        Entity 
    where
        id =?
...
2297 [main] DEBUG org.hibernate.persister.entity.AbstractEntityPersister  - Inse
rting entity: com.model.Entity (native id)
...
2531 [main] DEBUG org.hibernate.SQL  - 
    insert 
    into
        Entity
        (DATA) 
    values
        (?)
As can be seen the data was replicated. But the identifier value was not copied. Instead the native generation strategy was used. This is probably not what you are looking for in replication.
To prevent this we could use the assigned strategy. The insert query immediately changed to :
    insert 
    into
        Entity
        (DATA, id) 
    values
        (?, ?)
What is the select query for fetching an Entity object on the identifier for ?
I think this query was executed on the target database and probably has something to do with the Replication mode. In the example I used ReplicationMode.OVERWRITE. In this ReplicationMode the targetDb will overwrite existing rows when a row with the same identifier already exists in target database. If I run the above code again:
2500 [main] DEBUG org.hibernate.SQL  - 
    select
        id 
    from
        Entity 
    where
        id =?
2516 [main] DEBUG org.hibernate.event.def.DefaultReplicateEventListener  - found
 existing row for [com.model.Entity#1]
2516 [main] DEBUG org.hibernate.event.def.DefaultReplicateEventListener  - repli
cating changes to [com.model.Entity#1]
2563 [main] DEBUG org.hibernate.SQL  - 
    update
        Entity 
    set
        DATA=? 
    where
        id=?
As can be seen Hibernate simply updated the existing record. Other ReplicationModes are
  1. ReplicationMode.IGNORE - In this case the object is ignored if it is found to exists in the target database.
    2390 [main] DEBUG org.hibernate.SQL  - 
        select
            id 
        from
            Entity 
        where
            id =?
    2390 [main] DEBUG org.hibernate.event.def.DefaultReplicateEventListener  - found
     existing row for [com.model.Entity#1]
    2390 [main] DEBUG org.hibernate.event.def.DefaultReplicateEventListener  - no ne
    ed to replicate
    
  2. ReplicationMode.EXCEPTION - In this case, an exception is thrown if the record is found o exists in the target database.
    2328 [main] DEBUG org.hibernate.SQL  - 
        insert 
        into
            Entity
            (DATA, id) 
        values
            (?, ?)
    2375 [main] DEBUG org.hibernate.util.JDBCExceptionReporter  - Could not execute 
    JDBC batch update [insert into Entity (DATA, id) values (?, ?)]
    java.sql.BatchUpdateException: Duplicate entry '1' for key 'PRIMARY'
        at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java
    :1269)
    
  3. ReplicationMode.LATEST_VERSION - In this case,a row already exists, Hibernate will choose the latest version. I think for this to work, version settings is necessary. In my example it results in an update query being fired.
What happens when associations are involved ?
I added a simple association and a collection instance to my Entity class.
public class Entity {
    private Integer id;
    private String data;
    private MasterData masterData;
    private Set<Child> children = new HashSet<Child>();
        //setter getters
}
On executing the code again (with ReplicationMode.OVERWRITE) the generated queries are as below:
2859 [main] DEBUG org.hibernate.SQL  - 
    select
        id 
    from
        Entity 
    where
        id =?
2875 [main] DEBUG org.hibernate.engine.Cascade  - processing cascade ACTION_REPL
ICATE for: com.complex_model.Entity
2875 [main] DEBUG org.hibernate.engine.Cascade  - done processing cascade ACTION
_REPLICATE for: com.complex_model.Entity
Exception in thread "main" org.hibernate.HibernateException: Illegal attempt to 
associate a collection with two open sessions
    at org.hibernate.collection.AbstractPersistentCollection.setCurrentSession(Abst
ractPersistentCollection.java:410)
    at org.hibernate.event.def.OnReplicateVisitor.processCollection(OnReplicateVisi
tor.java:47)
I commented the collection and tried again.The logs are as below:
2782 [main] DEBUG org.hibernate.SQL  - 
    select
        id 
    from
        Entity 
    where
        id =?
3297 [main] DEBUG org.hibernate.event.def.DefaultReplicateEventListener  - no ex
isting row, replicating new instance [com.complex_model.Entity#1]
3328 [main] DEBUG org.hibernate.persister.entity.AbstractEntityPersister  - Inse
rting entity: [com.complex_model.Entity#1]
3328 [main] DEBUG org.hibernate.jdbc.AbstractBatcher  - about to open PreparedSt
atement (open PreparedStatements: 0, globally: 0)
3328 [main] DEBUG org.hibernate.SQL  - 
    insert 
    into
        Entity
        (DATA, masterData, id) 
    values
        (?, ?, ?)
JDBC batch update [insert into Entity (DATA, masterData, id) values (?, ?, ?)]
java.sql.BatchUpdateException: Cannot add or update a child row: a foreign key c
onstraint fails (`targetdb`.`entity`, CONSTRAINT `FK7C02D003B86C8DB9` FOREIGN KE
Y (`masterData`) REFERENCES `masterdata` (`id`))
    at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java
:1269)
As can be seen cascade concept does not work here :) We will first have to replicate the Master table, then the entities and then the Children. Guess now a database dump and export seems much more easier :)

2 comments:

  1. Hi Robin,

    Can you please let me know how to copy a row fron table1 into table2 of same DB and Schema. Both Table1 and Table2 has same Structure.

    Thanks

    ReplyDelete
  2. Hi Shirish,
    If you want to copy record from table 1 into table 2 using Hibernate the first requirement would be that both tables have to be mapped. That means two Java classes. (You cannot map the same Java class to two different tables)
    Once you have done this I see two ways to proceed:
    1->Load a source object, write a copy method or use BeanUtils.copyProperties to copy properties to a new target object and then persist the target object. However of you have a huge data set you might run into Memory issues. In that case use a Stateless session or even better a ScrollableResults
    2-> Take advantage of the Query Interfaces executeUpdate method to directly insert several records in the database.

    If you do now want to map your target table in Hibernate, then you can use use the session.createSQLQuery() method to execute an insert query - as its is SQL, you are working directly with the database
    But then if you can bypass hibernate there are so many more options :)

    ReplyDelete