[Biojava-l] loading multiple records for same organism and persistence in BioSQL

Doug Brown debrown at unity.ncsu.edu
Wed Apr 16 12:01:52 UTC 2008


Greetings,
I am happily climbing the learning curve for Biojava-live, Biojavax,
and BioSQL. I believe that I am using the latest releases, Biojava 1.6
and BioSQL 1.0, in that I have performed the installation within the
past week.

I am attempting to load, via Biojavax, multiple GenBank records for the
same organism (a whole genome's worth of annotations) and to save those
into a BioSQL database via Biojavax's Hibernate persistence mechanism.
Loading a second GenBank file (same organism, different sequence) croaks
with the error: SEVERE: Duplicate entry 'genbankBiosqlRich' for key 2
...... could not insert: [Namespace]. FYI, two sample GenBank records
are CH476760.gb and CH476761.gb; both were obtained directly from GenBank.

I have never used Hibernate, or this style of database abstraction,
before, but I think that I am handling the transaction semantics
properly. Either I am violating unspoken presumptions of the persistence
paradigm or the behavior of RichSequence.IOTools.readGenbankDNA is not
what I expected. I had presumed that this routine would use the
established RichObjectFactory to obtain new or extant objects and then
populate those objects with values from the sequence file. This only
seems to happen when I load multiple sequences from a single file.
Multi-file operations fail dismally.
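
To make the expectation above concrete, here is a toy sketch of the difference between constructing a uniquely-named object afresh for each file and asking a store for the existing one. Everything in it (NamespaceLookupSketch, FakeStore, getOrCreate) is a hypothetical stand-in and not a BioJava or Hibernate class; it only models a unique key on a name column, under the assumption that an object which already carries an id is not inserted again.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: hypothetical stand-ins, not BioJava or Hibernate classes.
class NamespaceLookupSketch {

    /** Stand-in for a persistent object whose name must be unique in the database. */
    static final class Namespace {
        final String name;
        Integer id; // null while the object is transient (never inserted)

        Namespace(String name) { this.name = name; }
    }

    /** Stand-in for a table with a unique key on the name column. */
    static final class FakeStore {
        private final Map<String, Namespace> rowsByName = new HashMap<>();
        private int nextId = 1;

        /** Inserts transient objects; objects that already carry an id are left alone. */
        void saveOrUpdate(Namespace ns) {
            if (ns.id != null) {
                return; // already persistent, so no insert is attempted
            }
            if (rowsByName.containsKey(ns.name)) {
                throw new IllegalStateException("Duplicate entry '" + ns.name + "'");
            }
            ns.id = nextId++;
            rowsByName.put(ns.name, ns);
        }

        /** Get-or-create: return the stored object if present, else a new transient one. */
        Namespace getOrCreate(String name) {
            Namespace existing = rowsByName.get(name);
            return existing != null ? existing : new Namespace(name);
        }
    }

    /** Two "files", each constructing its own namespace; true if the second save is rejected. */
    static boolean secondFileFailsWithNew() {
        FakeStore store = new FakeStore();
        store.saveOrUpdate(new Namespace("genbankBiosqlRich")); // first file
        try {
            store.saveOrUpdate(new Namespace("genbankBiosqlRich")); // second file
            return false;
        } catch (IllegalStateException duplicate) {
            return true;
        }
    }

    /** Two "files", each asking the store for the namespace; true if both saves go through. */
    static boolean secondFileSucceedsWithLookup() {
        FakeStore store = new FakeStore();
        try {
            store.saveOrUpdate(store.getOrCreate("genbankBiosqlRich")); // first file
            store.saveOrUpdate(store.getOrCreate("genbankBiosqlRich")); // second file
            return true;
        } catch (IllegalStateException duplicate) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("new each time is rejected: " + secondFileFailsWithNew());
        System.out.println("lookup each time succeeds: " + secondFileSucceedsWithLookup());
    }
}
```

In Biojavax the corresponding lookup would presumably be RichObjectFactory.getObject(SimpleNamespace.class, new Object[]{"genbankBiosqlRich"}) in place of new SimpleNamespace("genbankBiosqlRich"). Whether that is the cause of the error reported here has not been verified.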

What is the proper way of using Biojava to load up a database with records?

Thank you all in advance for the traffic on this list; it has been
quite helpful in bringing me up to speed.

Regards,
Doug Brown

Here is the relevant [hacked] subroutine:
  // Imports used by this excerpt; sessionFactory is a field of the
  // enclosing class.
  import java.io.BufferedReader;
  import java.io.File;
  import java.io.FileNotFoundException;
  import java.io.FileReader;
  import org.biojava.bio.BioException;
  import org.biojavax.RichObjectFactory;
  import org.biojavax.SimpleNamespace;
  import org.biojavax.bio.seq.RichSequence;
  import org.biojavax.bio.seq.RichSequenceIterator;
  import org.hibernate.HibernateException;
  import org.hibernate.Session;
  import org.hibernate.Transaction;

  /**
   * This works for genbank files containing multiple sequences.
   * Original concept from:
   * http://portal.open-bio.org/pipermail/biojava-l/2007-April/005824.html
   * It fails on inserting existing record(s) - does not replace...
   * This causes grief when loading multiple files...
   */
  public void loadNSave( Session session, File fileName)
    {
    boolean localSession = (session == null);
    Transaction tx = null;

    try
      {
      System.out.println( "*********** Loading "+fileName+"...");
      BufferedReader br = new BufferedReader( new FileReader(  fileName) );

      if ( session == null)  // create a local session
        {
        session = sessionFactory.openSession();
        RichObjectFactory.connectToBioSQL(session);
        }

      // load the objects. I expect this to use the established factory.
      RichSequenceIterator rsi = RichSequence.IOTools.readGenbankDNA(
          br, new SimpleNamespace( "genbankBiosqlRich") );

      while ( rsi.hasNext() )
        {
        tx = session.beginTransaction(); // Hibernate requires transactions.

        System.out.println( "*********** Loading next sequence...");
        // ??should automatically fetch existing objects from the database...
        RichSequence sequence = rsi.nextRichSequence();
        System.out.println( "loaded sequence "+sequence.getAccession()
            +", identifier: "+ sequence.getIdentifier());

        try
          {
          System.out.println( "*********** saving...");

          // synchronize in-memory representation w/ the database
          // HUGE amounts of time spent doing selects on keys - really
          // slows things down!!
          session.saveOrUpdate( "Sequence", sequence );
          tx.commit();    // save to database - does an automatic flush
          // batch operations overwhelm the cache - clear it out!
          session.flush();  // force in-memory to disk.
          session.clear();  // clean out cache.
          }
        catch (HibernateException ex)
          {
          tx.rollback();   // discard the sequence and all its annotations
          ex.printStackTrace();
          }
        }
      }
    catch (FileNotFoundException ex)
      {
      ex.printStackTrace();
      }
    catch ( BioException bex)
      {
      bex.printStackTrace();
      }
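    finally
      {
      // session is still null if opening the file failed before a local
      // session was created, so guard against that here.
      if ( localSession && session != null)
        {
        session.flush();  // force in-memory to disk.
        session.close();  // only for local sessions
        }
      }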
    }

And the following is a sample stack dump:

org.hibernate.exception.ConstraintViolationException: could not insert: [Namespace]
       at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:71)
       at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:43)
       at org.hibernate.id.insert.AbstractReturningDelegate.performInsert(AbstractReturningDelegate.java:40)
       at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2163)
       at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2643)
       at org.hibernate.action.EntityIdentityInsertAction.execute(EntityIdentityInsertAction.java:51)
       at org.hibernate.engine.ActionQueue.execute(ActionQueue.java:279)
       at org.hibernate.event.def.AbstractSaveEventListener.performSaveOrReplicate(AbstractSaveEventListener.java:298)
       at org.hibernate.event.def.AbstractSaveEventListener.performSave(AbstractSaveEventListener.java:181)
       at org.hibernate.event.def.AbstractSaveEventListener.saveWithGeneratedId(AbstractSaveEventListener.java:107)
       at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.saveWithGeneratedOrRequestedId(DefaultSaveOrUpdateEventListener.java:187)
       at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.entityIsTransient(DefaultSaveOrUpdateEventListener.java:172)
       at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.performSaveOrUpdate(DefaultSaveOrUpdateEventListener.java:94)
       at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.onSaveOrUpdate(DefaultSaveOrUpdateEventListener.java:70)
       at org.hibernate.impl.SessionImpl.fireSaveOrUpdate(SessionImpl.java:507)
       at org.hibernate.impl.SessionImpl.saveOrUpdate(SessionImpl.java:499)
       at org.hibernate.engine.CascadingAction$5.cascade(CascadingAction.java:218)
       at org.hibernate.engine.Cascade.cascadeToOne(Cascade.java:268)
       at org.hibernate.engine.Cascade.cascadeAssociation(Cascade.java:216)
       at org.hibernate.engine.Cascade.cascadeProperty(Cascade.java:169)
       at org.hibernate.engine.Cascade.cascade(Cascade.java:130)
       at org.hibernate.event.def.AbstractSaveEventListener.cascadeBeforeSave(AbstractSaveEventListener.java:431)
       at org.hibernate.event.def.AbstractSaveEventListener.performSaveOrReplicate(AbstractSaveEventListener.java:265)
       at org.hibernate.event.def.AbstractSaveEventListener.performSave(AbstractSaveEventListener.java:181)
       at org.hibernate.event.def.AbstractSaveEventListener.saveWithGeneratedId(AbstractSaveEventListener.java:107)
       at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.saveWithGeneratedOrRequestedId(DefaultSaveOrUpdateEventListener.java:187)
       at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.entityIsTransient(DefaultSaveOrUpdateEventListener.java:172)
       at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.performSaveOrUpdate(DefaultSaveOrUpdateEventListener.java:94)
       at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.onSaveOrUpdate(DefaultSaveOrUpdateEventListener.java:70)
       at org.hibernate.impl.SessionImpl.fireSaveOrUpdate(SessionImpl.java:507)
       at org.hibernate.impl.SessionImpl.saveOrUpdate(SessionImpl.java:499)
       at bioinformatics.biojava.BriefLoader.loadNSave(BriefLoader.java:108)
       at bioinformatics.biojava.BriefLoader.main(BriefLoader.java:72)
Caused by: java.sql.SQLException: Duplicate entry 'genbankBiosqlRich' for key 2
       at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2975)
       at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1600)
       at com.mysql.jdbc.ServerPreparedStatement.serverExecute(ServerPreparedStatement.java:1125)
       at com.mysql.jdbc.ServerPreparedStatement.executeInternal(ServerPreparedStatement.java:677)
       at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1357)
       at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1274)
       at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1259)
       at org.hibernate.id.IdentityGenerator$GetGeneratedKeysDelegate.executeAndExtract(IdentityGenerator.java:73)
       at org.hibernate.id.insert.AbstractReturningDelegate.performInsert(AbstractReturningDelegate.java:33)

-- 
Doug Brown - Bioinformatics
Fungal Genomics Laboratory
Center for Integrated Fungal Research
North Carolina State University
Campus Box 7251, Raleigh, NC 27695-7251
https://www.fungalgenomics.ncsu.edu/~debrown/
Tel: (919) 513-0394, Fax (919) 513-0024
e-mail: doug_brown at ncsu.edu
