[Biojava-dev] BioJava Nightly Build + BioSQL/MySQL problem...
Thomas Down
td2 at sanger.ac.uk
Tue Aug 3 13:49:01 EDT 2004
Hi Michael,
I think I've found the problem. It's actually documented in a comment
in the BioSQLSequenceDB.intern_ontology_term method, but it probably
ought to be somewhere more obvious...
The issue seems to be that MySQL ignores the case of strings when
enforcing uniqueness constraints, but GenBank files contain the key
ORGANISM in both upper- and lower-case variants, so when BioJava tries
to store both of these in the term table, the second insert is rejected
as a duplicate entry.
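
For anyone who wants to see the failure mode without a database, here
is a minimal sketch. It is not BioJava or BioSQL code: the class and
method names are made up, and a TreeSet ordered by
String.CASE_INSENSITIVE_ORDER merely stands in for a unique index that
ignores case. The key is built as name-ontologyId to mirror the
'ORGANISM-5' entry in the error message; that the index covers exactly
those two columns is an assumption.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names exist in BioJava or BioSQL.
class TermIndexDemo {

    // Stand-in for a unique index that ignores case when comparing keys.
    static Set<String> caseInsensitiveIndex() {
        return new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
    }

    // Stand-in for the same index on a case-sensitive (binary) column.
    static Set<String> binaryIndex() {
        return new HashSet<>();
    }

    // Returns true if the key was accepted, false if it counts as a duplicate.
    static boolean insertTerm(Set<String> index, String name, int ontologyId) {
        return index.add(name + "-" + ontologyId);
    }

    public static void main(String[] args) {
        Set<String> ci = caseInsensitiveIndex();
        System.out.println(insertTerm(ci, "organism", 5)); // accepted
        System.out.println(insertTerm(ci, "ORGANISM", 5)); // rejected: same key ignoring case

        Set<String> bin = binaryIndex();
        System.out.println(insertTerm(bin, "organism", 5)); // accepted
        System.out.println(insertTerm(bin, "ORGANISM", 5)); // accepted: keys differ byte-wise
    }
}
```

With the case-insensitive stand-in the second insert is refused, which
corresponds to the "Duplicate entry" error in the trace; with the
case-sensitive one both variants are stored.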
The solution seems to be to create the term.name field with the BINARY
attribute rather than as a plain VARCHAR, so that comparisons on it are
case-sensitive. If you make this change to your schema and recreate the
database, everything should be fine.
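
If recreating the database from scratch is inconvenient, the same
column change can probably be applied in place. A rough sketch via
JDBC follows. It assumes the stock schema declares term.name as
VARCHAR(255) NOT NULL, so check your own schema before running anything
like this. The class and method names are made up for illustration.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper; not part of BioJava or the BioSQL distribution.
class TermNameFix {

    // Assumption: term.name is currently VARCHAR(255) NOT NULL.
    // Adjust the length and nullability to match your actual schema.
    static final String ALTER_TERM_NAME =
        "ALTER TABLE term MODIFY name VARCHAR(255) BINARY NOT NULL";

    // Redefines term.name with the BINARY attribute on an existing database.
    static void apply(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.executeUpdate(ALTER_TERM_NAME);
        }
    }
}
```

Call TermNameFix.apply(conn) once, with a connection whose user is
allowed to alter tables, before loading any GenBank files.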
Presumably this issue has come up before (although I can't remember it
myself; I always used PostgreSQL in my BioSQL days). Has anyone tried
to feed this change back into the official MySQL schema?
Thomas.
On 3 Aug 2004, at 18:07, Michael Griffith wrote:
> Hi,
>
> I am trying to use the BioJava 1.4 nightly build (8.3.2004) to read a
> GenBank file and insert it into a BioSQL/MySQL db.
>
> My code basically does this:
>
>     // Connect to the BioSQL DB
>     SequenceDB db = new BioSQLSequenceDB(dbDriver, dbURL, dbUser, dbPass,
>                                          biodatabase, create);
>
>     // Read a GenBank flat file
>     SequenceIterator iter =
>         (SequenceIterator) SeqIOTools.fileToBiojava(format, alpha, br);
>     while (iter.hasNext()) {
>         Sequence seq = iter.nextSequence();
>
>         try {
>             db.addSequence(seq);
>         }
>         catch (Exception e) {
>             e.printStackTrace();
>         }
>         ...
>     }
>
> It progresses a little ways and I get the following error stack:
>
> [java] Caused by: java.sql.SQLException: Couldn't create term 'ORGANISM' for 'ORGANISM' in legacy ontology namespace
> [java]     at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB.intern_ontology_term(BioSQLSequenceDB.java:942)
> [java]     at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB.persistBioentryProperty(BioSQLSequenceDB.java:894)
> [java]     at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB._addSequence(BioSQLSequenceDB.java:485)
> [java]     ... 2 more
> [java] Caused by: org.biojava.bio.BioRuntimeException: Error commiting to BioSQL tables (rolled back successfully)
> [java]     at org.biojava.bio.seq.db.biosql.OntologySQL.persistTerm(OntologySQL.java:536)
> [java]     at org.biojava.bio.seq.db.biosql.OntologySQL.access$200(OntologySQL.java:61)
> [java]     at org.biojava.bio.seq.db.biosql.OntologySQL$OntologyMonitor.postChange(OntologySQL.java:503)
> [java]     at org.biojava.utils.ChangeSupport.firePostChangeEvent(ChangeSupport.java:338)
> [java]     at org.biojava.ontology.Ontology$Impl.addTerm(Ontology.java:349)
> [java]     at org.biojava.ontology.Ontology$Impl.createTerm(Ontology.java:358)
> [java]     at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB.intern_ontology_term(BioSQLSequenceDB.java:938)
> [java]     ... 4 more
> [java] Caused by: java.sql.SQLException: Failed to persist term: ORGANISM from ontology: ontology: __biojava_guano with error: 1062 : 23000
> [java]     at org.biojava.bio.seq.db.biosql.OntologySQL.persistTerm(OntologySQL.java:562)
> [java]     at org.biojava.bio.seq.db.biosql.OntologySQL.persistTerm(OntologySQL.java:524)
> [java]     ... 10 more
> [java] Caused by: java.sql.SQLException: Duplicate key or integrity constraint violation, message from server: "Duplicate entry 'ORGANISM-5' for key 2"
> [java]     at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:1977)
> [java]     at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1163)
> [java]     at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1272)
> [java]     at com.mysql.jdbc.Connection.execSQL(Connection.java:2236)
> [java]     at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1741)
> [java]     at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1588)
> [java]     at org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:233)
> [java]     at org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:233)
> [java]     at org.biojava.bio.seq.db.biosql.OntologySQL.persistTerm(OntologySQL.java:554)
> [java]     ... 11 more
> [java] Exception in thread "main" org.biojava.bio.BioError: Error looking up biosqlized ID for ORGANISM
> [java]     at org.biojava.bio.seq.db.biosql.OntologySQL.termID(OntologySQL.java:684)
> [java]     at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB.intern_ontology_term(BioSQLSequenceDB.java:927)
> [java]     at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB.persistBioentryProperty(BioSQLSequenceDB.java:894)
> [java]     at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB._addSequence(BioSQLSequenceDB.java:485)
> [java]     at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB.addSequence(BioSQLSequenceDB.java:365)
> [java]     at com.gts.genebank.GeneralReader.main(GeneralReader.java:64)
> [java] Caused by: java.lang.NullPointerException
> [java]     at org.biojava.bio.seq.db.biosql.OntologySQL.termID(OntologySQL.java:682)
> [java]     ... 5 more
> [java] Java Result: 1
>
>
> What am I doing wrong? Any help would be greatly appreciated!
>
> MG