[Biojava-dev] problems with biosql

Matthew Pocock matthew_pocock at yahoo.co.uk
Wed Sep 24 14:11:09 EDT 2003


OK.

I'm using PostgreSQL and loading complete-genome EMBL files. It's 
possible that the problem is due to case clashes. The method 
BioSQLSequenceDB.intern_ontolgy_term() does seem to be doing case tricks 
- will the world explode if I disable this? Anyway, I've modified the 
exception messages so that they report more detail. Hopefully that 
should make this easier to track down.

Matthew

Simon Foote wrote:

> Hi Matthew,
>
> Which database server are you using?
> I ran into a similar problem importing Genbank files into a MySQL 
> database, as Genbank can contain terms that are the same but differ 
> in case.  Off the top of my head, I traced this to one of two causes: 
> either the unique index on the term table in the database doesn't 
> take case into account, or the persistent storage of the terms in a 
> map doesn't.  I can't remember, though, whether the keys in a map 
> take case into account.
>
> There is a commented-out line in BioSQLSequenceDB.java around line 850:
> //System.err.println("Term: " + ts + "   " + ex.getMessage());
> That will give you the offending term if uncommented.
>
> The hack I added above that code block worked in all my Genbank cases, 
> but I never did test it with EMBL files.
>
> It may be time to think about removing this legacy ontology code and 
> using whatever is supposed to replace it.
> I think Thomas coded the original stuff.
>
> Simon
>
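
On the open question of whether map keys take case into account: a plain 
java.util.HashMap keyed by String does distinguish case, since 
String.equals() is case-sensitive. The sketch below is only an 
illustration of that behaviour. The class and method names are 
hypothetical and are not taken from BioSQLSequenceDB, and the sample 
terms are made up. It compares a HashMap with a TreeMap that uses 
String.CASE_INSENSITIVE_ORDER, which roughly mimics a unique index that 
ignores case.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class, not part of BioJava.
public class TermCaseDemo {

    // Number of distinct keys when terms go into a plain HashMap.
    // String keys are compared with equals(), so case matters.
    static int distinctInHashMap(String... terms) {
        Map<String, Integer> ids = new HashMap<>();
        for (String t : terms) {
            ids.putIfAbsent(t, ids.size() + 1);
        }
        return ids.size();
    }

    // Number of distinct keys when the map ignores case, which is
    // roughly how a case-insensitive unique index would see the terms.
    static int distinctIgnoringCase(String... terms) {
        Map<String, Integer> ids = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String t : terms) {
            ids.putIfAbsent(t, ids.size() + 1);
        }
        return ids.size();
    }

    public static void main(String[] args) {
        String[] terms = {"Note", "note", "NOTE"};
        System.out.println(distinctInHashMap(terms));    // prints 3
        System.out.println(distinctIgnoringCase(terms)); // prints 1
    }
}
```

If the in-memory map treats "Note" and "note" as different terms while 
the database index treats them as equal, the second insert would 
violate the unique constraint, which is consistent with the clash 
described above. Whether that is what actually happens here depends on 
the database and its collation settings.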



