[BioSQL-l] Consistency between bio* projects

Fri Jan 14 12:58:15 EST 2005

On Friday, January 14, 2005, at 01:10  AM, 
mark.schreiber at group.novartis.com wrote:
> Unfortunately, Bioperl stores identifiers as follows:
>
> Bioentry.bioentry_id is the unique internal reference number
> Bioentry.name is the GI number

The GI number goes to Bioentry.Identifier, which was designated for 
storing the identifier used within an external database.

Bioentry.name should hold the locus name, which for contigs and many 
other entries will be identical to the accession (but not the GI 
number!).

If you find it in Bioentry.name, then I suspect you weren't loading from 
GenBank- or EMBL-formatted input?
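
To make the intended mapping concrete, here is a minimal sketch in 
Python using an in-memory SQLite database. It is only an illustration: 
the table is a cut-down stand-in for the real bioentry table (which has 
more columns and constraints), and the record values are made up.

```python
import sqlite3

# Cut-down stand-in for the BioSQL bioentry table (assumption: only the
# columns discussed in this thread are modelled here).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE bioentry (
        bioentry_id INTEGER PRIMARY KEY,  -- unique internal reference number
        name        TEXT NOT NULL,        -- locus name
        accession   TEXT NOT NULL,        -- accession number
        identifier  TEXT                  -- ID in the external database (GI)
    )
    """
)

# Hypothetical contig record: the locus name equals the accession,
# while the GI number goes into identifier, not into name.
conn.execute(
    "INSERT INTO bioentry (name, accession, identifier) VALUES (?, ?, ?)",
    ("NT_000001", "NT_000001", "12345678"),
)

# Look the entry up by its GI number via the identifier column.
row = conn.execute(
    "SELECT bioentry_id, name, accession, identifier "
    "FROM bioentry WHERE identifier = ?",
    ("12345678",),
).fetchone()
print(row)
```

A lookup by GI number therefore goes through identifier, and a lookup 
by locus name or accession never needs to touch it.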

> From memory the basic idea of BioSQL was to define a schema that bio*
> projects could both read and write from in a language independent
> manner.
> For reasons best left to the designers (mostly I think cause MySQL
> couldn't handle stored procedures) the level of interaction is right
> down at the schema level.

Right. Also, not all database drivers in all languages support stored 
procedure calls equally well. In e.g. PostgreSQL and Oracle you can 
always get around this by writing a view and putting an INSTEAD OF 
INSERT (or UPDATE) trigger on it that will then call the procedure, but 
this is clearly not even close to an option in MySQL.
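
As a rough sketch of that workaround: the example below uses SQLite 
from Python purely as a stand-in, because it also supports INSTEAD OF 
triggers on views and runs without a server. SQLite has no stored 
procedures, so the trigger body does the work a procedure would do in 
PostgreSQL or Oracle. The view name, the trigger name, and the rule 
that a missing name defaults to the accession are all assumptions made 
for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Cut-down stand-in for the bioentry table (assumption).
    CREATE TABLE bioentry (
        bioentry_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        accession   TEXT NOT NULL,
        identifier  TEXT
    );

    -- Hypothetical view that clients write to instead of the table.
    CREATE VIEW bioentry_api AS
        SELECT name, accession, identifier FROM bioentry;

    -- The trigger intercepts inserts on the view. In PostgreSQL or
    -- Oracle its body would call the stored procedure; here it applies
    -- the (assumed) storage rule directly.
    CREATE TRIGGER bioentry_api_insert
    INSTEAD OF INSERT ON bioentry_api
    BEGIN
        INSERT INTO bioentry (name, accession, identifier)
        VALUES (COALESCE(NEW.name, NEW.accession),
                NEW.accession,
                NEW.identifier);
    END;
    """
)

# A client that can only issue plain INSERT statements still goes
# through the rule encoded in the trigger.
conn.execute(
    "INSERT INTO bioentry_api (name, accession, identifier) VALUES (?, ?, ?)",
    (None, "NT_000001", "12345678"),
)

stored = conn.execute(
    "SELECT name, accession, identifier FROM bioentry"
).fetchall()
print(stored)
```

The point of the pattern is that the storage rules live in one place in 
the database, so a driver only needs to support ordinary INSERT and 
UPDATE statements.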

It may be worth considering whether to open a dichotomy here between 
MySQL and the rest, in order to provide people who need it with a 
SQL-level API that both perl and java can use. People who are 
interested in this will, by definition, not be interested in MySQL 
anyway.

> Unfortunately this means that the way data is stored
> needs to be very consistent between projects if any APIs that use
> BioSQL can be portable. My biojava API cannot be applied to a DB
> previously set up with bioperl, which was the original idea behind
> BioSQL in the first place.
>
> Help!!

I think you're raising a great point. Indeed, such a contract hasn't 
really been written. We're probably one of the few who use both perl 
and java to access a biosql database (and I'm not using biojava as the 
object model on the java side, which is why I'm not running into this 
problem). (Note as an aside that you could also write adaptors that 
transform between the SymGene and the Biojava model when storing 
objects to or retrieving them from the database.)

It'd be great if you were willing to take the lead on getting this all 
spelled out and laid down in a document.

	-hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------