[Bioperl-l] Re: [BioSQL-l] Bioperl and BioSQL status

Hilmar Lapp hlapp at gnf.org
Wed Apr 2 01:13:22 EST 2003


On Tuesday, April 1, 2003, at 12:54  PM, William Hsiao wrote:

> Hi,
>   My lab is interested in adapting BioSQL as the basis
> for a functional genomic database that will support
> microarray analysis we wish to perform in the near
> future.  The types of information we wish to include
> in the database include pathway, signal peptide,
> transcription factors, binding sites, promoters,
> protein domains, signal peptides, subcellular
> localization information, etc.  The database will need
> to accommodate both eukaryotic and prokaryotic
> genomes/genes, and will need to be flexible to
> accommodate future analysis results that we may
> perform.  I have taken a look at the BioSQL schema,
> and from the available documentation, the schema,
> theoretically, can accommodate the different types of
> information.

I can imagine that it can, with the exception of pathways. I'm sure you 
could squeeze them in with some compromises, but given what the schema 
was originally conceived for, pathway annotation is a pretty far 
stretch.


>   However, it might be more reasonable to
> extend the current schema to suit the specific types
> of information to simplify insertion and update.
> First, I am wondering if anyone has any suggestions on
> the best (or good) approach to house the data types I
> mentioned above (i.e. the current BioSQL schema design
> is not strongly typed; is it better to use a strongly
> typed database (e.g. GUS) for storing such
> information)?

It's a matter of taste, and also of how you will want to work with your 
data. A strongly typed relational model makes it more difficult to 
query across types, or to join types, than a schema that is typed 
through an ontology (and hence not through the relational model), which 
intrinsically gives you a unified view. On the other hand, as Chris 
pointed out, typing through an ontology means the RDBMS can't enforce 
the types.
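
To make the trade-off concrete, here is a minimal sketch of ontology-based typing using Python's sqlite3 module. The tables `term` and `feature` and their columns are simplified, hypothetical stand-ins and not the actual BioSQL schema. Every feature lives in one table and gets its type from a row in `term`, so a single query spans all types. The database only checks that the type points at some existing term, not that the term is a sensible feature type.

```python
import sqlite3


def build_ontology_typed_db():
    """Create an in-memory database with a hypothetical ontology-typed layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE term (
            term_id INTEGER PRIMARY KEY,
            name    TEXT UNIQUE NOT NULL
        );
        CREATE TABLE feature (
            feature_id   INTEGER PRIMARY KEY,
            type_term_id INTEGER NOT NULL REFERENCES term(term_id),
            label        TEXT NOT NULL
        );
    """)
    conn.executemany(
        "INSERT INTO term (name) VALUES (?)",
        [("promoter",), ("binding_site",), ("signal_peptide",)],
    )
    # Illustrative rows only; the labels are made up.
    for type_name, label in [("promoter", "P1"), ("binding_site", "BS1"),
                             ("promoter", "P2")]:
        conn.execute(
            "INSERT INTO feature (type_term_id, label) "
            "SELECT term_id, ? FROM term WHERE name = ?",
            (label, type_name),
        )
    return conn


def count_by_type(conn):
    """One query covers every feature type because the type is just data."""
    cur = conn.execute(
        "SELECT t.name, COUNT(*) FROM feature f "
        "JOIN term t ON t.term_id = f.type_term_id "
        "GROUP BY t.name ORDER BY t.name"
    )
    return dict(cur.fetchall())


def labels_of_types(conn, type_names):
    """Fetch features of several types at once, without a UNION over tables."""
    marks = ", ".join("?" for _ in type_names)
    cur = conn.execute(
        "SELECT f.label FROM feature f "
        "JOIN term t ON t.term_id = f.type_term_id "
        "WHERE t.name IN ({}) ORDER BY f.label".format(marks),
        tuple(type_names),
    )
    return [row[0] for row in cur.fetchall()]
```

In a strongly typed layout with one table per type (say `promoter` and `binding_site`), the same cross-type questions would need a UNION or one query per table, but the RDBMS could enforce type-specific columns and constraints.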


>   More specifically, I am wondering if we
> decided to add additional tables to the schema (but
> keep the original tables intact), will that break the
> bioperl modules (bioperl-db, etc) that are associated
> with BioSQL?

No, it won't. The question, though, is how you want to populate and query 
those tables. If through SQL only, it won't bother you that bioperl-db 
doesn't recognize your tables. If you have classes (Perl modules) that 
correspond to those additional tables in a relatively straightforward 
way, it is not too difficult to extend the BioSQL adaptors in 
bioperl-db with your own object persistence adaptors. Basically, if 
your class is SFU::Promoter with a corresponding promoter table, you'd 
write a module Bio::DB::BioSQL::PromoterAdaptor that inherits from 
Bio::DB::BioSQL::BasePersistenceAdaptor and implements the methods 
declared abstract in BasePersistenceAdaptor.
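
The pattern can be sketched in a language-neutral way as follows. This is a Python analogy, not bioperl-db code: the abstract method names (`table`, `columns`, `to_row`, `from_row`), the `Promoter` fields, and the `promoter` table layout are all assumptions made for illustration, and the real abstract methods are whatever BasePersistenceAdaptor declares. The base class owns the generic store and fetch logic, and the subclass only fills in what is specific to its table.

```python
import sqlite3
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Promoter:
    """Hypothetical stand-in for a class such as SFU::Promoter."""
    name: str
    strand: int


class BasePersistenceAdaptor(ABC):
    """Python analogy of a base adaptor: generic logic, abstract hooks."""

    def __init__(self, conn):
        self.conn = conn

    @abstractmethod
    def table(self):
        """Name of the table this adaptor persists to."""

    @abstractmethod
    def columns(self):
        """Ordered list of column names written and read by the adaptor."""

    @abstractmethod
    def to_row(self, obj):
        """Turn an object into a tuple matching columns()."""

    @abstractmethod
    def from_row(self, row):
        """Turn a fetched row back into an object."""

    def store(self, obj):
        cols = self.columns()
        sql = "INSERT INTO {} ({}) VALUES ({})".format(
            self.table(), ", ".join(cols), ", ".join("?" for _ in cols))
        return self.conn.execute(sql, self.to_row(obj)).lastrowid

    def find_by_primary_key(self, pk):
        # rowid aliases the INTEGER PRIMARY KEY column in SQLite.
        sql = "SELECT {} FROM {} WHERE rowid = ?".format(
            ", ".join(self.columns()), self.table())
        row = self.conn.execute(sql, (pk,)).fetchone()
        return None if row is None else self.from_row(row)


class PromoterAdaptor(BasePersistenceAdaptor):
    """Only the table-specific parts are implemented here."""

    def table(self):
        return "promoter"

    def columns(self):
        return ["name", "strand"]

    def to_row(self, obj):
        return (obj.name, obj.strand)

    def from_row(self, row):
        return Promoter(name=row[0], strand=row[1])


def make_promoter_db():
    """Create the hypothetical add-on table in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE promoter ("
                 "promoter_id INTEGER PRIMARY KEY, name TEXT, strand INTEGER)")
    return conn
```

A caller would create the adaptor with a connection, call `store()` with a `Promoter`, and later retrieve it with `find_by_primary_key()`.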

>   Second, if we add more columns (fields)
> to the existing tables, will that break bioperl-db?

No. Only if you delete columns.

>
> Is BioPerl adaptor for BioSQL designed to accommodate
> the possibility that the actual schema might be
> expanded?

Actually yes. It was even designed for the possibility that the actual 
schema is different, but that remains to be proven. (There are two 
layers of adaptors: the first is the object persistence adaptor, which 
handles the object-layer and persistence-related business logic. The 
second works as a driver for a particular schema and translates objects 
into table names and attributes into column names.)
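
As a rough illustration of that two-layer split, here is a small Python sketch. All class names, method names, and the two example mappings are hypothetical and do not reflect the bioperl-db API. The persistence adaptor works only with objects and attribute names, while the driver alone knows which table and column names a given schema uses, so swapping the driver retargets the same adaptor to a different schema.

```python
from dataclasses import dataclass


@dataclass
class Promoter:
    """Hypothetical domain object."""
    name: str
    strand: int


class SchemaDriver:
    """Second layer: translates classes to tables and attributes to columns."""

    def __init__(self, table_map, column_map):
        self.table_map = table_map      # class name -> table name
        self.column_map = column_map    # (class name, attribute) -> column name

    def table_for(self, cls_name):
        return self.table_map[cls_name]

    def column_for(self, cls_name, attr):
        return self.column_map[(cls_name, attr)]


class PersistenceAdaptor:
    """First layer: object-level logic that never hard-codes schema names."""

    def __init__(self, driver):
        self.driver = driver

    def insert_statement(self, obj, attrs):
        cls_name = type(obj).__name__
        table = self.driver.table_for(cls_name)
        cols = [self.driver.column_for(cls_name, a) for a in attrs]
        sql = "INSERT INTO {} ({}) VALUES ({})".format(
            table, ", ".join(cols), ", ".join("?" for _ in cols))
        params = tuple(getattr(obj, a) for a in attrs)
        return sql, params


# Two made-up schemas that store the same object under different names.
default_driver = SchemaDriver(
    {"Promoter": "promoter"},
    {("Promoter", "name"): "name", ("Promoter", "strand"): "strand"},
)
local_driver = SchemaDriver(
    {"Promoter": "sfu_promoter"},
    {("Promoter", "name"): "promoter_name",
     ("Promoter", "strand"): "strand_code"},
)
```

Under this split, added tables or columns in the schema only require touching the driver's mappings, which is one way to read why schema extensions need not break the object layer.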

Apart from this, Chris gave a good overview on the breadth of options, 
so I'll just refer to his posting.

	-hilmar

>
> Thank you
>
> William Hsiao
> Brinkman Laboratory, SFU
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


