[BioSQL-l] Bioperl and BioSQL status
Hilmar Lapp
hlapp at gnf.org
Wed Apr 2 01:13:22 EST 2003
On Tuesday, April 1, 2003, at 12:54 PM, William Hsiao wrote:
> Hi,
> My lab is interested in adapting BioSQL as the basis
> for a functional genomic database that will support
> microarray analysis we wish to perform in the near
> future. The types of information we wish to include
> in the database include pathway, signal peptide,
> transcription factors, binding sites, promoters,
> protein domains, signal peptides, subcellular
> localization information, etc. The database will need
> to accommodate both eukaryotic and prokaryotic
> genomes/genes, and will need to be flexible to
> accommodate future analysis results that we may
> perform. I have taken a look at the BioSQL schema,
> and from the available documentation, the schema,
> theoretically, can accommodate the different types of
> information.
I can imagine that it does, with the exception of pathways. I'm sure
you could squeeze them in after some compromises, but given what the
schema was originally conceived for, pathway annotation is a pretty far
stretch.
> However, it might be more reasonable to
> extend the current schema to suit the specific types
> of information to simplify insertion and update.
> First, I am wondering if anyone has any suggestions on
> the best (or good) approach to house the data types I
> mentioned above (i.e. the current BioSQL schema design
> is not strongly typed; is it better to use a strongly
> typed database (e.g. GUS) for storing such
> information)?
It's a matter of taste, and also of how you want to work with your
data. A strongly typed relational model makes it more difficult to
query across types, or to join types, than a schema typed through an
ontology (hence, not through the relational model), which intrinsically
can provide you with a unified view. As Chris pointed out, the flip
side of typing through an ontology is that the RDBMS can't enforce it.
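To make the trade-off concrete, here is a minimal sketch in Python with
SQLite. It is not the BioSQL or GUS schema; every table and column name
below is made up for illustration. It stores the same two features once
in per-type tables and once in a single generic table whose type is a
reference to a term row, and then asks for all features of one gene.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Strongly typed: one table per feature type (hypothetical names).
CREATE TABLE promoter     (gene TEXT NOT NULL, start_pos INTEGER NOT NULL);
CREATE TABLE binding_site (gene TEXT NOT NULL, start_pos INTEGER NOT NULL);

-- Ontology typed: one generic table; the type is just a term reference.
CREATE TABLE term    (term_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE feature (gene TEXT NOT NULL, start_pos INTEGER NOT NULL,
                      type_term_id INTEGER NOT NULL REFERENCES term(term_id));

INSERT INTO promoter     VALUES ('geneA', 100);
INSERT INTO binding_site VALUES ('geneA', 140);

INSERT INTO term    VALUES (1, 'promoter'), (2, 'binding_site');
INSERT INTO feature VALUES ('geneA', 100, 1), ('geneA', 140, 2);
""")

# Strongly typed: a cross-type query needs one SELECT per table, and the
# UNION grows with every new feature type.
typed_rows = conn.execute("""
    SELECT 'promoter' AS feature_type, gene, start_pos
      FROM promoter WHERE gene = ?
    UNION ALL
    SELECT 'binding_site', gene, start_pos
      FROM binding_site WHERE gene = ?
    ORDER BY start_pos
""", ("geneA", "geneA")).fetchall()

# Ontology typed: one query covers all types, current and future.
generic_rows = conn.execute("""
    SELECT t.name, f.gene, f.start_pos
      FROM feature f JOIN term t ON t.term_id = f.type_term_id
     WHERE f.gene = ?
     ORDER BY f.start_pos
""", ("geneA",)).fetchall()
```

Both queries return the same rows here. What the generic table gives up
is per-type structure: the database has no way to require, say, a
column that only makes sense for promoters.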
> More specifically, I am wondering if we
> decided to add additional tables to the schema (but
> keep the original tables intact), will that break the
> bioperl modules (bioperl-db, etc) that are associated
> with BioSQL?
No, it won't. The question, though, is how you want to populate and query
those tables. If through SQL only, it won't bother you that bioperl-db
doesn't recognize your tables. If you have classes (Perl modules) that
correspond to those additional tables in a relatively straightforward
way, it is not too difficult to extend the BioSQL adaptors in
bioperl-db with your own object persistence adaptors. Basically, if
your class is SFU::Promoter with a corresponding promoter table, you'd
write a module Bio::DB::BioSQL::PromoterAdaptor that inherits from
Bio::DB::BioSQL::BasePersistenceAdaptor and then implements the methods
declared abstract in BasePersistenceAdaptor.
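The shape of that pattern, as a rough Python analogue rather than
actual bioperl-db code: the base class owns the generic store and fetch
logic, and the subclass only supplies what is specific to its table.
The method names (table_name, persistent_slots, slot_values,
instantiate_from_row) and the promoter table layout are assumptions
made for this sketch; they are not the method list that
Bio::DB::BioSQL::BasePersistenceAdaptor actually declares.

```python
import sqlite3
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Promoter:
    """Hypothetical domain class, standing in for SFU::Promoter."""
    gene: str
    start_pos: int


class BasePersistenceAdaptor(ABC):
    """Stand-in base class: generic logic here, table specifics in subclasses."""

    def __init__(self, conn):
        self.conn = conn

    @abstractmethod
    def table_name(self): ...

    @abstractmethod
    def persistent_slots(self): ...

    @abstractmethod
    def slot_values(self, obj): ...

    @abstractmethod
    def instantiate_from_row(self, row): ...

    def store(self, obj):
        # Build the INSERT from whatever the subclass declares.
        slots = self.persistent_slots()
        sql = "INSERT INTO {} ({}) VALUES ({})".format(
            self.table_name(), ", ".join(slots), ", ".join("?" for _ in slots))
        return self.conn.execute(sql, self.slot_values(obj)).lastrowid

    def find_by_primary_key(self, pk):
        sql = "SELECT {} FROM {} WHERE rowid = ?".format(
            ", ".join(self.persistent_slots()), self.table_name())
        row = self.conn.execute(sql, (pk,)).fetchone()
        return None if row is None else self.instantiate_from_row(row)


class PromoterAdaptor(BasePersistenceAdaptor):
    """Only the promoter-specific parts live here."""

    def table_name(self):
        return "promoter"

    def persistent_slots(self):
        return ["gene", "start_pos"]

    def slot_values(self, obj):
        return [obj.gene, obj.start_pos]

    def instantiate_from_row(self, row):
        return Promoter(gene=row[0], start_pos=row[1])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE promoter "
             "(promoter_id INTEGER PRIMARY KEY, gene TEXT, start_pos INTEGER)")
adaptor = PromoterAdaptor(conn)
pk = adaptor.store(Promoter("geneA", 100))
loaded = adaptor.find_by_primary_key(pk)
```

For the real set of abstract methods, the source of
BasePersistenceAdaptor itself is the place to look.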
> Second, if we add more columns (fields)
> to the existing tables, will that break bioperl-db?
No. Only if you delete columns.
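The general reason this holds for any code that names its columns
explicitly can be shown with a small Python/SQLite sketch. The toy_entry
table and the lab_note column are invented for the example, and the
sketch says nothing about how bioperl-db builds its SQL internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE toy_entry "
             "(entry_id INTEGER PRIMARY KEY, accession TEXT NOT NULL)")


def store_and_fetch(accession):
    """Client code that only knows about the columns it names."""
    conn.execute("INSERT INTO toy_entry (accession) VALUES (?)", (accession,))
    row = conn.execute("SELECT accession FROM toy_entry WHERE accession = ?",
                       (accession,)).fetchone()
    return row[0]


before = store_and_fetch("X00001")

# Adding a nullable column: the client code above is unaffected.
conn.execute("ALTER TABLE toy_entry ADD COLUMN lab_note TEXT")
after = store_and_fetch("X00002")

# Simulate a deleted column with a table that lacks it: the same kind of
# query now fails because a column it names no longer exists.
conn.execute("CREATE TABLE toy_entry_trimmed (entry_id INTEGER PRIMARY KEY)")
try:
    conn.execute("SELECT accession FROM toy_entry_trimmed")
    query_survived = True
except sqlite3.OperationalError:
    query_survived = False
```

One caveat that applies to SQL in general: a new column should be
nullable or carry a default, since existing INSERT statements will not
supply a value for it.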
>
> Is BioPerl adaptor for BioSQL designed to accommodate
> the possibility that the actual schema might be
> expanded?
Actually, yes. It was even designed for the possibility that the actual
schema is different, but that remains to be proven. (There are two
layers of adaptors: the first is the object persistence adaptor, which
handles the object-layer and persistence-related business logic. The
second works as a driver for a particular schema and translates objects
into table names and attributes into column names.)
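A sketch of how two such layers can fit together, in Python and with
invented names throughout (SchemaDriver, PersistenceAdaptor, and both
table layouts are assumptions for this example, not bioperl-db classes).
The adaptor never mentions a table or column literally; it asks the
driver, so the same adaptor code runs against two differently named
schemas.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Promoter:
    """Hypothetical domain class."""
    gene: str
    start_pos: int


class SchemaDriver:
    """Second layer: maps classes to tables and attributes to columns."""

    def __init__(self, tables, columns):
        self._tables = tables
        self._columns = columns

    def table_for(self, cls):
        return self._tables[cls.__name__]

    def column_for(self, cls, attr):
        return self._columns[cls.__name__][attr]


class PersistenceAdaptor:
    """First layer: object-level logic with no literal table/column names."""

    def __init__(self, conn, driver, cls, attrs):
        self.conn, self.driver, self.cls, self.attrs = conn, driver, cls, attrs

    def _columns(self):
        return [self.driver.column_for(self.cls, a) for a in self.attrs]

    def store(self, obj):
        cols = self._columns()
        sql = "INSERT INTO {} ({}) VALUES ({})".format(
            self.driver.table_for(self.cls),
            ", ".join(cols), ", ".join("?" for _ in cols))
        self.conn.execute(sql, [getattr(obj, a) for a in self.attrs])

    def fetch_all(self):
        cols = self._columns()
        sql = "SELECT {} FROM {} ORDER BY {}".format(
            ", ".join(cols), self.driver.table_for(self.cls), cols[0])
        return [self.cls(**dict(zip(self.attrs, row)))
                for row in self.conn.execute(sql)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE promoter (gene TEXT, start_pos INTEGER)")
conn.execute("CREATE TABLE prom_feature (gene_name TEXT, seq_start INTEGER)")

# Two drivers describing two different layouts for the same class.
driver_a = SchemaDriver(
    {"Promoter": "promoter"},
    {"Promoter": {"gene": "gene", "start_pos": "start_pos"}})
driver_b = SchemaDriver(
    {"Promoter": "prom_feature"},
    {"Promoter": {"gene": "gene_name", "start_pos": "seq_start"}})

results = []
for driver in (driver_a, driver_b):
    adaptor = PersistenceAdaptor(conn, driver, Promoter, ["gene", "start_pos"])
    adaptor.store(Promoter("geneA", 100))
    results.append(adaptor.fetch_all())
```

Swapping the driver is the only change between the two runs, which is
the property the two-layer split is meant to provide.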
Apart from this, Chris gave a good overview of the breadth of options,
so I'll just refer to his posting.
-hilmar
>
> Thank you
>
> William Hsiao
> Brinkman Laboratory, SFU
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------