[BioSQL-l] The bioentry and biosequence tables

Wed Dec 17 14:49:28 UTC 2008

On Dec 17, 2008, at 9:32 AM, Peter wrote:

> Was there a reason for not just putting optional sequence, length and
> alphabet fields into the bioentry table directly (instead of having a
> separate biosequence table)?

The bioentry table is for any biological database entry with a stable  
and unique identifier.

In practical terms, most of these will be sequence database entries,
but they don't have to be. Among the not-so-far-fetched examples are
gene records/models (such as from LocusLink or Entrez Gene) and
sequence clusters (e.g. of ESTs or proteins).

A more exotic example, at present, would be museum specimen records.

> Does doing it as a separate table speed up accessing the core
> (non-sequence) bioentry information?

Possibly. To what extent will depend on the RDBMS, obviously. But at  
least several years ago, when BioSQL was first designed, some RDBMSs  
would indeed be faster in full table scans if the table didn't contain  
an LOB.

Conceptually, we have looked at it much like object orientation. The
bioentry table is the base table for all biodatabase entries, and the
biosequence table is joined in for derived objects that also have a
sequence.
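
To make the base/derived idea concrete, here is a minimal sketch using
Python's built-in sqlite3 module. It is not the actual BioSQL DDL: the
column lists are cut down for illustration, and the accessions and names
are made up. The point is only that biosequence shares its key with
bioentry, so an entry without a sequence simply has no biosequence row.

```python
import sqlite3

# Simplified, hypothetical schema: the real BioSQL tables have more
# columns and constraints than shown here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    accession   TEXT NOT NULL UNIQUE,
    name        TEXT NOT NULL
);
-- At most one sequence row per entry: the primary key is also the
-- foreign key back to the base table.
CREATE TABLE biosequence (
    bioentry_id INTEGER PRIMARY KEY REFERENCES bioentry(bioentry_id),
    length      INTEGER,
    alphabet    TEXT,
    seq         TEXT
);
""")

# Made-up example rows: one entry with a sequence, one without.
conn.execute("INSERT INTO bioentry VALUES (1, 'X00001', 'some EST')")
conn.execute("INSERT INTO bioentry VALUES (2, 'GENE42', 'a gene record')")
conn.execute("INSERT INTO biosequence VALUES (1, 8, 'dna', 'ACGTACGT')")

# Core entry information: the (potentially large) seq column is never read.
core = conn.execute(
    "SELECT accession, name FROM bioentry ORDER BY bioentry_id"
).fetchall()

# "Derived" objects: only entries that also have a sequence.
with_seq = conn.execute(
    "SELECT b.accession, s.length, s.alphabet "
    "FROM bioentry b JOIN biosequence s ON s.bioentry_id = b.bioentry_id"
).fetchall()

# All entries, with sequence length where one exists (NULL otherwise).
all_entries = conn.execute(
    "SELECT b.accession, s.length "
    "FROM bioentry b LEFT JOIN biosequence s ON s.bioentry_id = b.bioentry_id "
    "ORDER BY b.bioentry_id"
).fetchall()

print(core)
print(with_seq)
print(all_entries)
```

The inner join plays the role of "give me the subclass instances", while
the left join treats the sequence as an optional attribute of any entry.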

Does that make sense?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================