[BioSQL-l] DBXRef

Hilmar Lapp hlapp@gnf.org
Wed, 18 Sep 2002 16:49:38 -0700


You can still link to it. The real difference is you are encouraged 
to spend some more brainpower on how to make your dbxref recoverable 
at a later time, recoverable meaning pulling up the physical record 
it points to, be that remote or local.

Consider the following views (assume there's a table 
bioentry_relationship with FKs to bioentry [source], bioentry 
[target], and ontology_term):

CREATE VIEW dbxref
AS
SELECT
     e.bioentry_id  dbxref_id,
     e.accession    accession,
     db.name        dbname
FROM bioentry e, biodatabase db
WHERE e.biodatabase_id = db.biodatabase_id;

CREATE VIEW bioentry_dblink
AS
SELECT
     er.src_bioentry_id  bioentry_id,
     er.tgt_bioentry_id  dbxref_id
FROM bioentry es, bioentry et, bioentry_relationship er,
      ontology_term ot
WHERE es.bioentry_id = er.src_bioentry_id
AND   et.bioentry_id = er.tgt_bioentry_id
AND   ot.ontology_term_id = er.ontology_term_id
AND   ot.term_name = 'dbxref';

So I can (or could, if MySQL had views) easily emulate the existing 
dbxref tables under your feet.
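
To make the idea concrete, here is a rough sketch that builds the two views in an in-memory SQLite database and queries them the way one would query the present dbxref tables. The column names come from the views above; the sample rows, the accessions, and the helper dbxrefs_for are hypothetical and only there for illustration.

```python
import sqlite3

# Minimal schema: only the columns that the two views reference.
SCHEMA = """
CREATE TABLE biodatabase (
    biodatabase_id INTEGER PRIMARY KEY,
    name           TEXT NOT NULL UNIQUE
);
CREATE TABLE bioentry (
    bioentry_id    INTEGER PRIMARY KEY,
    biodatabase_id INTEGER NOT NULL REFERENCES biodatabase (biodatabase_id),
    accession      TEXT NOT NULL,
    UNIQUE (biodatabase_id, accession)
);
CREATE TABLE ontology_term (
    ontology_term_id INTEGER PRIMARY KEY,
    term_name        TEXT NOT NULL UNIQUE
);
CREATE TABLE bioentry_relationship (
    src_bioentry_id  INTEGER NOT NULL REFERENCES bioentry (bioentry_id),
    tgt_bioentry_id  INTEGER NOT NULL REFERENCES bioentry (bioentry_id),
    ontology_term_id INTEGER NOT NULL REFERENCES ontology_term (ontology_term_id)
);
CREATE VIEW dbxref AS
SELECT e.bioentry_id AS dbxref_id, e.accession AS accession, db.name AS dbname
FROM bioentry e, biodatabase db
WHERE e.biodatabase_id = db.biodatabase_id;
CREATE VIEW bioentry_dblink AS
SELECT er.src_bioentry_id AS bioentry_id, er.tgt_bioentry_id AS dbxref_id
FROM bioentry es, bioentry et, bioentry_relationship er, ontology_term ot
WHERE es.bioentry_id = er.src_bioentry_id
AND   et.bioentry_id = er.tgt_bioentry_id
AND   ot.ontology_term_id = er.ontology_term_id
AND   ot.term_name = 'dbxref';
"""

# Hypothetical sample rows: one EMBL entry that cross-references one
# Swiss-Prot entry through a relationship typed 'dbxref'.
SAMPLE = """
INSERT INTO biodatabase VALUES (1, 'embl'), (2, 'swissprot');
INSERT INTO bioentry VALUES (1, 1, 'X00001'), (2, 2, 'P00001');
INSERT INTO ontology_term VALUES (1, 'dbxref');
INSERT INTO bioentry_relationship VALUES (1, 2, 1);
"""


def dbxrefs_for(conn, bioentry_id):
    """Return (dbname, accession) pairs cross-referenced by a bioentry."""
    return conn.execute(
        """
        SELECT x.dbname, x.accession
        FROM bioentry_dblink l
        JOIN dbxref x ON x.dbxref_id = l.dbxref_id
        WHERE l.bioentry_id = ?
        ORDER BY x.dbname, x.accession
        """,
        (bioentry_id,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA + SAMPLE)
links = dbxrefs_for(conn, 1)
print(links)  # [('swissprot', 'P00001')]
```

Client code written against dbxref and bioentry_dblink would not need to know that both are now backed by bioentry and bioentry_relationship.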

You'd have skeleton bioentries (having accession, possibly a 
version, and a FK to biodatabase) for entries that are purely refs 
to remote seqs. Once you load the 'real' entry, you'd update the 
skeleton. The problem with this is matching up the database name 
(namespace) between skeleton and 'real' entry in order to find that 
the skeleton already exists and you just need to update it. 
However, that problem is not newly introduced: if you want to match 
up your dbxref against a bioentry, you'd have to do this as well. 
That's why I'm saying with the bioentry-bioentry association you are 
encouraged to think about some consistent namespace naming 
beforehand instead of once you discover that you have trouble 
matching dbxrefs to bioentries.
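
A minimal sketch of the skeleton-then-update flow, assuming SQLite and a consistent namespace naming enforced through an alias table. The alias map, the description column (standing in for "anything beyond the accession"), and the function names are all assumptions made for this illustration, not part of the schema under discussion.

```python
import sqlite3

SCHEMA = """
CREATE TABLE biodatabase (
    biodatabase_id INTEGER PRIMARY KEY,
    name           TEXT NOT NULL UNIQUE
);
CREATE TABLE bioentry (
    bioentry_id    INTEGER PRIMARY KEY,
    biodatabase_id INTEGER NOT NULL REFERENCES biodatabase (biodatabase_id),
    accession      TEXT NOT NULL,
    description    TEXT,  -- hypothetical: NULL while the row is a skeleton
    UNIQUE (biodatabase_id, accession)
);
"""

# Hypothetical alias table: every known spelling maps to one canonical name.
NAMESPACE_ALIASES = {
    "sp": "swissprot",
    "swiss-prot": "swissprot",
    "swissprot": "swissprot",
    "embl": "embl",
}


def canonical_namespace(name):
    """Normalize a database name so skeleton and real entry agree."""
    key = name.strip().lower()
    return NAMESPACE_ALIASES.get(key, key)


def biodatabase_id_for(conn, dbname):
    """Look up the namespace row by canonical name, creating it if needed."""
    name = canonical_namespace(dbname)
    row = conn.execute(
        "SELECT biodatabase_id FROM biodatabase WHERE name = ?", (name,)
    ).fetchone()
    if row:
        return row[0]
    return conn.execute(
        "INSERT INTO biodatabase (name) VALUES (?)", (name,)
    ).lastrowid


def add_skeleton(conn, dbname, accession):
    """Return the bioentry id for (dbname, accession); insert a skeleton if absent."""
    db_id = biodatabase_id_for(conn, dbname)
    row = conn.execute(
        "SELECT bioentry_id FROM bioentry"
        " WHERE biodatabase_id = ? AND accession = ?",
        (db_id, accession),
    ).fetchone()
    if row:
        return row[0]
    return conn.execute(
        "INSERT INTO bioentry (biodatabase_id, accession) VALUES (?, ?)",
        (db_id, accession),
    ).lastrowid


def load_real_entry(conn, dbname, accession, description):
    """Fill in an existing skeleton, or create the entry if there is none."""
    entry_id = add_skeleton(conn, dbname, accession)
    conn.execute(
        "UPDATE bioentry SET description = ? WHERE bioentry_id = ?",
        (description, entry_id),
    )
    return entry_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# The cross-reference is seen first, under one spelling of the namespace...
skeleton_id = add_skeleton(conn, "Swiss-Prot", "P00001")
# ...and the real entry arrives later under another spelling.
real_id = load_real_entry(conn, "SP", "P00001", "example description")
entry_count = conn.execute("SELECT COUNT(*) FROM bioentry").fetchone()[0]
```

Because both spellings normalize to the same namespace, the second call updates the skeleton instead of creating a duplicate row. Without the normalization step the two rows would never be matched, which is the same difficulty that matching a dbxref against a bioentry already has.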

With regard to the question you're raising, the problem is rather 
how you know that the sequence actually sits on a remote server and 
not in your biosql db. I guess you'd try to pull out anything more 
than the accession, and if that fails, it sits remote. The present 
dbxref doesn't help you with that question either.
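
As a sketch of that guess: treat an entry as remote if nothing beyond its accession is stored locally. The biosequence table and its seq column are assumptions for this example (any table holding the 'real' data would do), as are the sample rows and the name is_remote.

```python
import sqlite3

SCHEMA = """
CREATE TABLE biodatabase (
    biodatabase_id INTEGER PRIMARY KEY,
    name           TEXT NOT NULL UNIQUE
);
CREATE TABLE bioentry (
    bioentry_id    INTEGER PRIMARY KEY,
    biodatabase_id INTEGER NOT NULL REFERENCES biodatabase (biodatabase_id),
    accession      TEXT NOT NULL,
    UNIQUE (biodatabase_id, accession)
);
-- Assumed table holding the locally stored sequence data.
CREATE TABLE biosequence (
    bioentry_id INTEGER PRIMARY KEY REFERENCES bioentry (bioentry_id),
    seq         TEXT NOT NULL
);
"""

# Hypothetical rows: entry 1 is fully loaded, entry 2 is only a skeleton.
SAMPLE = """
INSERT INTO biodatabase VALUES (1, 'embl'), (2, 'swissprot');
INSERT INTO bioentry VALUES (1, 1, 'X00001'), (2, 2, 'P00001');
INSERT INTO biosequence VALUES (1, 'ACGT');
"""


def is_remote(conn, bioentry_id):
    """Guess that an entry is remote when no local sequence row exists."""
    row = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM biosequence WHERE bioentry_id = ?)",
        (bioentry_id,),
    ).fetchone()
    return row[0] == 0


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA + SAMPLE)
local_entry_remote = is_remote(conn, 1)     # False: sequence is stored locally
skeleton_entry_remote = is_remote(conn, 2)  # True: only the accession is known
```

This is only a heuristic: a missing sequence row says the data is not local, not that a remote server can actually deliver it.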

At least as far as I understand it. Am I missing something?

	-hilmar

On Wednesday, September 18, 2002, at 03:50 PM, Matthew Pocock wrote:

> Hi Hilmar,
>
> If DBXref is replaced with joins between entries, then we lose the 
> ability to link from a biosql-serialized sequence to a remote 
> sequence (e.g. from a biosql image of an embl entry to a 
> web-service image of a swiss-prot entry). Or, imagine you have 
> imported the human division of embl into biosql, how would you 
> represent links to the non-human entries? Perhaps both mechanisms 
> could be used, but then we are providing multiple ways to 
> represent the same information. Mmm. Don't know.
>
> Matthew
>
> Hilmar Lapp wrote:
>> I propose to convert the implicit foreign keys to bioentries 
>> captured as DBXref into explicit foreign keys to Bioentry.
>> This enables integrity checking by the database, fast and simple 
>> joins, and provides for easy annotation/decoration of DBXrefs 
>> (because they'd become no less annotatable than any other Bioentry).
>> At the same time, this change would give rise to a 
>> Bioentry-Bioentry association table, which we need anyway to 
>> represent relationships between bioentries, one of the central 
>> concepts of the database we're building here. These associations 
>> would be typed.
>> To extend this, eventually these relationships will also need to 
>> reference evidence. If anyone has thoughts on this please share. 
>> This is not our most immediate problem though.
>>     -hilmar
>> -- -------------------------------------------------------------
>> Hilmar Lapp                            email: lapp at gnf.org
>> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
>> -------------------------------------------------------------
>
>
> -- BioJava Consulting LTD - Support and training for BioJava
> http://www.biojava.co.uk
>
>
--
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------