[Open-bio-l] BioSQL schema: some questions

Hilmar Lapp hlapp@gnf.org
Fri, 26 Apr 2002 09:29:27 -0700


Cross-posting this one, as the GMOD people probably have extensive experience with solving my problem.

> -----Original Message-----
> From: Hilmar Lapp 
> Sent: Thursday, April 25, 2002 9:07 PM
> To: OBDA BioSQL (E-mail)
> Subject: [Open-bio-l] BioSQL schema: some questions
> 
> 
[...]
> 
> I'm wondering how I would 'correctly' represent a mapping of, 
> e.g., Celera transcripts (Bioentries?) onto the Ensembl assembly.
> 

As another example, one that I would guess has already been solved in Ensembl: how do I store a mapping between a RefSeq sequence and the Ensembl assembly, such that I can query which RefSeq sequences hit which Ensembl transcripts, where, and with what coverage? Similarly for SwissProt, GenPept, Celera transcripts, etc., and as a cross-product of all of these; you get the idea.

It appears that in BioSQL only features can map to something else via locations, where that something else is either a single bioentry or whatever is referenced by Remote_Seqfeature_Name.accession. So features cannot location-map to other features? Would the assembly's contigs then be the bioentries, and a match to, say, a RefSeq sequence be a feature on that contig?
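
To make the idea concrete, here is a rough Python sketch of the layout being asked about: contigs play the role of bioentries, and each match to an external sequence is a feature with a location on that contig. None of these class or function names come from BioSQL or Ensembl; they are hypothetical, the accessions are made up, and transcripts are simplified to single intervals (no exon structure). Coverage is taken here to mean the fraction of the transcript's span overlapped by the match, which is only one possible definition.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Match:
    """A hit of an external sequence, stored as a feature on a contig (hypothetical)."""
    source_db: str   # e.g. "RefSeq", "SwissProt"
    accession: str   # accession of the external sequence (made up below)
    start: int       # 1-based inclusive start on the contig
    end: int         # 1-based inclusive end on the contig


@dataclass(frozen=True)
class Transcript:
    """A transcript on the contig, simplified to one interval."""
    name: str
    start: int
    end: int


@dataclass
class Contig:
    """An assembly contig, standing in for the bioentry."""
    accession: str
    matches: list = field(default_factory=list)
    transcripts: list = field(default_factory=list)


def overlap(a_start, a_end, b_start, b_end):
    """Number of positions shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def hits_on_transcripts(contig, source_db):
    """List (accession, transcript, overlap_start, overlap_end, coverage) rows.

    Coverage is the overlapped fraction of the transcript's span.
    """
    rows = []
    for t in contig.transcripts:
        for m in contig.matches:
            if m.source_db != source_db:
                continue
            shared = overlap(t.start, t.end, m.start, m.end)
            if shared > 0:
                rows.append((
                    m.accession,
                    t.name,
                    max(t.start, m.start),
                    min(t.end, m.end),
                    shared / (t.end - t.start + 1),
                ))
    return rows


# Usage with made-up data: one transcript, one RefSeq hit, one SwissProt hit.
contig = Contig("contig_1")
contig.transcripts.append(Transcript("T1", 101, 200))
contig.matches.append(Match("RefSeq", "NM_hypothetical_1", 151, 250))
contig.matches.append(Match("SwissProt", "P_hypothetical_1", 1, 50))
rows = hits_on_transcripts(contig, "RefSeq")
print(rows)
```

Because both the matches and the transcripts are located on the same contig, the "which hits which, and where" question reduces to an interval overlap on shared coordinates, without features having to location-map onto other features directly.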

I realize I need to look at the Ensembl schema in more detail; maybe the answer is just obvious from there.

	-hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp@gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------