[Open-bio-l] schema change proposal for seqfeature & location

Hilmar Lapp hlapp@gnf.org
Tue, 30 Apr 2002 13:02:20 -0700


[I know I'm cross-posting every email; the one I did not cross-post got moved back to both lists anyway. If the cross-posts annoy you, please let me know and I'll stop immediately; it feels strange anyway :) ]

This is prompted by the possibility of remote locations for seqfeatures. Right now BioSQL has an extra table that holds an implicit FK from a location to a bioentry (to be resolved at run-time). If you model this explicitly, it reduces to a nullable FK from seqfeature_location to bioentry. That makes this part of the schema almost cyclic, since seqfeature already has its own FK to bioentry. In addition, with this approach remote location features are either going to be duplicated or, arguably worse, absent from bioentries to which they do map (because a remote location feature maps to more than one bioentry).
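
To make the "absent" case concrete, here is a rough sketch of the explicit nullable-FK variant described above, using Python's sqlite3 with an in-memory database. The table and column names (remote_bioentry_id, start_pos, end_pos, the accession values) are simplified placeholders chosen for illustration and are not the actual BioSQL definitions.

```python
import sqlite3

# Hypothetical, simplified DDL for the nullable-FK variant; not real BioSQL.
NULLABLE_FK_DDL = """
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    accession   TEXT NOT NULL
);
CREATE TABLE seqfeature (
    seqfeature_id INTEGER PRIMARY KEY,
    bioentry_id   INTEGER NOT NULL REFERENCES bioentry(bioentry_id),
    name          TEXT NOT NULL
);
CREATE TABLE seqfeature_location (
    seqfeature_location_id INTEGER PRIMARY KEY,
    seqfeature_id      INTEGER NOT NULL REFERENCES seqfeature(seqfeature_id),
    remote_bioentry_id INTEGER REFERENCES bioentry(bioentry_id),  -- NULL = local
    start_pos          INTEGER NOT NULL,
    end_pos            INTEGER NOT NULL
);
"""


def build_nullable_fk_demo():
    """One feature owned by contigA with a second, remote location on contigB."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(NULLABLE_FK_DDL)
    conn.executemany("INSERT INTO bioentry VALUES (?, ?)",
                     [(1, "contigA"), (2, "contigB")])
    conn.execute("INSERT INTO seqfeature VALUES (1, 1, 'gene_x')")
    conn.executemany(
        "INSERT INTO seqfeature_location "
        "(seqfeature_id, remote_bioentry_id, start_pos, end_pos) "
        "VALUES (?, ?, ?, ?)",
        [(1, None, 900, 1000),  # local part, on the owning bioentry
         (1, 2, 1, 150)],       # remote part, on contigB
    )
    return conn


def features_owned_by(conn, accession):
    """Look up features through seqfeature.bioentry_id only."""
    rows = conn.execute(
        "SELECT f.name FROM seqfeature f "
        "JOIN bioentry b ON b.bioentry_id = f.bioentry_id "
        "WHERE b.accession = ? ORDER BY f.name",
        (accession,),
    ).fetchall()
    return [r[0] for r in rows]
```

In this toy setup, features_owned_by(conn, "contigA") finds gene_x, but the same lookup for "contigB" finds nothing, even though part of gene_x lies on contigB. Getting it to show up there means either storing the feature a second time or always taking a second path through the location table.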

Note that if bioentries are contigs, features can easily become remote location features whenever they span contig boundaries.

To avoid all this, I'd vote for the following:

1) Remove the FK to bioentry from seqfeature.
2) Add a NOT NULL FK from seqfeature_location to bioentry.
3) In addition, give seqfeature_location a type.
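
As a rough sketch only, the three changes above might look like the following, again using Python's sqlite3 with an in-memory database. All table and column names are simplified placeholders rather than the real BioSQL definitions, and the plain-text location_type column is an assumption about what "typed" could mean; a FK to a term table would be another option.

```python
import sqlite3

# Hypothetical, simplified DDL for the proposal; not the real BioSQL schema.
PROPOSED_DDL = """
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    accession   TEXT NOT NULL
);
CREATE TABLE seqfeature (              -- 1) no FK to bioentry any more
    seqfeature_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE seqfeature_location (
    seqfeature_location_id INTEGER PRIMARY KEY,
    seqfeature_id INTEGER NOT NULL REFERENCES seqfeature(seqfeature_id),
    bioentry_id   INTEGER NOT NULL REFERENCES bioentry(bioentry_id),  -- 2)
    location_type TEXT NOT NULL,       -- 3) assumed representation of the type
    start_pos     INTEGER NOT NULL,
    end_pos       INTEGER NOT NULL
);
"""


def build_demo():
    """One feature whose two locations sit on two different contigs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(PROPOSED_DDL)
    conn.executemany("INSERT INTO bioentry VALUES (?, ?)",
                     [(1, "contigA"), (2, "contigB")])
    conn.execute("INSERT INTO seqfeature VALUES (1, 'gene_x')")
    conn.executemany(
        "INSERT INTO seqfeature_location "
        "(seqfeature_id, bioentry_id, location_type, start_pos, end_pos) "
        "VALUES (?, ?, ?, ?, ?)",
        [(1, 1, "exon", 900, 1000), (1, 2, "exon", 1, 150)],
    )
    return conn


def features_on(conn, accession):
    """Every feature with at least one location on the given bioentry."""
    rows = conn.execute(
        "SELECT DISTINCT f.name FROM seqfeature f "
        "JOIN seqfeature_location l ON l.seqfeature_id = f.seqfeature_id "
        "JOIN bioentry b ON b.bioentry_id = l.bioentry_id "
        "WHERE b.accession = ? ORDER BY f.name",
        (accession,),
    ).fetchall()
    return [r[0] for r in rows]
```

With this layout a feature is stored once and is reachable from every bioentry it maps to, at the cost of always going through seqfeature_location to find the features of a bioentry.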

Anyone else willing to be persuaded into this?

My gut feeling tells me that if I implement this in our own schema, I may no longer be able to map cleanly back to BioSQL. I'm not sure, though.

	-hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp@gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------