[BioSQL-l] Seqfeature_Source

Hilmar Lapp hlapp@gnf.org
Thu, 19 Sep 2002 11:05:58 -0700


On Thursday, September 19, 2002, at 06:29 AM, Thomas Down wrote:

> On Wed, Sep 18, 2002 at 11:53:39AM -0700, Hilmar Lapp wrote:
>>
>
>>> A couple of notes:
>>>
>>>   - I'm not hard nosed about this, and am open to persuasion
>>>     (especially if it really does make things run a lot faster).
>>>
>>
>> My point is not only about this being potentially faster. My point
>> was more targeted at the present design being non-intuitive, and not
>> justified by any of the Bio* object models AFAIK.
>
> Do you expect many people to be writing code which talks directly
> to BioSQL?

Yes and no. First, I believe every use case other than round-tripping
eventually calls for the ability to issue more complex queries. What
the adaptors allow you to do is not necessarily going to be enough,
or fast enough. Data mining through adaptor calls is a nice idea,
but the adaptors will need to evolve quite a way before that works.
Until then, you'll want to write and test SQL queries.
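
As an illustration of the kind of query meant here, the following is a
minimal sketch that counts features per feature-key term. It assumes a
heavily simplified two-table layout; the column names key_term_id and
term_name are hypothetical and are not taken from the actual BioSQL
DDL. SQLite (via Python) is used only so that the sketch can be run as
is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical stand-ins for the real BioSQL tables.
CREATE TABLE ontology_term (
    ontology_term_id INTEGER PRIMARY KEY,
    term_name        TEXT NOT NULL UNIQUE
);
CREATE TABLE seqfeature (
    seqfeature_id INTEGER PRIMARY KEY,
    key_term_id   INTEGER NOT NULL REFERENCES ontology_term (ontology_term_id)
);
""")
conn.executemany("INSERT INTO ontology_term VALUES (?, ?)",
                 [(1, "CDS"), (2, "exon")])
conn.executemany("INSERT INTO seqfeature VALUES (?, ?)",
                 [(1, 1), (2, 1), (3, 2)])

# An aggregate over all features: one row per term, most frequent first.
counts = conn.execute("""
    SELECT t.term_name, COUNT(*) AS n
    FROM seqfeature f
    JOIN ontology_term t ON t.ontology_term_id = f.key_term_id
    GROUP BY t.term_name
    ORDER BY n DESC, t.term_name
""").fetchall()
print(counts)
```

A query like this is a single statement in SQL, whereas going through
an adaptor would typically mean loading every feature object first.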

Counter-intuitiveness also hits people who (want to) join adaptor
code development (like myself :) ...

The whole thing is not finished yet, is it, and it won't be for a long
while ... I think it can only help if the schema itself has as shallow
a learning curve as possible, so that the activation barrier for
people to jump in is no higher than it has to be ... And people will
come from the object model of one of the Bio* projects.

>   Even with the seqfeature_source inlined into that
> table, you still need to perform several other joins to get
> anything useful out of it.  My assumption has always been that
> people will query BioSQL through one of a small number of
> adaptor libraries.
>
> (Another approach to making BioSQL user-friendly is the set of
> GFF-ish views which Chris Mungall came up with)
>

True.

>> That's exactly what I thought of converting it to. If not an
>> attribute, then I think this is what it should be.
>>
>> The only disadvantage implementation-wise is that then you have two
>> FKs from seqfeature to ontology_term, hence you have to name them
>> differently and auto-determining their name is not as standard
>> anymore. I'd like to get the whole thing working before that
>> change ...
>
> Would it be such a bad thing?
>
> There's already at least one case in BioSQL (seqfeature_relationship)
> of a table having more than 1 FK to the same table.  Is breaking
> the naming convention such a bad thing

Absolutely not. I'm just in the process of trying to get minimal
functionality working, and it is perfectly clear to me that at some
point solving 2 FKs to the same entity is unavoidable ... I was rather
making a plea to be allowed to make this an attribute as an interim
solution, to make my life easier in the short term ...
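
For reference, the two-FK variant under discussion could look roughly
like the sketch below. The column names key_term_id and source_term_id
are hypothetical, chosen only to show that each FK has to be named
after its role once both point at ontology_term; each carries an
explicit REFERENCES clause. SQLite (via Python) is used only to keep
the sketch runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE ontology_term (
    ontology_term_id INTEGER PRIMARY KEY,
    term_name        TEXT NOT NULL UNIQUE
);
-- Hypothetical column names: two FKs to the same table cannot both be
-- called ontology_term_id, so each one is named after its role.
CREATE TABLE seqfeature (
    seqfeature_id  INTEGER PRIMARY KEY,
    key_term_id    INTEGER NOT NULL REFERENCES ontology_term (ontology_term_id),
    source_term_id INTEGER NOT NULL REFERENCES ontology_term (ontology_term_id)
);
""")
conn.executemany("INSERT INTO ontology_term VALUES (?, ?)",
                 [(1, "CDS"), (2, "EMBL")])
conn.execute("INSERT INTO seqfeature VALUES (1, 1, 2)")

# Resolving both roles means joining ontology_term twice under aliases.
row = conn.execute("""
    SELECT k.term_name, s.term_name
    FROM seqfeature f
    JOIN ontology_term k ON k.ontology_term_id = f.key_term_id
    JOIN ontology_term s ON s.ontology_term_id = f.source_term_id
""").fetchone()
print(row)
```

The cost is that code which derives FK column names from the
referenced table's name needs a special case for this table.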


> , especially if it's clarified
> by an explicit REFERENCES constraint in the DDL?
>
>> Another possibility is to turn the ontology_term - seqfeature FK
>> into an association.
>
> What do you mean by this?  Are you suggesting having a bridge
> table such that any seqfeature can be associated with any number
> of ontology_terms?

Yes.

>   In principle, that might not be such a bad
> idea, but I don't think it fits the object models for any of
> the current Bio* projects.  It's a little closer to some proposed
> stuff for BioJava2, but still not quite the same...
>

What it would basically reflect is that SeqFeatures are annotatable, 
too. Not with objects though, but with terms.
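
A minimal sketch of such an association, assuming a hypothetical
bridge table called seqfeature_term (neither the table nor its columns
are part of the current schema), with SQLite via Python used only to
keep it runnable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ontology_term (
    ontology_term_id INTEGER PRIMARY KEY,
    term_name        TEXT NOT NULL UNIQUE
);
CREATE TABLE seqfeature (
    seqfeature_id INTEGER PRIMARY KEY
);
-- Hypothetical bridge table: one row per (feature, term) association,
-- so a feature can carry any number of terms.
CREATE TABLE seqfeature_term (
    seqfeature_id    INTEGER NOT NULL REFERENCES seqfeature (seqfeature_id),
    ontology_term_id INTEGER NOT NULL REFERENCES ontology_term (ontology_term_id),
    PRIMARY KEY (seqfeature_id, ontology_term_id)
);
""")
conn.executemany("INSERT INTO ontology_term VALUES (?, ?)",
                 [(1, "CDS"), (2, "EMBL"), (3, "curated")])
conn.executemany("INSERT INTO seqfeature VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO seqfeature_term VALUES (?, ?)",
                 [(1, 1), (1, 3), (2, 2)])

# All terms attached to feature 1, in term-id order.
terms = [r[0] for r in conn.execute("""
    SELECT t.term_name
    FROM seqfeature_term st
    JOIN ontology_term t ON t.ontology_term_id = st.ontology_term_id
    WHERE st.seqfeature_id = ?
    ORDER BY t.ontology_term_id
""", (1,))]
print(terms)
```

The composite primary key keeps the same term from being attached to
the same feature twice.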

Aren't seqfeatures annotatable in BioJava?

	-hilmar
--
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------