[BioSQL-l] UK definition for seqfeature

Hilmar Lapp hlapp at gnf.org
Tue Mar 18 15:33:27 EST 2003


type_term_id is supposed to reference an SO term. source is supposed to
denote the 'method' (BLAST, BLAT, sim4, genewise, whatnot), as far as I
understand it. When reading the features from a GenBank feature table,
assigning 'Genbank/EMBL/Swissprot' as the source (which is what the
genbank, embl, and swissprot parsers do in bioperl) is maybe stretching
the definition, but I don't have anything substantially better to offer.
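
To make the type/source distinction concrete, here is a minimal sketch of
pulling both values out of one GFF line. It assumes the usual tab-separated
layout, where column 2 is the source and column 3 is the feature type. The
sample line, the GffFeature class and parse_gff_line are hypothetical names
made up for illustration, not part of BioSQL or bioperl.

```python
from typing import NamedTuple


class GffFeature(NamedTuple):
    """First five GFF columns (hypothetical helper type)."""
    seqid: str
    source: str   # the 'method'; would map to source_term_id
    type: str     # the feature type; would map to type_term_id
    start: int
    end: int


def parse_gff_line(line: str) -> GffFeature:
    """Split one tab-separated GFF line and keep the first five columns."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 5:
        raise ValueError("expected at least 5 tab-separated columns")
    seqid, source, ftype, start, end = cols[:5]
    return GffFeature(seqid, source, ftype, int(start), int(end))


# Made-up sample line: a genewise prediction of an exon.
feature = parse_gff_line("chr1\tgenewise\texon\t1300\t1500\t.\t+\t.\tID=exon1")
print(feature.source, feature.type)
```

Here 'genewise' is the source and 'exon' is the type. A feature read from a
GenBank feature table has no such method, which is why the parsers fall back
on the name of the database format.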

BTW there is no sensible default here. A term cannot have an empty name 
(once you step outside the neat MySQL world). If you want to press hard 
for a default, you could use 'unknown' or 'anonymous' (in an ontology 
of feature sources - hmm...)
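
A minimal sketch of why the nullable column matters for the compound unique
key discussed in the quoted message below. It uses Python's sqlite3 module
with an in-memory database purely as a stand-in; other RDBMSs may treat
NULLs in unique keys differently, which is exactly the portability problem.
The cut-down tables and the term id 99 for an 'unknown' source are
assumptions for illustration, not the actual BioSQL schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Cut-down seqfeature table with a nullable source_term_id.
# "rank" is quoted because it is a reserved word in some RDBMSs.
conn.execute('''
    CREATE TABLE seqfeature_nullable (
        bioentry_id    INTEGER NOT NULL,
        type_term_id   INTEGER NOT NULL,
        source_term_id INTEGER,
        "rank"         INTEGER NOT NULL,
        UNIQUE (bioentry_id, type_term_id, source_term_id, "rank")
    )
''')

# Two identical rows with a NULL source: SQLite accepts both, because
# NULLs are treated as distinct within a UNIQUE constraint.
for _ in range(2):
    conn.execute("INSERT INTO seqfeature_nullable VALUES (1, 10, NULL, 1)")
nullable_rows = conn.execute(
    "SELECT COUNT(*) FROM seqfeature_nullable"
).fetchone()[0]

# Same table with source_term_id NOT NULL and an explicit 'unknown' term.
conn.execute('''
    CREATE TABLE seqfeature_strict (
        bioentry_id    INTEGER NOT NULL,
        type_term_id   INTEGER NOT NULL,
        source_term_id INTEGER NOT NULL,
        "rank"         INTEGER NOT NULL,
        UNIQUE (bioentry_id, type_term_id, source_term_id, "rank")
    )
''')
UNKNOWN_SOURCE_TERM_ID = 99  # hypothetical id of an 'unknown' source term

insert_strict = "INSERT INTO seqfeature_strict VALUES (1, 10, ?, 1)"
conn.execute(insert_strict, (UNKNOWN_SOURCE_TERM_ID,))
try:
    conn.execute(insert_strict, (UNKNOWN_SOURCE_TERM_ID,))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(nullable_rows, duplicate_rejected)  # prints "2 True"
```

With the nullable column the unique key silently lets the duplicate through;
with NOT NULL plus a real 'unknown' term the second insert is rejected.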

	-hilmar

On Tuesday, March 18, 2003, at 04:42  AM, Aaron J Mackey wrote:

>
> Of course some will argue that type_term and source_term should really
> just be qualifier values, in a data-format agnostic viewpoint (and I've
> never really understood the whole type/source dichotomy; my GFF files
> are almost always wrong).
>
> But I have no objection to making it NOT NULL DEFAULT "" (term as well).
>
> -Aaron
>
> On Mon, 17 Mar 2003, Hilmar Lapp wrote:
>
>> There is currently a compound unique key defined on seqfeature
>> comprising (bioentry_id,type_term_id,source_term_id,rank).
>>
>> A while ago all but the first were nullable columns, with the
>> well-known problems of different behaviour between RDBMSs. The
>> situation is much more clearly defined now, as only source_term_id is
>> left as nullable.
>>
>> Can anyone name reasons why we should leave this column nullable? I.e.,
>> can people think of sensible use cases where you would not have a
>> source_tag for a feature?
>>
>> I believe even in GFF source is mandatory ...
>>
>> 	-hilmar
>>
>>
>
> -- 
>  Aaron J Mackey
>  Pearson Laboratory
>  University of Virginia
>  (434) 924-2821
>  amackey at virginia.edu
>
>
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


