[BioSQL-l] a biosql/biojavax localization question

Hilmar Lapp hlapp at gmx.net
Wed Jul 5 12:47:05 UTC 2006


Alright - but was a nice try, no?

On Jul 5, 2006, at 5:51 AM, Richard Holland wrote:

> I think you should create it as you are the only one at present who
> knows what is already planned and what is not! :)
>
> cheers,
> Richard
>
> On Wed, 2006-07-05 at 00:04 -0400, Hilmar Lapp wrote:
>> On Jul 4, 2006, at 4:13 AM, Richard Holland wrote:
>>
>>> Personally I'd like to see *_qualifier_value tables for all BioSQL
>>> tables that represent an entity of any kind, be it term, feature,
>>> location, sequence, taxon, or anything else.
>>
>> I can see that making sense. Basically what it would say is that
>> every entity in BioSQL is derivable, as opposed to final, in an OO
>> sense.
>>
>> In fact, there aren't many entities that don't have a qualifier_value
>> association table yet. Adding one for biodatabase would have been in
>> my book of 1.1 changes as I use it in SymAtlas already.
>>
>>>
>>>
>>> In the case of is_taxon_hidden, this is specific to an individual taxon,
>>> and I can see cases where it would be appropriate to search by it (for
>>> instance, pulling out all ancestors of a given taxon that are visible).
>>> So I think this should be an additional column.
>>
>> I would like to ask a systematist about that. I have not seen it anywhere
>> else in a taxonomy other than NCBI's. I'm not convinced it's a good
>> idea to elevate NCBI's (or anybody else's) idiosyncrasies to columns
>> in the Bio* persistence interface.
>>
>>>
>>> By the way, is there a document somewhere detailing all the changes that
>>> are planned for 1.1?
>>
>> No, not yet. Good point though. Volunteers for starting one are
>> welcome ... :-)
>>
>> 	-hilmar
>>
>>
>>>
>>> cheers,
>>> Richard
>>>
>>>
>>> On Mon, 2006-07-03 at 14:07 -0400, Hilmar Lapp wrote:
>>>> Hi David, I wish I were in the south of France soaking up sun ...
>>>> although there is no shortage of sun (or heat for that matter, and
>>>> throw humidity in there too) where I am.
>>>>
>>>> Is_Circular is a general attribute that will apply to any sequence
>>>> (given the fact that many sequences are indeed circular). This, and
>>>> the fact that one may even want to search for it, would justify
>>>> inclusion directly as a column in the biosequence table.
>>>>
>>>> Is_Taxon_Hidden is one of those attributes that BioSQL by design
>>>> handles through attribute/value associations, that is, using ontology
>>>> term associations that have a value (the term is the attribute name).
>>>>
>>>> However, there is no taxon_qualifier_value table in BioSQL, so in
>>>> essence you are asking for adding that table.
>>>>
>>>> Does anybody else have ideas for taxon attributes for which this
>>>> table may be used?
>>>>
>>>> I don't really favor a proliferation of 'localized' versions of
>>>> BioSQL - this tends to defeat both the rationale behind a
>>>> standardized persistence interface and the design of the schema for
>>>> ultimate extensibility through weak typing and the use of controlled
>>>> vocabularies.
>>>>
>>>> Any thoughts to this end welcome.
>>>>
>>>> 	-hilmar
>>>>
>>>> On Jul 3, 2006, at 1:55 PM, David Scott wrote:
>>>>
>>>>> sure hilmar-
>>>>>
>>>>> in the genbank taxonomy file - nodes.dmp:
>>>>> ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump_readme.txt
>>>>> there is a field:
>>>>>
>>>>> GenBank hidden flag (1 or 0)            -- 1 if name is suppressed
>>>>> in GenBank entry lineage
>>>>>
>>>>> this field controls whether the level is included in the taxonomy
>>>>> hierarchy when the genbank ORGANISM section is generated - but the
>>>>> more general problem trying to be solved is:
>>>>> o parse genbank entries
>>>>> o store parsed entry in biosql
>>>>> o pull parsed entry from biosql
>>>>> o (re)create the genbank entry
>>>>> o compare the recreated entry with the source document for
>>>>> identity. well - ok - almost identical.
>>>>>
>>>>> there are several parameters missing from biosql to make this
>>>>> possible. the general approach to a solution has been:
>>>>> o alter the biosql table to add a new column (a sql ddl file)
>>>>> o add a private get/set for the column in the biojavax object (a
>>>>> java file)
>>>>> o add the column to the biojavax hibernate o/r mapping (an xml file)
>>>>>
>>>>> to help others that might have the same objective, and to
>>>>> accommodate those that don't wish these nonstandard columns - it is
>>>>> planned to release the o/r mapping files with the additional
>>>>> columns/fields commented out - these xml files along with the java
>>>>> files are checked out with cvs. it was not clear what to do with
>>>>> the ddl files - and it would be helpful to have them reviewed - no
>>>>> matter what is done with them.
>>>>>
>>>>> thanks for helping me - i just assumed you were late in responding
>>>>> because it is summer - and, well - you were in the south of
>>>>> france soaking up the sun.
>>>>>
>>>>> looking to you for suggestions-
>>>>> david
>>>>>
>>>>>
>>>>> Hilmar Lapp wrote:
>>>>>> Hi David, sorry for dropping (or rather, not ever picking up) the
>>>>>> ball on this ... got lost in inbox stack.
>>>>>>
>>>>>> The earlier consensus was, if I recall correctly, to include
>>>>>> is_circular as a biosequence attribute in the 1.1 version.
>>>>>>
>>>>>> isTaxonHidden is new to me and I don't even understand what it
>>>>>> would mean. Can you elaborate?
>>>>>>
>>>>>>     -hilmar
>>>>>>
>>>>>> On Jun 21, 2006, at 11:19 AM, David Scott wrote:
>>>>>>
>>>>>>> biojavax is using hibernate to o/r map the biosql database to biojavax
>>>>>>> objects. biojavax is planning support in the biojavax objects for fields
>>>>>>> not directly supported in the biosql database (e.g. isCircular,
>>>>>>> isTaxonHidden). in order to conform to the current biosql database, the
>>>>>>> default mapping file from biosql to biojavax will comment out the
>>>>>>> unsupported fields (so the object fields will not be initialized) and
>>>>>>> the objects will default to an appropriate conforming value (e.g. false
>>>>>>> for isCircular and isTaxonHidden). for users wishing to localize biojavax:
>>>>>>> the user would uncomment the mapping file and alter the database tables.
>>>>>>> altering the database would require running ddl on the existing database
>>>>>>> to create the new table columns. what is the best way to review and then
>>>>>>> distribute the alter/create ddl for users to localize their database?
>>>>>>>
>>>>>>
>>>>>> --===========================================================
>>>>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>>>>> ===========================================================
>>>>>>
>>>>>
>>>>
>>> -- 
>>> Richard Holland (BioMart Team)
>>> EMBL-EBI
>>> Wellcome Trust Genome Campus
>>> Hinxton
>>> Cambridge CB10 1SD
>>> UNITED KINGDOM
>>> Tel: +44-(0)1223-494416
>>>
>>
> -- 
> Richard Holland (BioMart Team)
> EMBL-EBI
> Wellcome Trust Genome Campus
> Hinxton
> Cambridge CB10 1SD
> UNITED KINGDOM
> Tel: +44-(0)1223-494416
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================
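
The attribute/value approach discussed in the thread above can be sketched
roughly as follows. This is only an illustration under stated assumptions:
the taxon and term tables are heavily simplified stand-ins rather than the
real BioSQL DDL, taxon_qualifier_value is the hypothetical table proposed in
the thread (modelled on the existing *_qualifier_value tables), and the
sample lineage and its hidden flag are invented. It uses Python's sqlite3
module purely so the DDL and the "visible ancestors" search can be run as is.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with simplified, illustrative tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Simplified stand-ins for BioSQL tables (not the real DDL).
        CREATE TABLE taxon (
            taxon_id        INTEGER PRIMARY KEY,
            parent_taxon_id INTEGER REFERENCES taxon(taxon_id),
            name            TEXT NOT NULL
        );
        CREATE TABLE term (
            term_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL UNIQUE
        );
        -- Hypothetical table: the term is the attribute name and
        -- value holds the attribute value for one taxon.
        CREATE TABLE taxon_qualifier_value (
            taxon_id INTEGER NOT NULL REFERENCES taxon(taxon_id),
            term_id  INTEGER NOT NULL REFERENCES term(term_id),
            value    TEXT,
            rank     INTEGER NOT NULL DEFAULT 0,
            PRIMARY KEY (taxon_id, term_id, rank)
        );
    """)
    # Toy lineage; which node is hidden is invented for illustration.
    conn.executemany("INSERT INTO taxon VALUES (?, ?, ?)", [
        (1, None, "root"),
        (2, 1, "cellular organisms"),
        (3, 2, "Eukaryota"),
        (4, 3, "Homo sapiens"),
    ])
    conn.execute("INSERT INTO term VALUES (1, 'is_taxon_hidden')")
    conn.execute("INSERT INTO taxon_qualifier_value VALUES (2, 1, '1', 0)")
    return conn


def visible_ancestors(conn, taxon_id):
    """Return ancestor names, root first, skipping taxa flagged as hidden."""
    rows = conn.execute("""
        WITH RECURSIVE lineage(taxon_id, parent_taxon_id, name, depth) AS (
            SELECT taxon_id, parent_taxon_id, name, 0
            FROM taxon WHERE taxon_id = ?
            UNION ALL
            SELECT t.taxon_id, t.parent_taxon_id, t.name, l.depth + 1
            FROM taxon t JOIN lineage l ON t.taxon_id = l.parent_taxon_id
        )
        SELECT l.name FROM lineage l
        WHERE l.depth > 0
          AND NOT EXISTS (
              SELECT 1 FROM taxon_qualifier_value q
              JOIN term tm ON tm.term_id = q.term_id
              WHERE q.taxon_id = l.taxon_id
                AND tm.name = 'is_taxon_hidden'
                AND q.value = '1'
          )
        ORDER BY l.depth DESC
    """, (taxon_id,)).fetchall()
    return [name for (name,) in rows]
```

With a dedicated column instead, the NOT EXISTS subquery would reduce to a
plain filter on that column. That is the trade-off raised in the thread: the
column is simpler to search, while the qualifier table leaves the taxon table
untouched and can hold other attributes through controlled-vocabulary terms.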