[BioSQL-l] a biosql/biojavax localization question
Hilmar Lapp
hlapp at gmx.net
Wed Jul 5 04:04:12 UTC 2006
On Jul 4, 2006, at 4:13 AM, Richard Holland wrote:
> Personally I'd like to see *_qualifier_value tables for all BioSQL
> tables that represents an entity of any kind, be it term, feature,
> location, sequence, taxon, or anything else.
I can see that making sense. Basically what it would say is that
every entity in BioSQL is derivable, as opposed to final, in an OO
sense.
In fact, there aren't many entities that don't have a qualifier_value
association table yet. Adding one for biodatabase would have been in
my book of 1.1 changes as I use it in SymAtlas already.
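
To make the idea concrete, here is a minimal sketch of what a taxon_qualifier_value table could look like, loosely modeled on the existing *_qualifier_value association tables. Everything in it is an assumption for illustration: the table does not exist in BioSQL today, the taxon and term tables are cut down to a few columns, the term name "genbank_hidden" is made up, and SQLite is used only so the example runs on its own.

```python
import sqlite3

# Simplified stand-ins for the BioSQL taxon and term tables; the real
# schema has more columns and constraints.
SCHEMA = """
CREATE TABLE taxon (
    taxon_id        INTEGER PRIMARY KEY,
    parent_taxon_id INTEGER
);
CREATE TABLE term (
    term_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
-- Hypothetical table, not part of the current BioSQL schema.
CREATE TABLE taxon_qualifier_value (
    taxon_id INTEGER NOT NULL REFERENCES taxon (taxon_id),
    term_id  INTEGER NOT NULL REFERENCES term (term_id),
    value    TEXT,
    rank     INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (taxon_id, term_id, rank)
);
"""


def set_taxon_qualifier(conn, taxon_id, term_name, value, rank=0):
    """Attach an attribute/value pair to a taxon; the term is the attribute name."""
    conn.execute("INSERT OR IGNORE INTO term (name) VALUES (?)", (term_name,))
    (term_id,) = conn.execute(
        "SELECT term_id FROM term WHERE name = ?", (term_name,)
    ).fetchone()
    conn.execute(
        "INSERT OR REPLACE INTO taxon_qualifier_value "
        "(taxon_id, term_id, value, rank) VALUES (?, ?, ?, ?)",
        (taxon_id, term_id, value, rank),
    )


def get_taxon_qualifiers(conn, taxon_id):
    """Return {attribute name: value} for one taxon."""
    rows = conn.execute(
        "SELECT t.name, q.value FROM taxon_qualifier_value q "
        "JOIN term t ON t.term_id = q.term_id "
        "WHERE q.taxon_id = ? ORDER BY t.name, q.rank",
        (taxon_id,),
    )
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO taxon (taxon_id) VALUES (1)")  # made-up taxon
set_taxon_qualifier(conn, 1, "genbank_hidden", "1")      # made-up term name
print(get_taxon_qualifiers(conn, 1))
```

The point of the design is that a new taxon attribute needs only a new term row, not a schema change.
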
>
>
> In the case of is_taxon_hidden, this is specific to an individual
> taxon,
> and I can see cases where it would be appropriate to search by it (for
> instance, pulling out all ancestors of a given taxon that are
> visible).
> So I think this should be an additional column.
I would like to ask a systematist about that. I have not seen it anywhere
else in a taxonomy other than NCBI's. I'm not convinced it's a good
idea to elevate NCBI's (or anybody else's) idiosyncrasies to columns
in the Bio* persistence interface.
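
For what it's worth, the search use case does not strictly require a dedicated column. The sketch below pulls the visible ancestors of a taxon by filtering on an attribute/value association instead. It rests on several assumptions: a hypothetical taxon_qualifier_value table, a made-up term name "genbank_hidden", taxon names stored directly on a simplified taxon table, and a recursive query over parent_taxon_id chosen for brevity, run here on SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-ins; the name column on taxon is a shortcut for this example.
CREATE TABLE taxon (taxon_id INTEGER PRIMARY KEY, parent_taxon_id INTEGER, name TEXT);
CREATE TABLE term (term_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
-- Hypothetical table, not part of the current BioSQL schema.
CREATE TABLE taxon_qualifier_value (
    taxon_id INTEGER, term_id INTEGER, value TEXT, rank INTEGER DEFAULT 0
);
-- Made-up four-level lineage in which taxon 2 is flagged as hidden.
INSERT INTO taxon VALUES
    (1, NULL, 'root'), (2, 1, 'hidden group'), (3, 2, 'genus'), (4, 3, 'species');
INSERT INTO term VALUES (10, 'genbank_hidden');
INSERT INTO taxon_qualifier_value (taxon_id, term_id, value) VALUES (2, 10, '1');
""")

VISIBLE_ANCESTORS = """
WITH RECURSIVE lineage(taxon_id, parent_taxon_id, depth) AS (
    SELECT taxon_id, parent_taxon_id, 0 FROM taxon WHERE taxon_id = ?
    UNION ALL
    SELECT t.taxon_id, t.parent_taxon_id, l.depth + 1
    FROM taxon t JOIN lineage l ON t.taxon_id = l.parent_taxon_id
)
SELECT t.name
FROM lineage l JOIN taxon t ON t.taxon_id = l.taxon_id
WHERE l.depth > 0
  AND NOT EXISTS (
      SELECT 1
      FROM taxon_qualifier_value q JOIN term tm ON tm.term_id = q.term_id
      WHERE q.taxon_id = l.taxon_id
        AND tm.name = 'genbank_hidden'
        AND q.value = '1'
  )
ORDER BY l.depth DESC
"""


def visible_ancestors(conn, taxon_id):
    """Names of ancestors not flagged as hidden, ordered from the root down."""
    return [name for (name,) in conn.execute(VISIBLE_ANCESTORS, (taxon_id,))]


print(visible_ancestors(conn, 4))
```

A taxon with no "genbank_hidden" row counts as visible here, so taxonomies that have no such flag need no extra data.
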
>
> By the way, is there a document somewhere detailing all the changes
> that
> are planned for 1.1?
No, not yet. Good point though. Volunteers for starting one are
welcome ... :-)
-hilmar
>
> cheers,
> Richard
>
>
> On Mon, 2006-07-03 at 14:07 -0400, Hilmar Lapp wrote:
>> Hi David, I wish I were in the south of France soaking up sun ...
>> although there is no shortage of sun (or heat for that matter, and
>> throw humidity in there too) where I am.
>>
>> Is_Circular is a general attribute that will apply to any sequence
>> (given the fact that many sequences are indeed circular). This, and
>> the fact that one may even want to search for it, would justify
>> inclusion directly as a column in the biosequence table.
>>
>> Is_Taxon_Hidden is one of those attributes that BioSQL by design
>> handles through attribute/value associations, that is, using ontology
>> term associations that have a value (the term is the attribute name).
>>
>> However, there is no taxon_qualifier_value table in BioSQL, so in
>> essence you are asking for adding that table.
>>
>> Does anybody else have ideas for taxon attributes for which this
>> table may be used?
>>
>> I don't really favor a proliferation of 'localized' versions of
>> BioSQL - this tends to defeat the purpose both of the rationale
>> behind a standardized persistence interface, as well as the design of
>> the schema for ultimate extensibility through weak typing and the use
>> of controlled vocabularies.
>>
>> Any thoughts to this end welcome.
>>
>> -hilmar
>>
>> On Jul 3, 2006, at 1:55 PM, David Scott wrote:
>>
>>> sure hilmar-
>>>
>>> in the genbank taxonomy file - nodes.dmp:
>>> ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump_readme.txt
>>> there is a field:
>>>
>>> GenBank hidden flag (1 or 0) -- 1 if name is suppressed
>>> in GenBank entry lineage
>>>
>>> this field controls whether the level is included in the taxonomy
>>> hierarchy when the genbank ORGANISM section is generated - but the
>>> more general problem trying to be solved is:
>>> o parse genbank entries
>>> o store parsed entry in biosql
>>> o pull parsed entry from biosql
>>> o (re)create the genbank entry
>>> o compare the recreated entry with the source document for
>>> identity. well - ok - almost identical.
>>>
>>> there are several parameters missing from biosql to make this
>>> possible. the general approach to a solution has been:
>>> o alter the biosql table to add a new column (a sql ddl file)
>>> o add a private get/set for the column in the biojavax object (a
>>> java file)
>>> o add the column to the biojavax hibernate o/r mapping (an xml file)
>>>
>>> to help others that might have the same objective, and to
>>> accommodate those that don't wish these nonstandard columns - it is
>>> planned to release the o/r mapping files with the additional
>>> columns/fields commented out - these xml files along with the java
>>> files are checked out with cvs. it was not clear what to do with
>>> the ddl files - and it would be helpful to have them reviewed - no
>>> matter what is done with them.
>>>
>>> thanks for helping me - i just assumed you were late in responding
>>> because it is summer - and, well - you were in the south of
>>> france soaking up the sun.
>>>
>>> looking to you for suggestions-
>>> david
>>>
>>>
>>> Hilmar Lapp wrote:
>>>> Hi David, sorry for dropping (or rather, not ever picking up) the
>>>> ball on this ... got lost in inbox stack.
>>>>
>>>> The earlier consensus was, if I recall correctly, to include
>>>> is_circular as a biosequence attribute in the 1.1 version.
>>>>
>>>> isTaxonHidden is new to me and I don't even understand what it
>>>> would mean. Can you elaborate?
>>>>
>>>> -hilmar
>>>>
>>>> On Jun 21, 2006, at 11:19 AM, David Scott wrote:
>>>>
>>>>> biojavax is using hibernate to o/r map the biosql database to
>>>>> biojavax
>>>>> objects. biojavax is planning support in the biojavax objects for
>>>>> fields
>>>>> not directly supported in the biosql database (e.g. isCircular,
>>>>> isTaxonHidden). in order to conform to the current biosql
>>>>> database, the
>>>>> default mapping file from biosql to biojavax will comment out the
>>>>> unsupported fields (so the object fields will not be initialized)
>>>>> and
>>>>> the objects will default an appropriate conforming value (e.g.
>>>>> false for
>>>>> isCircular and isTaxonHidden). for users wishing to localize
>>>>> biojavax:
>>>>> the user would uncomment the mapping file and alter the database
>>>>> tables.
>>>>> altering the database would require running ddl on the existing
>>>>> database
>>>>> to create the new table columns. what is the best way to review
>>>>> and then
>>>>> distribute the alter/create ddl for users to localize their
>>>>> database?
>>>>
>>>> --===========================================================
>>>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
>>>> ===========================================================
>>>
>>
> --
> Richard Holland (BioMart Team)
> EMBL-EBI
> Wellcome Trust Genome Campus
> Hinxton
> Cambridge CB10 1SD
> UNITED KINGDOM
> Tel: +44-(0)1223-494416
>
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================