[BioSQL-l] a biosql/biojavax localization question

Richard Holland richard.holland at ebi.ac.uk
Wed Jul 5 09:51:35 UTC 2006


I think you should create it as you are the only one at present who
knows what is already planned and what is not! :)

cheers,
Richard

On Wed, 2006-07-05 at 00:04 -0400, Hilmar Lapp wrote:
> On Jul 4, 2006, at 4:13 AM, Richard Holland wrote:
> 
> > Personally I'd like to see *_qualifier_value tables for all BioSQL
> > tables that represent an entity of any kind, be it term, feature,
> > location, sequence, taxon, or anything else.
> 
> I can see that making sense. Basically what it would say is that  
> every entity in BioSQL is derivable, as opposed to final, in an OO  
> sense.
> 
> In fact, there aren't many entities that don't have a qualifier_value  
> association table yet. Adding one for biodatabase would have been in  
> my book of 1.1 changes as I use it in SymAtlas already.
> 
> >
> >
> > In the case of is_taxon_hidden, this is specific to an individual taxon,
> > and I can see cases where it would be appropriate to search by it (for
> > instance, pulling out all ancestors of a given taxon that are visible).
> > So I think this should be an additional column.
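To illustrate the search mentioned above (all visible ancestors of a given taxon), here is a minimal sketch using Python's sqlite3 module. It assumes a simplified stand-in for the taxon table in which parent_taxon_id points at taxon_id, plus a hypothetical is_taxon_hidden column that is not part of the released schema. The sample rows and the function name visible_ancestors are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxon (
    taxon_id        INTEGER PRIMARY KEY,
    parent_taxon_id INTEGER,
    node_rank       TEXT,
    is_taxon_hidden INTEGER NOT NULL DEFAULT 0  -- hypothetical column
);
-- Made-up lineage: 1 <- 2 (hidden) <- 3 <- 4
INSERT INTO taxon VALUES (1, NULL, 'root', 0);
INSERT INTO taxon VALUES (2, 1, 'no rank', 1);
INSERT INTO taxon VALUES (3, 2, 'genus', 0);
INSERT INTO taxon VALUES (4, 3, 'species', 0);
""")

# Walk up the parent links from the given taxon, then keep only the
# ancestors whose hidden flag is 0, ordered from the root downwards.
VISIBLE_ANCESTORS = """
WITH RECURSIVE lineage(taxon_id, parent_taxon_id, is_taxon_hidden, depth) AS (
    SELECT p.taxon_id, p.parent_taxon_id, p.is_taxon_hidden, 1
    FROM taxon c JOIN taxon p ON p.taxon_id = c.parent_taxon_id
    WHERE c.taxon_id = ?
    UNION ALL
    SELECT p.taxon_id, p.parent_taxon_id, p.is_taxon_hidden, l.depth + 1
    FROM lineage l JOIN taxon p ON p.taxon_id = l.parent_taxon_id
)
SELECT taxon_id FROM lineage WHERE is_taxon_hidden = 0 ORDER BY depth DESC
"""

def visible_ancestors(conn, taxon_id):
    """Return the taxon_ids of all non-hidden ancestors, root first."""
    return [row[0] for row in conn.execute(VISIBLE_ANCESTORS, (taxon_id,))]

print(visible_ancestors(conn, 4))
```

With the flag stored as a column, the filter is a plain WHERE clause. If it were stored as an attribute/value association instead, the same query would need an extra join against the association table.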
> 
> I would like to ask a systematist about that. I have not seen it anywhere  
> else in a taxonomy other than NCBI's. I'm not convinced it's a good  
> idea to elevate NCBI's (or anybody else's) idiosyncrasies to columns  
> in the Bio* persistence interface.
> 
> >
> > By the way, is there a document somewhere detailing all the changes that
> > are planned for 1.1?
> 
> No, not yet. Good point though. Volunteers for starting one are  
> welcome ... :-)
> 
> 	-hilmar
> 
> 
> >
> > cheers,
> > Richard
> >
> >
> > On Mon, 2006-07-03 at 14:07 -0400, Hilmar Lapp wrote:
> >> Hi David, I wish I were in the south of France soaking up sun ...
> >> although there is no shortage of sun (or heat for that matter, and
> >> throw humidity in there too) where I am.
> >>
> >> Is_Circular is a general attribute that will apply to any sequence
> >> (given the fact that many sequences are indeed circular). This, and
> >> the fact that one may even want to search for it, would justify
> >> inclusion directly as a column in the biosequence table.
> >>
> >> Is_Taxon_Hidden is one of those attributes that BioSQL by design
> >> handles through attribute/value associations, that is, using ontology
> >> term associations that have a value (the term is the attribute name).
> >>
> >> However, there is no taxon_qualifier_value table in BioSQL, so in
> >> essence you are asking for adding that table.
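As a rough sketch of what such an attribute/value association could look like for taxa, the following uses Python's sqlite3 module. The taxon_qualifier_value table is hypothetical (the message above notes it does not exist in BioSQL), its columns are an assumption loosely modeled on the idea of a term acting as the attribute name, and the taxon and term tables are cut down to the few columns needed here. The helper names set_qualifier and get_qualifier are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-ins for the real tables.
CREATE TABLE taxon (taxon_id INTEGER PRIMARY KEY, ncbi_taxon_id INTEGER);
CREATE TABLE term  (term_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);

-- Hypothetical association table: the term is the attribute name.
CREATE TABLE taxon_qualifier_value (
    taxon_id INTEGER NOT NULL REFERENCES taxon(taxon_id),
    term_id  INTEGER NOT NULL REFERENCES term(term_id),
    value    TEXT NOT NULL,
    rank     INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (taxon_id, term_id, rank)
);
INSERT INTO taxon (taxon_id, ncbi_taxon_id) VALUES (1, 1001), (2, 1002);
""")

def set_qualifier(conn, taxon_id, name, value):
    """Attach a named attribute value to a taxon, creating the term if needed."""
    conn.execute("INSERT OR IGNORE INTO term (name) VALUES (?)", (name,))
    term_id = conn.execute(
        "SELECT term_id FROM term WHERE name = ?", (name,)
    ).fetchone()[0]
    conn.execute(
        "INSERT OR REPLACE INTO taxon_qualifier_value (taxon_id, term_id, value) "
        "VALUES (?, ?, ?)",
        (taxon_id, term_id, value),
    )

def get_qualifier(conn, taxon_id, name, default=None):
    """Look up a named attribute value for a taxon, or return the default."""
    row = conn.execute(
        "SELECT tqv.value FROM taxon_qualifier_value tqv "
        "JOIN term t ON t.term_id = tqv.term_id "
        "WHERE tqv.taxon_id = ? AND t.name = ? ORDER BY tqv.rank LIMIT 1",
        (taxon_id, name),
    ).fetchone()
    return row[0] if row else default

set_qualifier(conn, 1, "is_taxon_hidden", "1")
print(get_qualifier(conn, 1, "is_taxon_hidden"))
print(get_qualifier(conn, 2, "is_taxon_hidden", default="0"))
```

New taxon attributes then need only a new term row rather than a schema change, which is the extensibility argument made in this message.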
> >>
> >> Does anybody else have ideas for taxon attributes for which this
> >> table may be used?
> >>
> >> I don't really favor a proliferation of 'localized' versions of
> >> BioSQL - this tends to defeat the purpose both of the rationale
> >> behind a standardized persistence interface, as well as the design of
> >> the schema for ultimate extensibility through weak typing and the use
> >> of controlled vocabularies.
> >>
> >> Any thoughts to this end welcome.
> >>
> >> 	-hilmar
> >>
> >> On Jul 3, 2006, at 1:55 PM, David Scott wrote:
> >>
> >>> sure hilmar-
> >>>
> >>> in the genbank taxonomy file - nodes.dmp:
> >>> ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump_readme.txt
> >>> there is a field:
> >>>
> >>> GenBank hidden flag (1 or 0)            -- 1 if name is suppressed
> >>> in GenBank entry lineage
> >>>
> >>> this field controls whether the level is included in the taxonomy
> >>> hierarchy when the genbank ORGANISM section is generated - but the
> >>> more general problem trying to be solved is:
> >>> o parse genbank entries
> >>> o store parsed entry in biosql
> >>> o pull parsed entry from biosql
> >>> o (re)create the genbank entry
> >>> o compare the recreated entry with the source document for
> >>> identity. well - ok - almost identical.
> >>>
> >>> there are several parameters missing from biosql to make this
> >>> possible. the general approach to a solution has been:
> >>> o alter the biosql table to add a new column (a sql ddl file)
> >>> o add a private get/set for the column in the biojavax object (a
> >>> java file)
> >>> o add the column to the biojavax hibernate o/r mapping (an xml file)
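The first step in the list above (altering a table to add a column) might be sketched as follows, using Python's sqlite3 module purely as a test bed. The cut-down biosequence table, the helper name add_column_if_missing, and the choice of an integer flag defaulting to 0 are all assumptions for illustration. The actual ddl under discussion is not shown in this thread, and its syntax would depend on the target database.

```python
import sqlite3

def add_column_if_missing(conn, table, column, ddl_type):
    """Add a column unless it already exists; return True if it was added.

    Table and column names are interpolated directly, so they must come
    from trusted input (here they are hard-coded below).
    """
    # PRAGMA table_info yields one row per column; index 1 is the name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")
    return True

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the biosequence table.
conn.execute("CREATE TABLE biosequence (bioentry_id INTEGER PRIMARY KEY, seq TEXT)")
conn.execute("INSERT INTO biosequence VALUES (1, 'ACGT')")

# Existing rows pick up the default, matching the "false unless told
# otherwise" behaviour described for the unmapped object fields.
added = add_column_if_missing(
    conn, "biosequence", "is_circular", "INTEGER NOT NULL DEFAULT 0"
)
flag = conn.execute(
    "SELECT is_circular FROM biosequence WHERE bioentry_id = 1"
).fetchone()[0]
print(added, flag)
```

Checking for the column first makes the script safe to run more than once, which matters if the same ddl is distributed to users whose databases may or may not already be localized.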
> >>>
> >>> to help others that might have the same objective, and to
> >>> accommodate those that don't wish these nonstandard columns  - it is
> >>> planned to release the o/r mapping files with the additional
> >>> columns/fields commented out - these xml files along with the java
> >>> files are checked out with cvs. it was not clear what to do with
> >>> the ddl files - and it would be helpful to have them reviewed - no
> >>> matter what is done with them.
> >>>
> >>> thanks for helping me - i just assumed you were late in responding
> >>> because it is summer - and, well - you were in the south of
> >>> france soaking up the sun.
> >>>
> >>> looking to you for suggestions-
> >>> david
> >>>
> >>>
> >>> Hilmar Lapp wrote:
> >>>> Hi David, sorry for dropping (or rather, not ever picking up) the
> >>>> ball on this ... got lost in inbox stack.
> >>>>
> >>>> The earlier consensus was, if I recall correctly, to include
> >>>> is_circular as a biosequence attribute in the 1.1 version.
> >>>>
> >>>> isTaxonHidden is new to me and I don't even understand what it
> >>>> would mean. Can you elaborate?
> >>>>
> >>>>     -hilmar
> >>>>
> >>>> On Jun 21, 2006, at 11:19 AM, David Scott wrote:
> >>>>
> >>>>> biojavax is using hibernate to o/r map the biosql database to biojavax
> >>>>> objects. biojavax is planning support in the biojavax objects for fields
> >>>>> not directly supported in the biosql database (e.g. isCircular,
> >>>>> isTaxonHidden). in order to conform to the current biosql database, the
> >>>>> default mapping file from biosql to biojavax will comment out the
> >>>>> unsupported fields (so the object fields will not be initialized) and
> >>>>> the objects will default to an appropriate conforming value (e.g. false
> >>>>> for isCircular and isTaxonHidden). for users wishing to localize biojavax:
> >>>>> the user would uncomment the mapping file and alter the database tables.
> >>>>> altering the database would require running ddl on the existing database
> >>>>> to create the new table columns. what is the best way to review and then
> >>>>> distribute the alter/create ddl for users to localize their database?
> >>>>>
> >>>>
> >>>> --===========================================================
> >>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> >>>> ===========================================================
> >>>>
> >>>
> >>
> > -- 
> > Richard Holland (BioMart Team)
> > EMBL-EBI
> > Wellcome Trust Genome Campus
> > Hinxton
> > Cambridge CB10 1SD
> > UNITED KINGDOM
> > Tel: +44-(0)1223-494416
> >
> 
-- 
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416


