[BioSQL-l] TAXON,TAXON_NAME, was Re: Description

Richard Holland holland at ebi.ac.uk
Wed Sep 12 07:32:42 UTC 2007


We use the taxon table to store taxon information. See:

http://biojava.org/wiki/BioJava:BioJavaXDocs#NCBI_Taxonomy.
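
For anyone who wants to poke at the two tables directly, here is a minimal sketch of how taxon and taxon_name relate. It uses an in-memory SQLite database and only a simplified subset of the BioSQL columns (the real schema has more, such as the genetic code and nested-set columns), and the scientific_name() helper is hypothetical, not part of BioJava or BioSQL. The point it illustrates is that names are stored whole in taxon_name, keyed by name_class, with no genus/species split.

```python
import sqlite3

# Simplified subset of the BioSQL taxon tables (assumption: the real
# schema carries additional columns and constraints not shown here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxon (
    taxon_id        INTEGER PRIMARY KEY,
    ncbi_taxon_id   INTEGER UNIQUE,
    parent_taxon_id INTEGER,
    node_rank       TEXT
);
CREATE TABLE taxon_name (
    taxon_id   INTEGER NOT NULL REFERENCES taxon(taxon_id),
    name       TEXT NOT NULL,
    name_class TEXT NOT NULL,
    UNIQUE (taxon_id, name, name_class)
);
""")

# One taxon node with two names; each name is stored as a whole string.
conn.execute(
    "INSERT INTO taxon (taxon_id, ncbi_taxon_id, node_rank) "
    "VALUES (1, 9606, 'species')"
)
conn.executemany(
    "INSERT INTO taxon_name (taxon_id, name, name_class) VALUES (?, ?, ?)",
    [
        (1, "Homo sapiens", "scientific name"),
        (1, "human", "genbank common name"),
    ],
)


def scientific_name(conn, ncbi_id):
    """Hypothetical helper: look up the scientific name for an NCBI taxon ID."""
    row = conn.execute(
        "SELECT n.name FROM taxon t "
        "JOIN taxon_name n ON n.taxon_id = t.taxon_id "
        "WHERE t.ncbi_taxon_id = ? AND n.name_class = 'scientific name'",
        (ncbi_id,),
    ).fetchone()
    return row[0] if row else None


print(scientific_name(conn, 9606))
```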

Each RichSequence object then gets an NCBITaxon object associated with
it using set/getTaxon(). For GenBank this is always parsed from the
appropriate entry in the feature table; the Organism and Species lines
are ignored.
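
To make the feature-table point concrete, here is a rough sketch in Python (not BioJava code). It assumes the "appropriate entry" is the /db_xref="taxon:NNNN" qualifier on the source feature; the ncbi_taxon_id() helper and the sample record are made up for illustration. Note that it never looks at the SOURCE or ORGANISM header lines.

```python
import re

# Assumption: the taxon ID lives in a /db_xref="taxon:NNNN" qualifier.
TAXON_XREF = re.compile(r'/db_xref="taxon:(\d+)"')


def ncbi_taxon_id(genbank_text):
    """Hypothetical helper: return the NCBI taxon ID from the feature
    table of a GenBank flat-file record, or None if there is none."""
    # Restrict the search to the feature table (FEATURES up to ORIGIN),
    # so the SOURCE/ORGANISM header lines are never consulted.
    start = genbank_text.find("\nFEATURES")
    if start == -1:
        return None
    end = genbank_text.find("\nORIGIN", start)
    if end == -1:
        end = len(genbank_text)
    match = TAXON_XREF.search(genbank_text[start:end])
    return int(match.group(1)) if match else None


# Made-up, heavily trimmed record used only to exercise the helper.
RECORD = """\
LOCUS       EXAMPLE1                  12 bp    DNA     linear   PRI 01-JAN-2000
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa.
FEATURES             Location/Qualifiers
     source          1..12
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
ORIGIN
        1 acgtacgtac gt
//
"""

print(ncbi_taxon_id(RECORD))
```

The numeric ID returned this way is what would be matched against ncbi_taxon_id in the taxon table, which sidesteps any guessing about genus and species from the free-text name.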

cheers,
Richard

Chris Fields wrote:
> I think one area of possible headache will be TAXON/TAXON_NAME.  For
> instance, with BioPerl we kept running into genus/species parsing
> problems (virus, bacterial names) when going from seqrecord->object. 
> Due to that we decided to greatly simplify Species parsing in BioPerl so
> there isn't any 'guessing' as to genus/species names; you get what's
> already there, nothing more.  If one wants extra taxonomic information
> then one must use NCBI Taxonomy somehow.
> 
> However, currently bioperl-db still splits into genus/species (acts like
> older BioPerl), which obviously clashes with current BioPerl behavior.
> Not sure how the other Bio* projects store this data; Richard?
> 
> There is a BioPerl bug filed on this:
> 
> http://bugzilla.open-bio.org/show_bug.cgi?id=2092
> 
> chris
> 
> On Sep 11, 2007, at 10:49 AM, Barry Moore wrote:
> 
>> Well, the schema is the formal specification as to what goes where and
>> as long as your BioJava and BioPerl DB interface plays by the rules of
>> the schema, then yes you should be able to use both languages on the
>> same database.  Of course the devil is in the details and since I've
>> only worked with the BioPerl interface I don't know if that is in fact
>> reality right now.  I think what Richard meant was there is not
>> detailed human documentation about where each bit of a GenBank record
>> goes into what table and column.  Paul, I think you will find this
>> document to be what you are looking for - or at least as good as
>> you'll get:  go to
>> http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/biosql-schema/doc/?cvsroot=biosql
>> and look for schema-overview.txt.  There is also an ERD in PDF format
>> which can help you get your head around the schema.  If you end up
>> with specific questions about what's where, send another e-mail or
>> just load some files and go exploring.
>>
>> Barry
>>
>> On Sep 11, 2007, at 9:10 AM, Chris Fields wrote:
>>
>>> Here's a question I couldn't find the answer to: should any
>>> BioSQL-loaded data (via BioJava, BioPerl, etc.) be expected to fully
>>> round trip across any BioSQL-utilizing language?  In other words, if I
>>> use BioJava/Hibernate to load sequence data into a BioSQL database and
>>> use BioPerl to work with the data, can one expect it to work?
>>>
>>> My guess is no, as long as there is no formal specification...
>>>
>>> chris
>>>
>>> On Sep 11, 2007, at 9:54 AM, Richard Holland wrote:
>>>
>>>> There is no formal specification for what goes where in BioSQL, but you
>>>> can refer to the BioJava documentation for a good approximation of where
>>>> a GenBank file should end up. The BioJava objects share similar names to
>>>> the BioSQL tables and are mapped using Hibernate.
>>>>
>>>> The most useful parts of the docs are probably:
>>>>
>>>> http://biojava.org/wiki/BioJava:BioJavaXDocs#GenBank
>>>>
>>>> and:
>>>>
>>>> http://biojava.org/wiki/BioJava:BioJavaXDocs#Hibernate_object-relational_mappings.
>>>>
>>>> cheers,
>>>> Richard
>>>>
>>>> Paul Davis wrote:
>>>>>> I've been going over the BioSQL schema and I was wondering if there
>>>>>> was a good place to read about examples of actual data that goes into
>>>>>> each table. Specifically, I'm a bit confused about which parts of a
>>>>>> GenBank record go in which tables.
>>>>>>
>>>>>> Thanks,
>>>>>> Paul Davis
>>>>>> _______________________________________________
>>>>>> BioSQL-l mailing list
>>>>>> BioSQL-l at lists.open-bio.org
>>>>>> http://lists.open-bio.org/mailman/listinfo/biosql-l
>>>>>>
>>>
>>> Christopher Fields
>>> Postdoctoral Researcher
>>> Lab of Dr. Robert Switzer
>>> Dept of Biochemistry
>>> University of Illinois Urbana-Champaign
>>>
>>






