[BioSQL-l] TAXON,TAXON_NAME, was Re: Description

Chris Fields cjfields at uiuc.edu
Tue Sep 11 16:16:08 UTC 2007


I think one area of possible headache will be TAXON/TAXON_NAME.  For
instance, with BioPerl we kept running into genus/species parsing
problems (viral and bacterial names) when going from seqrecord->object.
Because of that we decided to greatly simplify Species parsing in BioPerl
so there isn't any 'guessing' of genus/species names; you get
what's already there, nothing more.  If you want extra taxonomic
information, you have to look it up in NCBI Taxonomy.
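
To make the parsing problem concrete, here is a minimal Python sketch
(not BioPerl code; `naive_split` and `as_is` are hypothetical helpers made
up for illustration) contrasting a whitespace-based genus/species guess
with simply keeping the organism name as it appears in the record:

```python
from typing import Tuple


def naive_split(organism: str) -> Tuple[str, str]:
    """Guess genus/species by splitting on whitespace (the fragile approach)."""
    parts = organism.split()
    genus = parts[0] if parts else ""
    species = parts[1] if len(parts) > 1 else ""
    return genus, species


def as_is(organism: str) -> str:
    """Return the name exactly as given in the record, with no guessing."""
    return organism.strip()


names = [
    "Homo sapiens",
    "Tobacco mosaic virus",
    "Escherichia coli O157:H7",
]

for name in names:
    # Left: the guessed (genus, species) pair; right: the untouched name.
    print(naive_split(name), "|", as_is(name))
```

For the virus the naive split reports a genus of "Tobacco" and a species
of "mosaic", and for the E. coli strain it silently drops the serotype,
which is the kind of guessing the simplified parsing avoids.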

However, bioperl-db currently still splits names into genus/species (as
older BioPerl did), which obviously clashes with current BioPerl
behavior.  I'm not sure how the other Bio* projects store this data; Richard?

There is a BioPerl bug filed on this:

http://bugzilla.open-bio.org/show_bug.cgi?id=2092
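
A rough sketch of how the genus/species clash described above could
surface when two toolkits share one database, using Python's sqlite3 and
a heavily cut-down stand-in for TAXON/TAXON_NAME.  The real BioSQL tables
have more columns and constraints; `store_name`, `load_name`,
`split_then_rejoin` and the taxon ID are hypothetical and are not
bioperl-db or BioJava code:

```python
import sqlite3

# Simplified stand-in for the BioSQL taxon tables (assumption: only the
# columns needed for this illustration are included).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxon (
    taxon_id      INTEGER PRIMARY KEY,
    ncbi_taxon_id INTEGER UNIQUE
);
CREATE TABLE taxon_name (
    taxon_id   INTEGER NOT NULL REFERENCES taxon(taxon_id),
    name       TEXT NOT NULL,
    name_class TEXT NOT NULL,
    UNIQUE (taxon_id, name, name_class)
);
""")


def store_name(conn, ncbi_taxon_id, name):
    """Writer that stores the organism name exactly as it was given."""
    cur = conn.execute(
        "INSERT INTO taxon (ncbi_taxon_id) VALUES (?)", (ncbi_taxon_id,)
    )
    conn.execute(
        "INSERT INTO taxon_name (taxon_id, name, name_class) "
        "VALUES (?, ?, 'scientific name')",
        (cur.lastrowid, name),
    )


def load_name(conn, ncbi_taxon_id):
    """Reader that returns the stored scientific name, or None if absent."""
    row = conn.execute(
        "SELECT tn.name FROM taxon t "
        "JOIN taxon_name tn ON tn.taxon_id = t.taxon_id "
        "WHERE t.ncbi_taxon_id = ? AND tn.name_class = 'scientific name'",
        (ncbi_taxon_id,),
    ).fetchone()
    return row[0] if row else None


def split_then_rejoin(name):
    """Reader that assumes 'Genus species' and keeps only the first two words."""
    return " ".join(name.split()[:2])


EXAMPLE_TAXON_ID = 1  # placeholder key for this sketch, not a real NCBI ID

store_name(conn, EXAMPLE_TAXON_ID, "Tobacco mosaic virus")
stored = load_name(conn, EXAMPLE_TAXON_ID)
print(stored)                     # name as written by the first toolkit
print(split_then_rejoin(stored))  # what a genus/species-splitting reader keeps
```

The stored row round-trips unchanged, but a reader that insists on a
genus/species pair reconstructs a different name, so the two toolkits
would disagree about the same record.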

chris

On Sep 11, 2007, at 10:49 AM, Barry Moore wrote:

> Well, the schema is the formal specification of what goes where,
> and as long as your BioJava and BioPerl DB interfaces play by the
> rules of the schema, then yes, you should be able to use both
> languages on the same database.  Of course the devil is in the
> details and since I've only worked with the BioPerl interface I  
> don't know if that is in fact reality right now.  I think what  
> Richard meant was there is not detailed human documentation about  
> where each bit of a GenBank record goes into what table and  
> column.  Paul, I think you will find this document to be what you
> are looking for - or at least as good as you'll get:  go to
> http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/biosql-schema/doc/?cvsroot=biosql
> and look for schema-overview.txt.  There is also an
> ERD in pdf format which can help you get your head around the
> schema.  If you end up with specific questions about what's where,  
> send another e-mail or just load some files and go exploring.
>
> Barry
>
> On Sep 11, 2007, at 9:10 AM, Chris Fields wrote:
>
>> Here's a question I couldn't find the answer to: should any
>> BioSQL-loaded data (via BioJava, BioPerl, etc.) be expected to fully
>> round-trip across any BioSQL-utilizing language?  In other words, if I
>> use BioJava/Hibernate to load sequence data into a BioSQL database and
>> then use BioPerl to work with the data, can I expect it to work?
>>
>> My guess is no, as long as there is no formal specification...
>>
>> chris
>>
>> On Sep 11, 2007, at 9:54 AM, Richard Holland wrote:
>>
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA1
>>>
>>> There is no formal specification for what goes where in BioSQL, but you
>>> can refer to the BioJava documentation for a good approximation of where
>>> a GenBank file should end up. The BioJava objects share similar names to
>>> the BioSQL tables and are mapped using Hibernate.
>>>
>>> The most useful parts of the docs are probably:
>>>
>>> http://biojava.org/wiki/BioJava:BioJavaXDocs#GenBank
>>>
>>> and:
>>>
>>> http://biojava.org/wiki/BioJava:BioJavaXDocs#Hibernate_object-relational_mappings.
>>>
>>> cheers,
>>> Richard
>>>
>>> Paul Davis wrote:
>>>> I've been going over the BioSQL schema and I was wondering if there
>>>> was a good place to read about examples of actual data that goes into
>>>> each table. Specifically, I'm a bit confused about which parts of a
>>>> GenBank record go in which tables.
>>>>
>>>> Thanks,
>>>> Paul Davis
>>>> _______________________________________________
>>>> BioSQL-l mailing list
>>>> BioSQL-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/biosql-l
>>>>
>>> -----BEGIN PGP SIGNATURE-----
>>> Version: GnuPG v1.4.2.2 (GNU/Linux)
>>> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>>>
>>> iD8DBQFG5qw64C5LeMEKA/QRAiAPAJ41b3+cO7LQc1F4nAFrUWsVLwbl8wCgjFvd
>>> Q8i8g2bUyB17L++fuSKXa+0=
>>> =q8G2
>>> -----END PGP SIGNATURE-----
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign





