[BioSQL-l] TAXON,TAXON_NAME, was Re: Description

Paul Davis paul.joseph.davis at gmail.com
Thu Sep 13 00:13:31 UTC 2007


I glanced through the BioPerl CVS but couldn't find the part where
bioperl-db tries to load a new taxonomy name. Does it go and rebuild
the nested-set information afterwards, or does any inserted taxonomic
data (non-NCBI data) basically stay as nodes dangling outside the
nested sets?
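
For concreteness, here is a minimal sketch of what such a rebuild would
involve. It is not taken from bioperl-db. It assumes a cut-down taxon table
that keeps only the parent_taxon_id, left_value and right_value columns of
the BioSQL schema, uses Python with sqlite3 purely for illustration, and the
taxon ids and the rebuild_nested_set helper are made up.

```python
import sqlite3

def rebuild_nested_set(conn):
    """Recompute left_value/right_value for all taxon rows by a depth-first walk."""
    children, roots = {}, []
    for taxon_id, parent_id in conn.execute(
            "SELECT taxon_id, parent_taxon_id FROM taxon ORDER BY taxon_id"):
        # Assumption: a root has no parent or is its own parent.
        if parent_id is None or parent_id == taxon_id:
            roots.append(taxon_id)
        else:
            children.setdefault(parent_id, []).append(taxon_id)

    bounds, counter = {}, 0
    for root in roots:
        # Explicit stack so deep lineages do not hit the recursion limit.
        stack = [(root, False)]
        while stack:
            node, leaving = stack.pop()
            counter += 1
            if leaving:
                bounds[node][1] = counter          # right_value on the way out
            else:
                bounds[node] = [counter, None]     # left_value on the way in
                stack.append((node, True))
                for child in reversed(children.get(node, [])):
                    stack.append((child, False))

    # Rows whose parent does not exist are never reached and keep NULL bounds.
    conn.executemany(
        "UPDATE taxon SET left_value = ?, right_value = ? WHERE taxon_id = ?",
        [(left, right, taxon_id) for taxon_id, (left, right) in bounds.items()])
    return bounds

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE taxon (
    taxon_id INTEGER PRIMARY KEY,
    ncbi_taxon_id INTEGER,
    parent_taxon_id INTEGER,
    left_value INTEGER,
    right_value INTEGER)""")
# Hypothetical data: rows 1-4 stand in for preloaded NCBI nodes, row 5 is a
# newly inserted non-NCBI node hanging off taxon 2 with no nested-set values.
conn.executemany(
    "INSERT INTO taxon (taxon_id, ncbi_taxon_id, parent_taxon_id) VALUES (?, ?, ?)",
    [(1, 1, None), (2, 100, 1), (3, 101, 2), (4, 200, 1), (5, None, 2)])

bounds = rebuild_nested_set(conn)

# After the rebuild, a subtree query by interval also finds the new node.
subtree_of_2 = [row[0] for row in conn.execute(
    "SELECT taxon_id FROM taxon WHERE left_value BETWEEN ? AND ? "
    "ORDER BY left_value", (bounds[2][0], bounds[2][1]))]
print(subtree_of_2)
```

Without a step like this, row 5 would keep NULL left_value/right_value and
drop out of any interval-based subtree query, which is the "dangling" case
I'm asking about.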

Paul

On 9/12/07, Hilmar Lapp <hlapp at gmx.net> wrote:
> The species/taxon handling shouldn't be a problem if you have the
> NCBI taxonID and have preloaded the NCBI taxonomy.
>
> However, if it's a new species (i.e., the lookup of the NCBI taxonID
> in the taxon table fails), then bioperl-db tries to create the
> lineage based on what it finds in the species object.
>
> As the bug report says, the issue can be fixed, but it also looks
> like the fix will break compatibility with earlier versions of
> BioPerl. I think at some point that's fine, but I was wondering
> whether that's the way it needs to be.
>
>         -hilmar
>
> On Sep 11, 2007, at 12:16 PM, Chris Fields wrote:
>
> > I think one area of possible headache will be TAXON/TAXON_NAME.  For
> > instance, with BioPerl we kept running into genus/species parsing
> > problems (virus, bacterial names) when going from seqrecord->object.
> > Because of that we decided to greatly simplify Species parsing in BioPerl
> > so there isn't any 'guessing' as to genus/species names; you get
> > what's already there, nothing more.  If one wants extra taxonomic
> > information then one must use NCBI Taxonomy somehow.
> >
> > However, bioperl-db currently still splits names into genus/species
> > (as older BioPerl did), which obviously clashes with current BioPerl
> > behavior.  Not sure how the other Bio* projects store this data; Richard?
> >
> > There is a BioPerl bug filed on this:
> >
> > http://bugzilla.open-bio.org/show_bug.cgi?id=2092
> >
> > chris
> >
> > On Sep 11, 2007, at 10:49 AM, Barry Moore wrote:
> >
> >> Well, the schema is the formal specification as to what goes where
> >> and as long as your BioJava and BioPerl DB interfaces play by the
> >> rules of the schema, then yes you should be able to use both
> >> languages on the same database.  Of course the devil is in the
> >> details and since I've only worked with the BioPerl interface I
> >> don't know if that is in fact reality right now.  I think what
> >> Richard meant was there is not detailed human documentation about
> >> where each bit of a GenBank record goes into what table and
> >> column.  Paul, I think you will find this document to be what you
> >> are looking for - or at least as good as you'll get:  go to
> >> http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/biosql-schema/doc/?cvsroot=biosql
> >> and look for schema-overview.txt.  There is also an
> >> ERD in pdf format which can help you get your head around the
> >> schema.  If you end up with specific questions about what's where,
> >> send another e-mail or just load some files and go exploring.
> >>
> >> Barry
> >>
> >> On Sep 11, 2007, at 9:10 AM, Chris Fields wrote:
> >>
> >>> Here's a question I couldn't find the answer to: should any BioSQL-
> >>> loaded data (via BioJava, BioPerl, etc) be expected to fully round
> >>> trip across any BioSQL-utilizing language?  In other words, if I use
> >>> BioJava/Hibernate to load sequence data into a BioSQL database and
> >>> use BioPerl to work with the data, can one expect it to work?
> >>>
> >>> My guess is no, as long as there is no formal specification...
> >>>
> >>> chris
> >>>
> >>> On Sep 11, 2007, at 9:54 AM, Richard Holland wrote:
> >>>
> >>>> -----BEGIN PGP SIGNED MESSAGE-----
> >>>> Hash: SHA1
> >>>>
> >>>> There is no formal specification for what goes where in BioSQL, but
> >>>> you can refer to the BioJava documentation for a good approximation
> >>>> of where a GenBank file should end up. The BioJava objects share
> >>>> similar names to the BioSQL tables and are mapped using Hibernate.
> >>>>
> >>>> The most useful parts of the docs are probably:
> >>>>
> >>>> http://biojava.org/wiki/BioJava:BioJavaXDocs#GenBank
> >>>>
> >>>> and:
> >>>>
> >>>> http://biojava.org/wiki/BioJava:BioJavaXDocs#Hibernate_object-relational_mappings
> >>>>
> >>>> cheers,
> >>>> Richard
> >>>>
> >>>> Paul Davis wrote:
> >>>>> I've been going over the biosql schema and I was wondering if there
> >>>>> was a good place to read about examples of actual data that goes
> >>>>> into each table. Specifically, I'm a bit confused about which parts
> >>>>> of a genbank record go in which tables.
> >>>>>
> >>>>> Thanks,
> >>>>> Paul Davis
> >>>>> _______________________________________________
> >>>>> BioSQL-l mailing list
> >>>>> BioSQL-l at lists.open-bio.org
> >>>>> http://lists.open-bio.org/mailman/listinfo/biosql-l
> >>>>>
> >>>> -----BEGIN PGP SIGNATURE-----
> >>>> Version: GnuPG v1.4.2.2 (GNU/Linux)
> >>>> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
> >>>>
> >>>> iD8DBQFG5qw64C5LeMEKA/QRAiAPAJ41b3+cO7LQc1F4nAFrUWsVLwbl8wCgjFvd
> >>>> Q8i8g2bUyB17L++fuSKXa+0=
> >>>> =q8G2
> >>>> -----END PGP SIGNATURE-----
> >>>
> >>> Christopher Fields
> >>> Postdoctoral Researcher
> >>> Lab of Dr. Robert Switzer
> >>> Dept of Biochemistry
> >>> University of Illinois Urbana-Champaign
> >>>
> >>
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign
> >
>
> --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>


