[Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Wed Mar 26 12:44:01 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2475





------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk  2008-03-26 08:44 EST -------
Created an attachment (id=883)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=883&action=view)
Patch to BioSQL/Loader.py to not record the lineage for new species

This patch takes the simple route out - when loading a sequence into the
database with a new species (not already in the taxon tables), we ONLY add the
new species to the taxon and taxon_name tables.  It DOES NOT attempt to
record the whole lineage, either by adding new taxon entries for the parent
nodes or by reusing existing ones.
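
To illustrate the behaviour described above, here is a minimal sketch and NOT
the code from the attached patch. It uses Python's sqlite3 module with a
cut-down version of the taxon and taxon_name tables. The helper names
(make_demo_db, get_or_add_species), the reduced column set, and the lookup by
NCBI taxon ID are all assumptions made for the example.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with simplified taxon tables.

    Assumption: only a few columns are included here; the real BioSQL
    schema defines more.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE taxon (
            taxon_id INTEGER PRIMARY KEY,
            ncbi_taxon_id INTEGER UNIQUE,
            parent_taxon_id INTEGER,
            node_rank TEXT
        );
        CREATE TABLE taxon_name (
            taxon_id INTEGER NOT NULL,
            name TEXT NOT NULL,
            name_class TEXT NOT NULL
        );
        """
    )
    return conn


def get_or_add_species(conn, ncbi_taxon_id, scientific_name):
    """Return the taxon_id for a species, adding ONLY that species if new.

    Hypothetical helper: no parent nodes are created or matched, so
    parent_taxon_id stays NULL for a newly added species.
    """
    cur = conn.cursor()
    cur.execute(
        "SELECT taxon_id FROM taxon WHERE ncbi_taxon_id = ?",
        (ncbi_taxon_id,),
    )
    row = cur.fetchone()
    if row is not None:
        # Species already recorded, so reuse the existing entry.
        return row[0]

    # New species: one row in taxon, one row in taxon_name, nothing else.
    cur.execute(
        "INSERT INTO taxon (ncbi_taxon_id, node_rank) VALUES (?, 'species')",
        (ncbi_taxon_id,),
    )
    taxon_id = cur.lastrowid
    cur.execute(
        "INSERT INTO taxon_name (taxon_id, name, name_class) "
        "VALUES (?, ?, 'scientific name')",
        (taxon_id, scientific_name),
    )
    conn.commit()
    return taxon_id
```

Calling get_or_add_species twice for the same species should leave a single
row in each table, since the second call finds the existing entry and no
lineage rows are ever written.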

Both the test_BioSQL and test_BioSQL_SeqIO unit tests still pass with this.

I prefer this solution as it avoids any ambiguous heuristics for matching
existing taxon names based on string comparisons.  This does mean Biopython
won't match BioPerl in this regard, as I understand that BioPerl currently
tries to record the full lineage.

