[Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage

bugzilla-daemon at portal.open-bio.org
Thu Apr 10 12:07:32 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2475





------- Comment #27 from biopython-bugzilla at maubp.freeserve.co.uk  2008-04-10 08:07 EST -------
Regarding inserting the lineage into the taxon/taxon_name tables, the tree
structure is stored in two ways.  Firstly, using the taxon.parent field, and
secondly using the left/right fields.
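
To illustrate the second representation: the left/right fields are a
nested-set numbering, assigned by a depth-first walk over the whole tree, so
adding a single node can shift the values of many other nodes. A minimal
sketch, assuming a plain dict mapping each node to its parent (the function
names and the toy tree are hypothetical, not BioSQL or Biopython code):

```python
def nested_set_values(parent_of):
    """Return {node: (left, right)} for a {node: parent} mapping.

    Root nodes have parent None. Values are assigned by a depth-first
    walk, which is why every node may need renumbering after an insert.
    """
    children = {}
    roots = []
    for node, parent in parent_of.items():
        if parent is None:
            roots.append(node)
        else:
            children.setdefault(parent, []).append(node)

    values = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        left = counter
        for child in sorted(children.get(node, [])):
            visit(child)
        counter += 1
        values[node] = (left, counter)

    for root in sorted(roots):
        visit(root)
    return values


def is_ancestor(values, a, b):
    """True if a is a strict ancestor of b, using only left/right values."""
    return values[a][0] < values[b][0] and values[b][1] < values[a][1]


# Toy tree, child -> parent (illustrative only).
tree = {
    "root": None,
    "Bacteria": "root",
    "Eukaryota": "root",
    "Metazoa": "Eukaryota",
}
values = nested_set_values(tree)
```

The parent pointers alone describe the same tree; the left/right values only
make ancestor and subtree queries possible without walking it.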

Over on the BioSQL mailing list we've established that updating the left/right
values by recalculating them takes about 10 minutes.  Doing this from Biopython
every time a new sequence is added does not seem ideal.

We could add missing taxonomy nodes to the tables (based on the Bio.Entrez
data), and record the tree structure using the taxon.parent field, but leave
the left/right values as NULL.
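
A rough sketch of that approach, using an in-memory SQLite database with a
cut-down version of the taxon and taxon_name tables. The column subset, the
add_lineage() helper and the hard-coded lineages are assumptions for
illustration only; in practice the lineage would come from Bio.Entrez, and
this is not the actual BioSQL.Loader code:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the BioSQL schema (assumed column subset).
conn.executescript("""
CREATE TABLE taxon (
    taxon_id        INTEGER PRIMARY KEY,
    ncbi_taxon_id   INTEGER UNIQUE,
    parent_taxon_id INTEGER,
    node_rank       TEXT,
    left_value      INTEGER,
    right_value     INTEGER
);
CREATE TABLE taxon_name (
    taxon_id   INTEGER NOT NULL,
    name       TEXT NOT NULL,
    name_class TEXT NOT NULL
);
""")


def add_lineage(conn, lineage):
    """Insert any missing nodes, root first, reusing existing entries.

    lineage is a list of (ncbi_taxon_id, name, rank) tuples.
    Returns the taxon_id of the last (most specific) node.
    """
    parent_id = None
    for ncbi_id, name, rank in lineage:
        row = conn.execute(
            "SELECT taxon_id FROM taxon WHERE ncbi_taxon_id = ?", (ncbi_id,)
        ).fetchone()
        if row is None:
            # Record the parent link only; left/right values stay NULL.
            cur = conn.execute(
                "INSERT INTO taxon (ncbi_taxon_id, parent_taxon_id, node_rank)"
                " VALUES (?, ?, ?)",
                (ncbi_id, parent_id, rank),
            )
            taxon_id = cur.lastrowid
            conn.execute(
                "INSERT INTO taxon_name (taxon_id, name, name_class)"
                " VALUES (?, ?, 'scientific name')",
                (taxon_id, name),
            )
        else:
            taxon_id = row[0]
        parent_id = taxon_id
    return parent_id


# Illustrative lineages sharing their first three nodes.
shared = [
    (131567, "cellular organisms", "no rank"),
    (2759, "Eukaryota", "superkingdom"),
    (33208, "Metazoa", "kingdom"),
]
human_id = add_lineage(conn, shared + [(9606, "Homo sapiens", "species")])
mouse_id = add_lineage(conn, shared + [(10090, "Mus musculus", "species")])
```

The second call only adds one row, since the shared ancestors are found by
their NCBI taxon ID and reused.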

This should be enough for Biopython to recover the full lineage when retrieving
a sequence - we need to check BioSQL.BioSeq._retrieve_taxon() is happy.
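
For reference, recovering a lineage from parent links alone could look
something like the sketch below. It uses an in-memory SQLite database with a
simplified table layout and a hypothetical lineage_from_parents() helper; it
is not a description of what _retrieve_taxon() currently does:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified tables with parent links set and left/right values left NULL.
conn.executescript("""
CREATE TABLE taxon (
    taxon_id        INTEGER PRIMARY KEY,
    parent_taxon_id INTEGER,
    left_value      INTEGER,
    right_value     INTEGER
);
CREATE TABLE taxon_name (
    taxon_id   INTEGER NOT NULL,
    name       TEXT NOT NULL,
    name_class TEXT NOT NULL
);
INSERT INTO taxon (taxon_id, parent_taxon_id) VALUES (1, NULL);
INSERT INTO taxon (taxon_id, parent_taxon_id) VALUES (2, 1);
INSERT INTO taxon (taxon_id, parent_taxon_id) VALUES (3, 2);
INSERT INTO taxon (taxon_id, parent_taxon_id) VALUES (4, 3);
INSERT INTO taxon_name VALUES (1, 'cellular organisms', 'scientific name');
INSERT INTO taxon_name VALUES (2, 'Eukaryota', 'scientific name');
INSERT INTO taxon_name VALUES (3, 'Metazoa', 'scientific name');
INSERT INTO taxon_name VALUES (4, 'Homo sapiens', 'scientific name');
""")


def lineage_from_parents(conn, taxon_id):
    """Walk parent links up to the root; return names ordered root first."""
    names = []
    seen = set()  # guards against a cycle in bad data
    while taxon_id is not None and taxon_id not in seen:
        seen.add(taxon_id)
        row = conn.execute(
            "SELECT t.parent_taxon_id, n.name"
            " FROM taxon t JOIN taxon_name n ON n.taxon_id = t.taxon_id"
            " WHERE t.taxon_id = ? AND n.name_class = 'scientific name'",
            (taxon_id,),
        ).fetchone()
        if row is None:
            break
        parent_id, name = row
        names.append(name)
        taxon_id = parent_id
    names.reverse()
    return names


lineage = lineage_from_parents(conn, 4)
```

This costs one query per level of the tree, which is slower than a single
left/right range query but does not depend on those values being populated.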

If the user wants the left/right values, they would have to (re)run the BioSQL
load_ncbi_taxonomy.pl script (which is slow).




