[Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Fri May 9 18:20:12 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2475





------- Comment #31 from mmokrejs at ribosome.natur.cuni.cz  2008-05-09 14:20 EST -------
Hi,
  I wanted to test what you have, but I am missing more user-friendly
documentation. Specifically, I am missing documentation for the class
BioSeqDatabase in BioSeqDatabase.py (attachment 915). For the load method,
which Eric has modified, it is not clear to me what would be fetched from the
NCBI Taxonomy DB. I guess the full lineage, but I still do not know in what
form: as a single string, as a list of strings, or just as a list of taxids?

  Loader.py (attachment 914) has a scary function called remove(), and I
would like to see a more elaborate explanation of what it really does.
Imagine I have two subspecies of the same species in the database and want
to delete the first one. Will it zap the parents common to both
of them? I hope not. ;-)
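
To make the concern concrete, here is a minimal sketch of the behaviour I
would hope for. It is not the code from Loader.py: the function name
prune_taxon and the parent_of dictionary are hypothetical, and the lineage is
a toy in-memory mapping rather than the real taxon tables. Deleting a taxon
removes an ancestor only once no other taxon still hangs off it.

```python
def prune_taxon(parent_of, taxon_id):
    """Remove taxon_id, then any ancestor that is left without children.

    parent_of maps each taxon id to its parent id (None for the root).
    Returns the removed ids in removal order. Hypothetical helper, not
    part of BioSQL.
    """
    removed = []
    current = taxon_id
    while current is not None:
        # Stop as soon as some other taxon still uses this node as parent.
        if any(parent == current for parent in parent_of.values()):
            break
        parent = parent_of.pop(current)
        removed.append(current)
        current = parent
    return removed


# Toy lineage: two subspecies sharing one species and one genus.
parent_of = {
    "genus": None,
    "species": "genus",
    "subspecies_a": "species",
    "subspecies_b": "species",
}

# Deleting the first subspecies must leave the shared parents alone.
removed_first = prune_taxon(parent_of, "subspecies_a")
```

With this approach the shared species and genus entries go away only when
the second subspecies is deleted as well.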

Also, I am a bit surprised that _get_taxon_id() would actually modify the local
database. Could it get another name, or could it be split into two functions:
one searching the local db (and optionally fetching data via the internet),
and a second one modifying the local db?
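
A rough sketch of the split I have in mind, assuming nothing about the actual
Loader.py internals: the table demo_taxon and the functions find_taxon_id and
add_taxon are made up for illustration, and an in-memory SQLite database
stands in for the real one. The point is only that the lookup never writes,
and the write is a separate, explicitly named step.

```python
import sqlite3


def find_taxon_id(conn, ncbi_taxon_id):
    """Read-only lookup: return the local taxon id, or None if absent."""
    row = conn.execute(
        "SELECT taxon_id FROM demo_taxon WHERE ncbi_taxon_id = ?",
        (ncbi_taxon_id,),
    ).fetchone()
    return row[0] if row else None


def add_taxon(conn, ncbi_taxon_id):
    """Write step: insert a new taxon row and return its local id."""
    cursor = conn.execute(
        "INSERT INTO demo_taxon (ncbi_taxon_id) VALUES (?)",
        (ncbi_taxon_id,),
    )
    conn.commit()
    return cursor.lastrowid


# Hypothetical stand-in table, not the BioSQL schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo_taxon ("
    "taxon_id INTEGER PRIMARY KEY, ncbi_taxon_id INTEGER UNIQUE)"
)

before = find_taxon_id(conn, 9606)   # lookup only, database unchanged
new_id = add_taxon(conn, 9606)       # the caller decides to write
after = find_taxon_id(conn, 9606)
```

A caller that only wants to know whether a taxon is present can then call the
lookup without any risk of changing the database.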

And shouldn't the 'if self.fetch_NCBI_taxonomy' have a corresponding elif for
the second attempt and the third one? As it stands, it is a bit too long to
read. ;-)

