[Biopython-dev] [Bug 3045] TreeMixin, please define enumerator and other convenience methods

bugzilla-daemon at portal.open-bio.org
Tue Apr 6 23:34:43 UTC 2010


http://bugzilla.open-bio.org/show_bug.cgi?id=3045





------- Comment #2 from eric.talevich at gmail.com  2010-04-06 19:34 EST -------
(In reply to comment #1)
> Interesting. I don't really see the need for most of these given how routine
> use of the enumerate function is elsewhere in Python. So -1 on the enumerate
> methods.

I'll make sure all of the use cases can be handled with a simple list
comprehension, at least.
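
As a rough illustration of that claim, here is a minimal sketch of how an
enumerator-style method reduces to the built-in enumerate() plus a
comprehension. The terminal_names list is a hypothetical stand-in for the
names of the clades that tree.get_terminals() would return; no Bio.Phylo
calls are made, and enumerate_terminals() is only an assumed name for the
kind of method requested in the bug.

```python
# Hypothetical stand-in for the names of the clades returned by
# tree.get_terminals(); a list of clade objects would work the same way.
terminal_names = ["A", "B", "C"]

# What an enumerate_terminals() method might return, as one comprehension.
indexed = [(i, name) for i, name in enumerate(terminal_names)]

# A name -> index lookup, another common use for such an enumerator.
index_of = {name: i for i, name in enumerate(terminal_names)}
```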

> 
> I'm not fond of the name of the existing method get_terminals (which currently
> returns a list). My feeling is that using just terminals seems nicer (as a
> property, so no order argument - if you need that use the find method). Is
> there any advantage to returning a list vs an iterator? Everything is all in
> memory anyway, right?

I took the method names from Bio.Nexus.Trees wherever it seemed reasonable --
one day I'd like Bio.Phylo to be a drop-in replacement for that module (as much
as possible). Otherwise I'd be fine with a method called terminals().

The tree object doesn't keep a list of terminal nodes under the hood, so getting
the terminal nodes requires a full search of the tree, with run time linear in
the number of nodes. I feel uneasy about properties that don't run in O(1)
time.

The find* methods return iterators, and the get* methods return lists. I found
that the results of these queries usually needed to be converted to a list
immediately, for indexing or length-checking, which is why get* returns one.
The lists aren't liable to be unexpectedly large -- smaller than the whole
tree, anyway. Plus, get_terminals() is really just a shortcut for
list(tree.find_clades(terminal=True)), for those who prefer to dive into the
module or save some typing.
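
To make the iterator/list distinction concrete, here is a minimal sketch using
a toy Clade class. It is an assumption-laden stand-in written for this
example, not the Bio.Phylo implementation; only the method names
find_clades() and get_terminals() and the terminal keyword come from the
discussion above.

```python
class Clade:
    """Toy stand-in for a tree node; NOT the Bio.Phylo class."""

    def __init__(self, name=None, clades=None):
        self.name = name
        self.clades = clades or []

    def is_terminal(self):
        return not self.clades

    def find_clades(self, terminal=None):
        """Lazily yield matching clades in preorder (an iterator)."""
        if terminal is None or self.is_terminal() == terminal:
            yield self
        for child in self.clades:
            yield from child.find_clades(terminal=terminal)

    def get_terminals(self):
        """Return terminal clades as a list; this walks the whole tree."""
        return list(self.find_clades(terminal=True))


tree = Clade(clades=[Clade("A"), Clade(clades=[Clade("B"), Clade("C")])])

leaves = tree.get_terminals()            # list: supports len() and indexing
lazy = tree.find_clades(terminal=True)   # generator: no len(), no indexing
first_leaf = next(lazy)                  # consume one item on demand
```

The list form pays for the full traversal up front, which is the cost that
makes a property feel misleading; the generator form defers it.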


> Given a terminals property (be it a read only list or an iterator), one might
> go further and add a sister property for the internal nodes (non-terminal
> nodes).

Apparently there's some demand for it. It would be the same as
list(tree.find_clades(terminal=False)), and forcing users to learn how find_*
methods work after they're hooked on get_terminals() has some appeal, but I
suppose we should just pick a name and add it.
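
A sketch of what such a sister method could look like, written as a free
function so it works on anything with a compatible find_clades(). The name
get_nonterminals is only an assumption for this example, and Node is a toy
class with just enough behaviour to exercise it.

```python
def get_nonterminals(tree):
    """Hypothetical helper: internal clades as a list, like get_terminals()."""
    return list(tree.find_clades(terminal=False))


class Node:
    """Toy node, not a Bio.Phylo class; implements a minimal find_clades()."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def find_clades(self, terminal=None):
        # A node is terminal exactly when it has no children.
        if terminal is None or (not self.children) == terminal:
            yield self
        for child in self.children:
            yield from child.find_clades(terminal)


root = Node("root", [Node("A"), Node("inner", [Node("B"), Node("C")])])
internal_names = [n.name for n in get_nonterminals(root)]
```

Note that in this sketch the root counts as a non-terminal node, since it has
children.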

