[Bioperl-l] Bio::Tree development

Johan Viklund johan.viklund at gmail.com
Fri Jan 19 18:20:20 UTC 2007


Hi,

I've started using Bio::Tree{,IO} and would very much like to update
it (it was I who filed bug 2191
<http://bugzilla.open-bio.org/show_bug.cgi?id=2191>).

I have listed below the main issues I think should be addressed
(there might be more). I would like to get some
comments/suggestions on this, and if you're happy with it I could
start working on it. It shouldn't take very long.


== Bugs (or what I think are bugs) ==

* Spelling of Descendent.
I think the old (misspelled) names should be retained for
backwards compatibility, but only as aliases. Alternatively, the old
methods can be kept and delegate to the new ones with a warning (I've
seen this in other places in BioPerl).
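
A minimal sketch of the "delegate with a warning" variant. The class
My::Node is a hypothetical stand-in (not the real Bio::Tree::Node), and
I am assuming the corrected spelling would be add_Descendant(); the
real code would probably use BioPerl's own warning machinery rather
than Carp.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for a tree node, for illustration only.
    package My::Node;
    use Carp qw(carp);

    sub new { my ($class) = @_; return bless { children => [] }, $class }

    # Assumed new spelling: this method does the real work.
    sub add_Descendant {
        my ($self, $child) = @_;
        push @{ $self->{children} }, $child;
        return scalar @{ $self->{children} };
    }

    # Old spelling kept for backwards compatibility: warn, then delegate.
    sub add_Descendent {
        my $self = shift;
        carp "add_Descendent() is deprecated, use add_Descendant()";
        return $self->add_Descendant(@_);
    }
}

my $node = My::Node->new;
my @caught;
{
    # Capture the deprecation warning instead of printing it to STDERR.
    local $SIG{__WARN__} = sub { push @caught, @_ };
    $node->add_Descendent( My::Node->new );
}
print "children: ", scalar @{ $node->{children} }, "\n";
print "warnings: ", scalar @caught, "\n";
```

Old callers keep working, but every call through the old name is
reported at the caller's line, so they are easy to find and fix.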

* Unifying the newick writing between nexus.pm and newick.pm in TreeIO
I think it's a bit strange to have two implementations of this. For
parsing newick strings there's only one (in newick.pm).

Another nice thing this would bring is the ability to sort the
nodes in the nexus output; at the moment this is only possible when
writing newick files. There might be other slight differences too (I
haven't checked).
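
Roughly what I have in mind, as a sketch only: one function builds the
newick string and both writers call it, so sorting works in both. All
names here (newick_string, newick_text, nexus_text) are hypothetical,
the nodes are plain hashes instead of Bio::Tree::Node objects, and
branch lengths, bootstrap values and label quoting are left out.

```perl
use strict;
use warnings;

# Single place that turns a node into a newick string (no trailing ';').
# $order_by is an optional comparison callback taking two nodes.
sub newick_string {
    my ($node, $order_by) = @_;
    my @children = @{ $node->{children} || [] };
    @children = sort { $order_by->($a, $b) } @children if $order_by;
    my $label = defined $node->{id} ? $node->{id} : '';
    return $label unless @children;
    return '(' . join(',', map { newick_string($_, $order_by) } @children) . ')' . $label;
}

# The newick writer only adds the terminating semicolon.
sub newick_text {
    my ($root, $order_by) = @_;
    return newick_string($root, $order_by) . ";\n";
}

# The nexus writer wraps the same string in a (simplified) trees block.
sub nexus_text {
    my ($root, $order_by) = @_;
    return "#NEXUS\nBegin trees;\n  tree tree1 = "
         . newick_string($root, $order_by) . ";\nEnd;\n";
}

# Example tree (C,(B,A)) with unlabeled internal nodes.
my $root = {
    children => [
        { id => 'C' },
        { children => [ { id => 'B' }, { id => 'A' } ] },
    ],
};

# Sort siblings by label; unlabeled nodes compare as the empty string.
my $by_id = sub {
    my ($x, $y) = map { defined $_->{id} ? $_->{id} : '' } @_;
    return $x cmp $y;
};

print newick_text($root);
print nexus_text($root, $by_id);
```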

* Remove reverse_edge() from Node.pm
It calls the nonexistent delete_edge() (which should be roughly
equivalent to remove_Descendent()), and I believe it is an old
helper function for the old reroot method (as I noted in the bug
report above).

* Move get_leaf_nodes() from TreeI.pm to NodeI.pm
This seems quite obvious: I often only want to work on the leaf nodes
below a particular node in a tree, and this would be much clearer than
having to write
    grep { $_->is_Leaf } $node->get_all_Descendents;
all the time. The method in TreeI.pm could then be reimplemented
roughly like this:
    $self->get_root_node->get_leaf_nodes();
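
A slightly fuller sketch of the same idea. My::Node and My::Tree are
hypothetical minimal classes, not the real NodeI/TreeI implementations;
only the method names are borrowed from Bio::Tree.

```perl
use strict;
use warnings;

{
    # Hypothetical minimal node class, standing in for a NodeI implementation.
    package My::Node;

    sub new {
        my ($class, $id) = @_;
        return bless { id => $id, children => [] }, $class;
    }
    sub id              { $_[0]{id} }
    sub add_Descendent  { my ($self, $c) = @_; push @{ $self->{children} }, $c; return $self }
    sub each_Descendent { @{ $_[0]{children} } }
    sub is_Leaf         { !@{ $_[0]{children} } }

    # Leaves at or below this node, collected recursively.
    sub get_leaf_nodes {
        my ($self) = @_;
        return $self if $self->is_Leaf;
        return map { $_->get_leaf_nodes } $self->each_Descendent;
    }
}

{
    # Hypothetical tree class: the tree-level method only delegates to the root.
    package My::Tree;

    sub new            { my ($class, $root) = @_; return bless { root => $root }, $class }
    sub get_root_node  { $_[0]{root} }
    sub get_leaf_nodes { $_[0]->get_root_node->get_leaf_nodes }
}

# Build ((A,B),C) and ask both the tree and an inner node for their leaves.
my $inner = My::Node->new('inner');
$inner->add_Descendent( My::Node->new($_) ) for qw(A B);
my $root = My::Node->new('root');
$root->add_Descendent($inner);
$root->add_Descendent( My::Node->new('C') );
my $tree = My::Tree->new($root);

my @tree_leaves  = map { $_->id } $tree->get_leaf_nodes;
my @inner_leaves = map { $_->id } $inner->get_leaf_nodes;
print "tree:  @tree_leaves\n";
print "inner: @inner_leaves\n";
```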



== Additions ==

* More Tests
In part to reflect any changes, and also to increase the coverage of our tests.

* Better Kwalitee

* Iterators
A couple of iterators or tree-walk methods/classes for trees. These
come in handy when one wants to annotate tree nodes in different
ways. As a bare minimum I think pre-order, in-order and post-order
iterators should be implemented. I think this would also simplify the
different write_tree() methods.

What would the most bioperly way of implementing an Iterator be?
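
One possible shape, as a sketch and not a proposal for the final API:
closure-based iterators that return the next node on each call and
undef when done. The function names and the My::Node class are
hypothetical. In-order is left out here because it is only well
defined for binary trees.

```perl
use strict;
use warnings;

{
    # Hypothetical minimal node class, for illustration only.
    package My::Node;

    sub new {
        my ($class, $id, @children) = @_;
        return bless { id => $id, children => [@children] }, $class;
    }
    sub id              { $_[0]{id} }
    sub each_Descendent { @{ $_[0]{children} } }
    sub is_Leaf         { !@{ $_[0]{children} } }
}

# Pre-order: a node is returned before any of its descendents.
sub preorder_iterator {
    my ($root) = @_;
    my @stack = ($root);
    return sub {
        my $node = pop @stack or return;
        # Push children reversed so the first child is visited first.
        push @stack, reverse $node->each_Descendent;
        return $node;
    };
}

# Post-order: a node is returned after all of its descendents.
sub postorder_iterator {
    my ($root) = @_;
    my @stack = ( [ $root, 0 ] );    # pairs of [node, children already pushed?]
    return sub {
        while (@stack) {
            my ($node, $expanded) = @{ pop @stack };
            return $node if $expanded || $node->is_Leaf;
            push @stack, [ $node, 1 ], map { [ $_, 0 ] } reverse $node->each_Descendent;
        }
        return;
    };
}

# Example tree ((A,B)inner,C)root.
my $root = My::Node->new( 'root',
    My::Node->new( 'inner', My::Node->new('A'), My::Node->new('B') ),
    My::Node->new('C'),
);

my ( @pre, @post );
my $next = preorder_iterator($root);
while ( my $n = $next->() ) { push @pre, $n->id }
$next = postorder_iterator($root);
while ( my $n = $next->() ) { push @post, $n->id }

print "pre-order:  @pre\n";
print "post-order: @post\n";
```

Something like a write_tree() method could then walk the tree with one
of these instead of doing its own recursion.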

* Implement TreeIO/tgf.pm
Parser for the TreeGraph format.



== Some minor bugs ==

* Node->Id (minor bug)
For some reason the Id of internal nodes gets set to the bootstrap
value, which I find a bit annoying. I think the internal_id would
be better.

* General code cleanup
Making sure everything is indented according to some standard. From
what I've seen, there doesn't seem to be any real standard for
what BioPerl code should look like. I think a lot of the code would
be much easier to understand if it were indented properly. As it is
now, the indentation depth changes between 2, 3 and 4 even within the
same file.

* get_Descendents()
It is undocumented but works. I thought it was each_Descendent()-like,
but it is an alias for get_all_Descendents(), which is highly
confusing. It should at least be documented; maybe it's an old
remnant...

* Naming conventions in BioPerl
What are they? Sometimes methods look_like_this() and sometimes they
look_like_This(). What's the general rule for when to use a capital
letter at the beginning of a word (in Bio::Seq there's even a
get_SeqFeatures() )? It seems like there are capital letters in a name
when another BioPerl class/object is involved, but I'm not sure
(is_Leaf in Node.pm doesn't follow this).


-- 
Johan Viklund
PhD Student
Molecular Evolution
EBC, Uppsala University
Norbyvägen 18C
SE-752 36  Uppsala
Sweden
phone +46(0)18-471 64 03



