[Biopython-dev] Bio.Cluster.Tree -> Bio.Phylo

Eric Talevich eric.talevich at gmail.com
Tue Apr 17 15:25:35 UTC 2012


It would be useful to have a quick and portable function for
distance-based tree estimation in Bio.Phylo, since otherwise it's
necessary to use one of the wrappers for external programs in
Bio.Phylo.Applications. (And currently, only PhyML is wrapped.) Does
the hierarchical clustering algorithm in Bio.Cluster correspond to any
common tree-estimation algorithm, e.g. UPGMA? If so, then it would
make a lot of sense to provide the glue for using it that way. If you
have done some work in this direction, I would be happy to see it.
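
To make the question concrete, here is a rough pure-Python sketch of UPGMA
(average-linkage clustering) on a distance matrix, emitting a Newick string.
It is only an illustration of the algorithm, not Bio.Cluster's
implementation. The function name upgma_newick and the input layout (a list
of labels plus a square list of lists) are assumptions made up for this
example.

```python
def upgma_newick(names, dist):
    """UPGMA (average linkage) on a symmetric distance matrix.

    names: list of leaf labels; dist: square list of lists of distances.
    Returns a Newick string with branch lengths (hypothetical helper).
    """
    # Active clusters: id -> (newick text, number of leaves, node height)
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    # Distances between active clusters, keyed by (smaller id, larger id)
    d = {(i, j): float(dist[i][j])
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair of clusters (first one found on ties)
        (a, b), dab = min(d.items(), key=lambda kv: kv[1])
        text_a, size_a, h_a = clusters.pop(a)
        text_b, size_b, h_b = clusters.pop(b)
        # UPGMA places the new node at half the inter-cluster distance
        height = dab / 2.0
        text = "(%s:%g,%s:%g)" % (text_a, height - h_a, text_b, height - h_b)
        # Distance to every other cluster is the size-weighted average
        for c in clusters:
            dac = d.pop((min(a, c), max(a, c)))
            dbc = d.pop((min(b, c), max(b, c)))
            d[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        del d[(a, b)]
        clusters[next_id] = (text, size_a + size_b, height)
        next_id += 1
    (text, _, _), = clusters.values()
    return text + ";"


newick = upgma_newick(["A", "B", "C"],
                      [[0, 2, 6],
                       [2, 0, 6],
                       [6, 6, 0]])
```

The resulting string should be readable with
Bio.Phylo.read(StringIO(newick), "newick"), which would be the simplest
possible glue, at the cost of a round trip through text.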


On Mon, Apr 16, 2012 at 6:47 PM, Andrew Sczesnak
<andrew.sczesnak at med.nyu.edu> wrote:
> Eric,
> I can describe two use cases from my own experience. First, the MAF parser
> I've been working on can pull the multiple alignment of some gene between a
> bunch of genomes. Thinking of recipes for the cookbook, I thought it would
> be neat to walk the user through constructing a distance matrix by hand
> (though you're right--more could be done to support this), clustering with
> Bio.Cluster and visualizing the result with Bio.Phylo. I like this example
> because it integrates several different parts of Biopython along with a
> lesson about inferring distances between sequences.
> Second, for another project, I've been generating distance matrices based on
> the shared gene content of bacterial genomes and the presence-or-absence of
> orthologous groups in each. Presently, I ferry the matrices to a clustering
> program and then visualize the resulting trees in yet another tool. Looking
> into ways of streamlining this brought me back to Bio.Cluster, Bio.Phylo and
> the incompatibility of their tree objects.
> I wonder, what would be the most elegant way of bridging the gap?
> Best,
> Andrew
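
As one possible way of bridging the gap, here is a sketch that walks a
Bio.Cluster-style tree and writes Newick. It assumes the tree is given as a
list of (left, right, distance) triples in merge order, where non-negative
indices are leaves and index -(i+1) refers to node i, which is the numbering
I understand Bio.Cluster to use. It also assumes each node sits at half its
linkage distance, which only gives sensible (non-negative) branch lengths
for methods such as average linkage. The name cluster_nodes_to_newick is
made up for this example.

```python
def cluster_nodes_to_newick(nodes, names):
    """Convert (left, right, distance) merge triples to a Newick string.

    nodes: list of triples in merge order (hypothetical stand-in for the
    nodes of a Bio.Cluster tree); names: labels for leaves 0..n-1.
    """
    def subtree(index):
        # Returns (newick text, height of this node above the leaves)
        if index >= 0:
            return names[index], 0.0
        left, right, distance = nodes[-index - 1]
        # Assumption: a node's height is half its linkage distance
        height = distance / 2.0
        parts = []
        for child in (left, right):
            text, child_height = subtree(child)
            parts.append("%s:%g" % (text, height - child_height))
        return "(%s)" % ",".join(parts), height

    # The last merge is the root, i.e. cluster index -len(nodes)
    text, _ = subtree(-len(nodes))
    return text + ";"


example = cluster_nodes_to_newick([(0, 1, 2.0), (2, -1, 6.0)],
                                  ["A", "B", "C"])
```

With a real Bio.Cluster tree, the triples could presumably be collected from
each node's left, right and distance attributes before calling this.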
