[Biopython-dev] Rerooting a tree with Bio.Phylo

Fri Mar 26 11:38:56 UTC 2010

Hi Eric and Peter,

On 25 March 2010 20:27, Eric Talevich <eric.talevich at gmail.com> wrote:

> On Wed, Mar 24, 2010 at 11:16 AM, Peter <biopython at maubp.freeserve.co.uk> wrote:
>
> > On Mon, Mar 22, 2010 at 9:48 PM, Peter <biopython at maubp.freeserve.co.uk>
> > wrote:
> > >> In Bio.Nexus, would you normally have handled this with the method
> > >> root_with_outgroup? I intend to port that method to Bio.Phylo once I
> > >> understand it, but the existing code has been kind of hard for me to
> > >> figure out.
> > >
> > > I've just got a quick answer for you now tonight: I've not used Bio.Nexus
> > > to try and do this - I'll try to get back to you in more depth tomorrow.
> >
> > Here is an example using Bio.Nexus.Trees to reroot with an outgroup.
> >
> > [...]
> >
> > In my example, the outgroup originally has a branch length of 0.00145.
> > A new root node was created (here #12) with two children, one with a
> > branch length of zero (#5, the outgroup) and one with the full length
> > (#3, branch length 0.00145). Essentially this new root node (#12) and
> > the outgroup (#5) are now both right at the base of the tree.
> >
> > There is more than one way to do this though. For example FigTree
> > seems to introduce a new root node half way along the outgroup branch
> > (replacing the edge with two edges of half its length). This way the
> > new root node represents the last common ancestor of the outgroup and
> > the ingroup (everything else), although putting it at the mid point is
> > perhaps a little arbitrary.
>

Yes, what FigTree is doing is arbitrary: it introduces information into the
displayed tree that is not present, and it is open to misinterpretation. But
it does so purely for the graphical presentation, because you are trying
to root on a terminal branch. Thankfully, if you save this tree in FigTree,
it writes out the original trifurcating tree.

> I looked up this section in *Inferring Phylogenies* and found no decisive
> statement on how it should be done. I gathered:
>
> 1. The new root can be placed anywhere along the branch between the
> outgroup and its ancestor.
>

In biological reality the root may be anywhere along that branch but, in the
absence of further information, the question is where you place it in
this situation, i.e. when rooting (making a bifurcating root node) on that
terminal branch.
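
For illustration only, the three placements discussed in this thread differ
just in how the original outgroup branch length is divided between the two
edges below the new root. The sketch below is plain Python, not Bio.Nexus or
Bio.Phylo code; the function name split_outgroup_branch is made up here, and
0.00145 is the outgroup branch length from Peter's example.

```python
def split_outgroup_branch(length, fraction_to_outgroup):
    """Return (outgroup_length, ingroup_length) for a new root node placed
    somewhere along the terminal outgroup branch.

    fraction_to_outgroup is the share of the original branch that stays
    on the outgroup side of the new root (hypothetical parameter).
    """
    if not 0.0 <= fraction_to_outgroup <= 1.0:
        raise ValueError("fraction_to_outgroup must be between 0 and 1")
    outgroup_length = length * fraction_to_outgroup
    return outgroup_length, length - outgroup_length


original = 0.00145  # outgroup branch length in the example tree

# Behaviour described for Bio.Nexus: outgroup gets 0, ingroup the full length
nexus_style = split_outgroup_branch(original, 0.0)

# Behaviour described for FigTree's display: root half way along the branch
midpoint = split_outgroup_branch(original, 0.5)

# Eric's suggestion: outgroup keeps its length, ingroup branch gets 0
opposite = split_outgroup_branch(original, 1.0)

for name, (out_len, in_len) in [("nexus_style", nexus_style),
                                ("midpoint", midpoint),
                                ("opposite", opposite)]:
    print(name, "outgroup:", out_len, "ingroup:", in_len)
```

In all three cases the two lengths add up to the original branch length, so
the distance between the outgroup and the ingroup is unchanged; only the
position of the root along that branch differs.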

> 2. Another way to root a tree is by assuming a molecular clock -- place the
> root so that the distances to all the tips are roughly equal.
>
> So FigTree and Bio.Nexus are both doing reasonable things. (PyCogent doesn't
> seem to support this operation, as far as I can tell.)
>
> Thinking of this operation as extending the tree further back in time, where
> the (monophyletic) tree without the outgroup is a sub-clade of the larger
> rooted tree we're introducing -- it makes sense to me that the branch length
> of the outgroup should represent the total evolutionary distance from the
> root of the monophyletic sub-clade to the outgroup.

Yes, the outgroup taxa are included in analyses to orientate the
relationships (including branch lengths) of the ingroup. In this case, with a
single outgroup taxon, you do not have a very good estimate of the ingroup
branch length (it's presumably not the immediate ancestor of the ingroup), but
it's all you've got given the way the experiment was set up - including more
outgroups would have been a good idea.

> Based on that, I'm tempted to do the opposite of Bio.Nexus,

Curious, because given that, I think Bio.Nexus is doing the right thing ;) By
using this function you are rooting (making a dichotomous root node) using
an outgroup (1 taxon in this case), and the biological interpretation is
that the length belongs to the ingroup.

> letting the outgroup keep its original branch length, and assigning a
> length of 0 to the branch leading to the remaining sub-clade.  Then by
> default we get something resembling a trifurcating root, and the user can
> shift the actual location of the root further back without too much
> difficulty.
>

I don't understand what you are getting at here...

Other points:

The way that FigTree displays the rooted tree from root_with_outgroup() is
how I would expect the tree to be presented if you only had a single
outgroup taxon.

There is a case to be made for not making a dichotomous root, and instead
making the nearest trifurcating node to the designated outgroup the root
node - this is what PAUP does (it won't write a dichotomously rooted tree
even if you tell it to root it).
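
As a rough sketch of that alternative (plain Python again, not PAUP's or
Biopython's implementation): if the unrooted tree is held as an adjacency
map, "rooting" just means choosing the internal node next to the outgroup
as the starting point when writing the tree out. The helper names, the
labels n1/n2/A/B/C/Out and all branch lengths other than 0.00145 are
invented for the example.

```python
def add_edge(adj, a, b, length):
    """Record an undirected branch between nodes a and b."""
    adj.setdefault(a, {})[b] = length
    adj.setdefault(b, {})[a] = length


def nearest_internal_node(adj, outgroup):
    """Return the single node the terminal outgroup is attached to."""
    neighbours = list(adj[outgroup])
    if len(neighbours) != 1:
        raise ValueError("outgroup must be a terminal (leaf) node")
    return neighbours[0]


def to_newick(adj, node, parent=None):
    """Write the subtree below node, walking away from parent."""
    children = [n for n in adj[node] if n != parent]
    if not children:
        return node  # a leaf: just its label
    parts = ["%s:%g" % (to_newick(adj, child, node), adj[node][child])
             for child in children]
    return "(" + ",".join(parts) + ")"


# Hypothetical unrooted tree: n1 and n2 are unnamed internal nodes
adj = {}
add_edge(adj, "n1", "Out", 0.00145)
add_edge(adj, "n1", "A", 0.1)
add_edge(adj, "n1", "n2", 0.05)
add_edge(adj, "n2", "B", 0.2)
add_edge(adj, "n2", "C", 0.3)

root = nearest_internal_node(adj, "Out")
newick = to_newick(adj, root) + ";"
print(newick)
```

No new node is created and no branch length is split, so the root written
out has three descendant lineages and the outgroup keeps its full length.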

I think the whole problem stems from only having a single outgroup (which,
when you root to it, ends up 'looking' like the immediate ancestor of the
ingroup). Typically, you would include multiple outgroups and present/display
the tree with a trifurcating root node, one of whose lineages is the ingroup
- unless you are using a non-reversible model you don't need dichotomously
rooted trees.

Cheers, C.

--