[Biopython] Phylo: rerooting a tree with a terminal node

Robert Beiko beiko at cs.dal.ca
Thu Jan 13 11:48:32 EST 2011


Hi Eric,

I applied the fix and many of the terminal rootings now work. Thanks!

I am still getting errors on a smaller subset of trees, though. The 
simplest example is this one (in a file called Example1.tre):

(A,B,(C,D));

I have modified test.py to do things slightly differently:

--------------------------

import io
import sys
from Bio import Phylo

REROOT_INTERNAL = 0  # set to 1 to reroot on an internal node first

infile = 'Example1.tre'
trees = Phylo.parse(infile, 'newick')

for tree in trees:
    print("\nInitial tree:")
    Phylo.write(tree, sys.stdout, 'newick')

    if REROOT_INTERNAL == 1:
        for iNode in tree.get_nonterminals():
            if len(tree.get_path(iNode)) > 1:  # skip the root and its children
                tree.root_with_outgroup(iNode)
                break

        print("\nRerooted with nice internal node:")
        Phylo.write(tree, sys.stdout, 'newick')

    leafList = tree.get_terminals()
    print("\nAttempting to root on terminal " + leafList[0].name)
    tree.root_with_outgroup(leafList[0])

    print("Rerooted on terminal:")
    Phylo.write(tree, sys.stdout, 'newick')

-------------------------------

[Apologies for all the print statements and C-like constants]

If REROOT_INTERNAL is set to 0, then we go right to the 'leafList = 
tree.get_terminals()' line and get the following error:

Initial tree:
(A:1.00000,B:1.00000,(C:1.00000,D:1.00000)0.00000:1.00000)0.00000:1.00000;

Attempting to root on terminal A
Traceback (most recent call last):
  File "C:\Projects\10000\Phylogenomics\TestTrees\test.py", line 25, in <module>
    tree.root_with_outgroup(leafList[0])
  File "C:\Python26\lib\site-packages\Bio\Phylo\BaseTree.py", line 768, in root_with_outgroup
     parent = outgroup_path.pop(-2)
IndexError: pop index out of range

which I assume occurs because the terminal node (A) has the root as its 
immediate parent, because it is one leaf of an initial trifurcation, or 
both. Note that ((A,B),(C,D)); works fine in this case.
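
A minimal sketch of the suspected mechanism, assuming get_path returns the clades from just below the root down to the target, with the root itself excluded. This is a toy model built from nested lists, not Bio.Phylo code; path_to and pop_parent are hypothetical names used only for illustration. A leaf attached directly to the root gets a one-entry path, so pop(-2) has nothing to remove:

```python
# Toy model only: nested lists stand in for clades, strings for leaves.
def path_to(tree, target):
    """Return the nodes from just below the root down to target (root excluded)."""
    for child in tree:
        if child == target:
            return [child]
        if isinstance(child, list):
            sub_path = path_to(child, target)
            if sub_path is not None:
                return [child] + sub_path
    return None  # target not found under this node


def pop_parent(path):
    """Mimic the failing call: remove and return the second-to-last path entry."""
    try:
        return path.pop(-2)
    except IndexError as err:
        return "IndexError: %s" % err


trifurcating = ["A", "B", ["C", "D"]]    # (A,B,(C,D));
bifurcating = [["A", "B"], ["C", "D"]]   # ((A,B),(C,D));

short_path = path_to(trifurcating, "A")  # one entry: A itself
long_path = path_to(bifurcating, "A")    # two entries: (A,B), then A

print(pop_parent(list(short_path)))
print(pop_parent(list(long_path)))
```

In the bifurcating tree the path to A has two entries, so the same pop succeeds, which would be consistent with ((A,B),(C,D)); working.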

Setting REROOT_INTERNAL to 1 is my attempt to get around this problem by 
first rooting on a 'safe' internal node, and then rooting on the 
terminal. On many of the larger trees I am working with, this solves the 
problem. But in the case of the tree above, it seems that the original 
trifurcation remains in place. A few of the larger trees I have tested 
also retain this trifurcation, even if branches are moved around. 
Setting REROOT_INTERNAL to 1 gives me the following output:

Initial tree:
(A:1.00000,B:1.00000,(C:1.00000,D:1.00000)0.00000:1.00000)0.00000:1.00000;

Rerooted with nice internal node:
(A:1.00000,B:1.00000,(C:1.00000,D:1.00000)0.00000:1.00000)0.00000:1.00000;

Attempting to root on terminal A
Traceback (most recent call last):
  File "C:\Projects\10000\Phylogenomics\TestTrees\test.py", line 25, in <module>
    tree.root_with_outgroup(leafList[0])
  File "C:\Python26\lib\site-packages\Bio\Phylo\BaseTree.py", line 768, in root_with_outgroup
     parent = outgroup_path.pop(-2)
IndexError: pop index out of range
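
One possible reading of the unchanged "Rerooted" output, again assuming get_path excludes the root and ends with the target: in this tree no internal node lies more than one step below the root, so the "> 1" test never passes and no rerooting is attempted at all. A toy check with nested lists (not Bio.Phylo itself; internal_path_lengths is a hypothetical helper):

```python
def internal_path_lengths(tree, depth=0):
    """Return the path length (root excluded) of every internal node in a nested-list toy tree."""
    lengths = [depth]  # the node we are standing on
    for child in tree:
        if isinstance(child, list):  # lists are internal nodes, strings are leaves
            lengths.extend(internal_path_lengths(child, depth + 1))
    return lengths


example = ["A", "B", ["C", "D"]]          # (A,B,(C,D));
lengths = internal_path_lengths(example)  # the root, then the (C,D) clade
qualifying = [n for n in lengths if n > 1]

print(lengths)
print(qualifying)
```

If qualifying is empty, the loop over internal nodes falls through without ever calling root_with_outgroup.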

Now, if I change the line

for iNode in tree.get_nonterminals():

to

for iNode in tree.get_terminals():

Then I get the desired behaviour. This succeeds as a workaround as long 
as I check to make sure that I'm not rooting again on the terminal that 
I just rooted on.
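
A sketch of that workaround as a pair of helpers. This is a reading of the description above, not the exact script: the names pick_safe_terminal and root_on_terminal are hypothetical, and the code relies only on the get_terminals, get_path and root_with_outgroup methods already used in test.py.

```python
def pick_safe_terminal(tree, target):
    """Return a terminal other than target that sits more than one step below the root.

    Returns None if the tree has no such terminal.
    """
    for leaf in tree.get_terminals():
        if leaf is target:
            continue  # never pre-root on the terminal wanted as the final outgroup
        if len(tree.get_path(leaf)) > 1:
            return leaf
    return None


def root_on_terminal(tree, target):
    """Pre-root on a 'safe' terminal if one exists, then root on target."""
    safe = pick_safe_terminal(tree, target)
    if safe is not None:
        tree.root_with_outgroup(safe)
    tree.root_with_outgroup(target)
```

With the script above this would be called as root_on_terminal(tree, leafList[0]). Whether the pre-rooting step is sufficient for every topology is untested here.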

Best wishes,
Rob

On 12/01/2011 9:55 PM, Eric Talevich wrote:
> Hi Rob,
>
> This was an outright bug in Bio.Phylo, so thanks again for reporting 
> it. I've pushed a fix to GitHub:
> https://github.com/biopython/biopython/commit/1a8a39b6d24a9a4b9088255327b0f2fd12c19a09
>
> For your own work, you can get this fix by either:
> (a) checking out a development copy of Biopython from GitHub (the 
> master branch is fairly safe) and reinstalling, or
> (b) applying just this fix to your copy of Bio.Phylo in-place -- i.e. 
> editing your existing Biopython installation. You can replace the file 
> Bio/Phylo/BaseTree.py with the one from GitHub without any ill effects.
>
> Cheers,
> Eric
>
> On Wed, Jan 12, 2011 at 1:46 PM, Robert Beiko <beiko at cs.dal.ca> wrote:
>
>     Hi Eric,
>
>     Thank you very much for your quick reply.
>
>     Indeed the full script is doing something much more interesting
>     (rolling up in-paralogs with attempts at alternative rootings),
>     but this is my attempt to cut out all of the other things I might
>     have done wrong :^>
>
>     The loop is crashing the first time I try it. Indeed, the
>     following variation fails as well:
>
>
>     import io
>     import sys
>     from Bio import Phylo
>
>     infile = 'Example1.tre'
>     trees = Phylo.parse(infile,'newick')
>
>     for tree in trees:
>         leafList = tree.get_terminals()
>         tree.root_with_outgroup(leafList[0])
>
>     ----
>
>     again, the same code with internals rather than terminals works fine.
>
>     Best wishes,
>     Rob
>


