[Biopython] Phylo: rerooting a tree with a terminal node
Robert Beiko
beiko at cs.dal.ca
Thu Jan 13 16:48:32 UTC 2011
Hi Eric,
I applied the fix and many of the terminal rootings now work. Thanks!
I am still getting errors on a smaller subset of trees, though. The
simplest example is this one (in a file called Example1.tre):
(A,B,(C,D));
I have modified test.py to do things slightly differently:
--------------------------
import io
import sys
from Bio import Phylo

REROOT_INTERNAL = 0

infile = 'Example1.tre'
trees = Phylo.parse(infile,'newick')

for tree in trees:
    print "\nInitial tree:"
    Phylo.write(tree,sys.stdout,'newick')

    if REROOT_INTERNAL == 1:
        for iNode in tree.get_nonterminals():
            if len(tree.get_path(iNode)) > 1:
                tree.root_with_outgroup(iNode)
                break
        print "\nRerooted with nice internal node:"
        Phylo.write(tree,sys.stdout,'newick')

    leafList = tree.get_terminals()
    print "\nAttempting to root on terminal " + leafList[0].name
    tree.root_with_outgroup(leafList[0])
    print "Rerooted on terminal:"
    Phylo.write(tree,sys.stdout,'newick')
-------------------------------
[Apologies for all the print statements and C-like constants]
If REROOT_INTERNAL is set to 0, then we go right to the 'leafList =
tree.get_terminals()' line and get the following error:
Initial tree:
(A:1.00000,B:1.00000,(C:1.00000,D:1.00000)0.00000:1.00000)0.00000:1.00000;
Attempting to root on terminal A
Traceback (most recent call last):
  File "C:\Projects\10000\Phylogenomics\TestTrees\test.py", line 25, in <module>
    tree.root_with_outgroup(leafList[0])
  File "C:\Python26\lib\site-packages\Bio\Phylo\BaseTree.py", line 768, in root_with_outgroup
    parent = outgroup_path.pop(-2)
IndexError: pop index out of range
which I assume occurs either because the terminal node (A) has the root
as its immediate parent, and/or because it's one leaf from an initial
trifurcation. Note that ((A,B),(C,D)); works fine in this case.
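To illustrate that guess without Biopython, here is a toy sketch. It assumes the path to a node excludes the root, which is what the len(tree.get_path(iNode)) > 1 check in the script relies on. The nested-tuple trees and the helpers find_path and parent_via_pop are made up for illustration; they are not Bio.Phylo code.

```python
# Toy model only: trees are nested tuples, leaves are strings.
# find_path and parent_via_pop are hypothetical helpers, not Bio.Phylo API.

def find_path(node, target):
    """Return the nodes from below `node` down to `target` (root excluded)."""
    if isinstance(node, tuple):
        for child in node:
            if child == target:
                return [child]
            sub = find_path(child, target)
            if sub is not None:
                return [child] + sub
    return None


def parent_via_pop(path):
    """Mimic the failing line: take the second-to-last element of the path."""
    return list(path).pop(-2)


trifurcating = ("A", "B", ("C", "D"))     # (A,B,(C,D));
bifurcating = (("A", "B"), ("C", "D"))    # ((A,B),(C,D));

path_tri = find_path(trifurcating, "A")   # only A itself: one element
path_bi = find_path(bifurcating, "A")     # the (A,B) clade, then A

# Two elements: the parent can be popped.
parent_bi = parent_via_pop(path_bi)

# One element: pop(-2) has nothing to return and raises IndexError.
try:
    parent_via_pop(path_tri)
    tri_failed = False
except IndexError:
    tri_failed = True
```

In this toy model the one-element path is what triggers the IndexError, which matches the difference I see between the two Newick strings.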
Setting REROOT_INTERNAL to 1 is my attempt to get around this problem by
first rooting on a 'safe' internal node, and then rooting on the
terminal. On many of the larger trees I am working with, this solves the
problem. But in the case of the tree above, it seems that the original
trifurcation remains in place. A few of the larger trees I have tested
also retain this trifurcation, even if branches are moved around.
Setting REROOT_INTERNAL to 1 gives me the following output:
Initial tree:
(A:1.00000,B:1.00000,(C:1.00000,D:1.00000)0.00000:1.00000)0.00000:1.00000;
Rerooted with nice internal node:
(A:1.00000,B:1.00000,(C:1.00000,D:1.00000)0.00000:1.00000)0.00000:1.00000;
Attempting to root on terminal A
Traceback (most recent call last):
  File "C:\Projects\10000\Phylogenomics\TestTrees\test.py", line 25, in <module>
    tree.root_with_outgroup(leafList[0])
  File "C:\Python26\lib\site-packages\Bio\Phylo\BaseTree.py", line 768, in root_with_outgroup
    parent = outgroup_path.pop(-2)
IndexError: pop index out of range
Now, if I change the line

    for iNode in tree.get_nonterminals():

to

    for iNode in tree.get_terminals():

then I get the desired behaviour. This succeeds as a workaround as long
as I check to make sure that I'm not rooting again on the terminal that
I just rooted on.
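A rough sketch of that check, in case it is useful. The helper name first_safe_terminal is my own, and StubTree is only a stand-in so the snippet runs without Biopython; the helper relies on nothing beyond get_terminals() and get_path(), the same two calls used in test.py.

```python
def first_safe_terminal(tree, avoid):
    """Return the first terminal other than `avoid` whose path from the
    root has more than one element, or None if there is no such terminal."""
    for leaf in tree.get_terminals():
        if leaf is not avoid and len(tree.get_path(leaf)) > 1:
            return leaf
    return None


class StubTree(object):
    """Hypothetical stand-in for a parsed tree; not part of Bio.Phylo."""

    def __init__(self, paths):
        self._paths = paths  # list of (leaf, root-exclusive path) pairs

    def get_terminals(self):
        return [leaf for leaf, _ in self._paths]

    def get_path(self, target):
        for leaf, path in self._paths:
            if leaf is target:
                return path
        raise ValueError("unknown leaf")


# Stand-in for (A,B,(C,D)); where "CD" names the internal (C,D) clade.
example = StubTree([("A", ["A"]), ("B", ["B"]),
                    ("C", ["CD", "C"]), ("D", ["CD", "D"])])
target = example.get_terminals()[0]
helper = first_safe_terminal(example, target)

# With a real tree the intended use would be roughly:
#   if helper is not None:
#       tree.root_with_outgroup(helper)
#   tree.root_with_outgroup(target)
```

For the example tree this picks C as the intermediate outgroup before rooting on A.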
Best wishes,
Rob
On 12/01/2011 9:55 PM, Eric Talevich wrote:
> Hi Rob,
>
> This was an outright bug in Bio.Phylo, so thanks again for reporting
> it. I've pushed a fix to GitHub:
> https://github.com/biopython/biopython/commit/1a8a39b6d24a9a4b9088255327b0f2fd12c19a09
>
> For your own work, you can get this fix by either:
> (a) checking out a development copy of Biopython from GitHub (the
> master branch is fairly safe) and reinstalling, or
> (b) applying just this fix to your copy of Bio.Phylo in-place -- i.e.
> editing your existing Biopython installation. You can replace the file
> Bio/Phylo/BaseTree.py with the one from GitHub without any ill effects.
>
> Cheers,
> Eric
>
> On Wed, Jan 12, 2011 at 1:46 PM, Robert Beiko <beiko at cs.dal.ca> wrote:
>
> Hi Eric,
>
> Thank you very much for your quick reply.
>
> Indeed the full script is doing something much more interesting
> (rolling up in-paralogs with attempts at alternative rootings),
> but this is my attempt to cut out all of the other things I might
> have done wrong :^>
>
> The loop is crashing the first time I try it. Indeed, the
> following variation fails as well:
>
>
> import io
> import sys
> from Bio import Phylo
>
> infile = 'Example1.tre'
> trees = Phylo.parse(infile,'newick')
>
> for tree in trees:
>     leafList = tree.get_terminals()
>     tree.root_with_outgroup(leafList[0])
>
> ----
>
> again, the same code with internals rather than terminals works fine.
>
> Best wishes,
> Rob
>