[Biopython] Phylo: rerooting a tree with a terminal node
Eric Talevich
eric.talevich at gmail.com
Fri Jan 14 23:50:58 EST 2011
Hi Rob,
This should work now:
https://github.com/biopython/biopython/commit/e9cfcc3680a5b5692f91e560ea08e51515c9c757
I also added another unit test based on your example -- looping through all
the nodes in a few contrived trees, rerooting at each node and testing that
the total tree length doesn't change. It passes now. So, thanks for the
test!
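
A test of that shape can be sketched without Biopython. The Node class
and the walk, total_length, path_to and reroot helpers below are
hypothetical stand-ins, not Bio.Phylo's implementation; in Bio.Phylo the
corresponding calls would be tree.root_with_outgroup() and
tree.total_branch_length(). The sketch assumes the root's own branch
length is zero, and it makes no attempt to collapse a leftover two-way
node at the old root.

```python
class Node:
    """Hypothetical minimal clade: name, branch length to parent, children."""

    def __init__(self, name, length=0.0, children=None):
        self.name = name
        self.length = length
        self.children = list(children or [])


def walk(node):
    """Yield node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def total_length(root):
    """Sum the branch lengths of every node in the tree."""
    return sum(n.length for n in walk(root))


def path_to(root, target):
    """Return [root, ..., target], or None if target is not under root."""
    if root is target:
        return [root]
    for child in root.children:
        sub = path_to(child, target)
        if sub is not None:
            return [root] + sub
    return None


def reroot(root, target):
    """Make target the new root by reversing each branch on the path to it."""
    path = path_to(root, target)
    for parent, child in zip(path, path[1:]):
        parent.children.remove(child)
        child.children.append(parent)
        # The branch between the two nodes now hangs above the former parent.
        parent.length, child.length = child.length, 0.0
    return target


# The tree (A,B,(C,D)); with every branch length set to 1.0.
cd = Node("CD", 1.0, [Node("C", 1.0), Node("D", 1.0)])
root = Node("root", 0.0, [Node("A", 1.0), Node("B", 1.0), cd])

expected = total_length(root)
for target in list(walk(root)):  # snapshot the nodes before rerooting
    root = reroot(root, target)
    assert abs(total_length(root) - expected) < 1e-9
```

Rerooting only changes which end of each branch counts as the parent, so
the summed length is a cheap invariant to check at every node.
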
Cheers,
Eric
On Thu, Jan 13, 2011 at 11:48 AM, Robert Beiko <beiko at cs.dal.ca> wrote:
> Hi Eric,
>
> I applied the fix and many of the terminal rootings now work. Thanks!
>
> I am still getting errors on a smaller subset of trees, though. The
> simplest example is this one (in a file called Example1.tre):
>
> (A,B,(C,D));
>
> I have modified test.py to do things slightly differently:
>
> --------------------------
>
>
> import io
> import sys
> from Bio import Phylo
>
> REROOT_INTERNAL = 0
>
>
> infile = 'Example1.tre'
> trees = Phylo.parse(infile, 'newick')
>
> for tree in trees:
>     print("\nInitial tree:")
>     Phylo.write(tree, sys.stdout, 'newick')
>
>     if REROOT_INTERNAL == 1:
>         for iNode in tree.get_nonterminals():
>             if len(tree.get_path(iNode)) > 1:
>                 tree.root_with_outgroup(iNode)
>                 break
>
>         print("\nRerooted with nice internal node:")
>         Phylo.write(tree, sys.stdout, 'newick')
>
>     leafList = tree.get_terminals()
>     print("\nAttempting to root on terminal " + leafList[0].name)
>
>     tree.root_with_outgroup(leafList[0])
>
>     print("Rerooted on terminal:")
>     Phylo.write(tree, sys.stdout, 'newick')
>
> -------------------------------
>
> [Apologies for all the print statements and C-like constants]
>
> If REROOT_INTERNAL is set to 0, then we go right to the 'leafList =
> tree.get_terminals()' line and get the following error:
>
> Initial tree:
> (A:1.00000,B:1.00000,(C:1.00000,D:1.00000)0.00000:1.00000)0.00000:1.00000;
>
> Attempting to root on terminal A
>
> Traceback (most recent call last):
>   File "C:\Projects\10000\Phylogenomics\TestTrees\test.py", line 25, in <module>
>     tree.root_with_outgroup(leafList[0])
>   File "C:\Python26\lib\site-packages\Bio\Phylo\BaseTree.py", line 768, in root_with_outgroup
>     parent = outgroup_path.pop(-2)
> IndexError: pop index out of range
>
> which I assume occurs either because the terminal node (A) has the root as
> its immediate parent, and/or because it's one leaf from an initial
> trifurcation. Note that ((A,B),(C,D)); works fine in this case.
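
The failing line in the traceback can be reproduced with plain lists. This
is only a sketch: it assumes outgroup_path is what tree.get_path() returns
for the outgroup, i.e. the clades from just below the root down to the
target, with the root itself excluded. The strings are hypothetical
stand-ins for clade objects; only the list lengths matter.

```python
# (A,B,(C,D)); A hangs directly off the root, so the path is just [A].
path_in_trifurcation = ["A"]
# ((A,B),(C,D)); one internal node sits between the root and A.
path_in_bifurcation = ["(A,B)", "A"]

# Works: removes and returns the second-to-last element, A's parent.
parent = path_in_bifurcation.pop(-2)

try:
    # A one-element list has no second-to-last element to remove.
    path_in_trifurcation.pop(-2)
    error_message = None
except IndexError as err:
    error_message = str(err)

print(parent)
print(error_message)
```

Under that assumption, any outgroup attached directly to the root gives a
one-element path, which would match the terminal-under-the-root case
described above.
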
>
> Setting REROOT_INTERNAL to 1 is my attempt to get around this problem by
> first rooting on a 'safe' internal node, and then rooting on the terminal.
> On many of the larger trees I am working with, this solves the problem. But
> in the case of the tree above, it seems that the original trifurcation
> remains in place. A few of the larger trees I have tested also retain this
> trifurcation, even if branches are moved around. Setting REROOT_INTERNAL to
> 1 gives me the following output:
>
> Initial tree:
> (A:1.00000,B:1.00000,(C:1.00000,D:1.00000)0.00000:1.00000)0.00000:1.00000;
>
> Rerooted with nice internal node:
> (A:1.00000,B:1.00000,(C:1.00000,D:1.00000)0.00000:1.00000)0.00000:1.00000;
>
> Attempting to root on terminal A
>
> Traceback (most recent call last):
>   File "C:\Projects\10000\Phylogenomics\TestTrees\test.py", line 25, in <module>
>     tree.root_with_outgroup(leafList[0])
>   File "C:\Python26\lib\site-packages\Bio\Phylo\BaseTree.py", line 768, in root_with_outgroup
>     parent = outgroup_path.pop(-2)
> IndexError: pop index out of range
>
> Now, if I change the line
>
> for iNode in tree.get_nonterminals():
>
> to
>
> for iNode in tree.get_terminals():
>
> Then I get the desired behaviour. This succeeds as a workaround as long as
> I check to make sure that I'm not rooting again on the terminal that I just
> rooted on.
>
> Best wishes,
> Rob
>
>
> On 12/01/2011 9:55 PM, Eric Talevich wrote:
>
> Hi Rob,
>
> This was an outright bug in Bio.Phylo, so thanks again for reporting it.
> I've pushed a fix to GitHub:
>
> https://github.com/biopython/biopython/commit/1a8a39b6d24a9a4b9088255327b0f2fd12c19a09
>
> For your own work, you can get this fix by either:
> (a) checking out a development copy of Biopython from GitHub (the master
> branch is fairly safe) and reinstalling, or
> (b) applying just this fix to your copy of Bio.Phylo in-place -- i.e.
> editing your existing Biopython installation. You can replace the file
> Bio/Phylo/BaseTree.py with the one from GitHub without any ill effects.
>
> Cheers,
> Eric
>
> On Wed, Jan 12, 2011 at 1:46 PM, Robert Beiko <beiko at cs.dal.ca> wrote:
>
>> Hi Eric,
>>
>> Thank you very much for your quick reply.
>>
>> Indeed the full script is doing something much more interesting (rolling
>> up in-paralogs with attempts at alternative rootings), but this is my
>> attempt to cut out all of the other things I might have done wrong :^>
>>
>> The loop is crashing the first time I try it. Indeed, the following
>> variation fails as well:
>>
>>
>> import io
>> import sys
>> from Bio import Phylo
>>
>> infile = 'Example1.tre'
>> trees = Phylo.parse(infile, 'newick')
>>
>> for tree in trees:
>>     leafList = tree.get_terminals()
>>     tree.root_with_outgroup(leafList[0])
>>
>> ----
>>
>> Again, the same code with internal nodes rather than terminals works fine.
>>
>> Best wishes,
>> Rob
>>
>>
>