[Bioperl-l] Tree Path
barry.m.dancis at gsk.com
Fri Nov 18 12:47:36 EST 2005
Hi--
I would like to have a string to represent the node path of a leaf
on a phylogenetic tree such that when the paths are sorted, they are arranged
in the same order as displayed by Forrester or TreeView. The code below
produces this output:
leaf, creation id, path
iiiiii, 14, 11:
gggggg, 11, 121:
hhhhhh, 12, 122:
aaaaaa, 0, 21111:
bbbbbb, 1, 21112:
cccccc, 3, 2112:
dddddd, 5, 212:
eeeeee, 7, 221:
ffffff, 8, 222:
for the tree:
(((((aaaaaa:0.03000,bbbbbb:0.06091):0.04740,cccccc:0.12143):0.23166,dddddd:0.36034):0.00914,(eeeeee:0.30561,ffffff:0.36105):0.01494):0.01961,((gggggg:0.30365,hhhhhh:0.33271):0.02358,iiiiii:0.32490):0.02788);
Notice that the order of the output is the same as the order in the dnd
file, except that the last major branch of 3 nodes is shown first.
If I create the paths using the order sorted by creation id:
@nodes = sort { $a->_creation_id <=> $b->_creation_id; }
$tree->each_Descendent;
I get:
leaf, creation id, path
aaaaaa, 0, 11111:
bbbbbb, 1, 11112:
cccccc, 3, 1112:
dddddd, 5, 112:
eeeeee, 7, 121:
ffffff, 8, 122:
gggggg, 11, 211:
hhhhhh, 12, 212:
iiiiii, 14, 22:
which is now the same as the order in the dnd file.
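For illustration, the same numbering can be reproduced without BioPerl on a plain nested-array copy of the example tree. This is only a sketch: the nested array refs, the name leaf_paths, and the empty separator are assumptions made for the example and are not part of the script at the end of this message.

```perl
use strict;
use warnings;

# The example tree as nested array refs (a hypothetical stand-in for a
# Bio::Tree object). A plain string is a leaf; children are in file order.
my $tree = [
    [ [ [ [ 'aaaaaa', 'bbbbbb' ], 'cccccc' ], 'dddddd' ], [ 'eeeeee', 'ffffff' ] ],
    [ [ 'gggggg', 'hhhhhh' ], 'iiiiii' ],
];

# Return [leaf, path] pairs. Each path digit is the 1-based index of the
# child taken at that level on the way down from the root.
sub leaf_paths {
    my ($node, $path) = @_;
    return ( [ $node, $path ] ) unless ref $node;
    my $i = 1;
    return map { leaf_paths($_, $path . $i++) } @$node;
}

# Sorting the paths as strings lists the leaves in file order.
print "$_->[0], $_->[1]:\n"
    for sort { $a->[1] cmp $b->[1] } leaf_paths($tree, '');
```

Note that a plain string sort only works here because every node has fewer than ten children, so each level contributes exactly one digit.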
Unfortunately, Forrester gives the order as:
bbbbbb, 1, 11112:
aaaaaa, 0, 11111:
cccccc, 3, 1112:
dddddd, 5, 112:
ffffff, 8, 122:
eeeeee, 7, 121:
hhhhhh, 12, 212:
gggggg, 11, 211:
iiiiii, 14, 22:
and TreeView gives the order as:
dddddd, 5, 112:
cccccc, 3, 1112:
aaaaaa, 0, 11111:
bbbbbb, 1, 11112:
eeeeee, 7, 121:
ffffff, 8, 122:
iiiiii, 14, 22:
gggggg, 11, 211:
hhhhhh, 12, 212:
As expected, the orders differ only in which branches are flipped, not
because of any fundamental difference in the trees.
For some other trees, TreeView will give the same order as the path. When
there are differences between the path and the location in the displays,
it is difficult to find a leaf on a large tree diagram from the node path.
The following almost reproduces the Forrester order (the i/g/h branch
appears at the top instead of the bottom):
@nodes = $tree->each_Descendent;    # unsorted
if ($nodes[0]->is_Leaf) {
    @nodes = reverse @nodes;
}
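As a rough check of this flip rule, here is a sketch on a plain nested-array copy of the example tree, with the children taken in file order rather than unsorted. The data structure and the names ordered_children and display_paths are assumptions made for the example. For this one tree the sorted paths list the leaves in the same order as the Forrester listing earlier in this message; that is a single data point, not a statement about how Forrester actually orders branches.

```perl
use strict;
use warnings;

# The example tree as nested array refs (hypothetical stand-in for a
# Bio::Tree object). A plain string is a leaf; children are in file order.
my $tree = [
    [ [ [ [ 'aaaaaa', 'bbbbbb' ], 'cccccc' ], 'dddddd' ], [ 'eeeeee', 'ffffff' ] ],
    [ [ 'gggggg', 'hhhhhh' ], 'iiiiii' ],
];

# Flip rule from the snippet above: reverse a node's children when the
# first child is a leaf, otherwise keep them as they are.
sub ordered_children {
    my ($node) = @_;
    my @kids = @$node;
    @kids = reverse @kids unless ref $kids[0];
    return @kids;
}

# Return [leaf, path] pairs, numbering children after the flip rule.
sub display_paths {
    my ($node, $path) = @_;
    return ( [ $node, $path ] ) unless ref $node;
    my $i = 1;
    return map { display_paths($_, $path . $i++) } ordered_children($node);
}

print "$_->[0], $_->[1]:\n"
    for sort { $a->[1] cmp $b->[1] } display_paths($tree, '');
```

Keeping the ordering rule in its own sub means a different rule (for a different viewer) can be swapped in without touching the path numbering.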
My questions are:
How do I need to change my sorting so that the order of the paths is the
same as the order on the display in Forrester and/or TreeView?
Has anyone else done similar things? Are there BioPerl routines to do
this?
Thanks,
Barry
==========================================================================================================
use strict;
use warnings;
use Bio::TreeIO;

# Separator between path elements; assumed empty here, matching the
# sample paths above (it is not defined in the original listing)
my $NODE_SEPARATOR = '';

sub get_phylo_paths {
    my ($treefile) = @_;
    my $treeio = Bio::TreeIO->new( -format => 'newick', -file => $treefile );
    my $tree = $treeio->next_tree;    # get the tree
    my $path = [];
    get_phylo_path( $tree->get_root_node, $path );
}    # end get_phylo_paths

# Recursively print "id, creation id, path:" for every leaf below the node
sub get_phylo_path {
    my ($tree, $ancestor_path) = @_;
    my @nodes;
    if ( $tree->is_Leaf ) {
        # Include ':' at end so that path is treated as a string, not a
        # number, by Excel, Spotfire, etc.
        print $tree->id . ', ' . $tree->_creation_id . ', '
            . join( $NODE_SEPARATOR, @$ancestor_path ) . ":\n";
    }
    else {
        @nodes = $tree->each_Descendent;    # unsorted
        my $i = 1;
        foreach my $node (@nodes) {
            my @path = @$ancestor_path;
            # Adds a number to the path node: either 1 or 2, except for
            # the root node, where there will be a 3 as well
            push @path, $i++;
            get_phylo_path( $node, \@path );
        }
    }
}