[Bioperl-l] Tree Path

Jason Stajich jason.stajich at duke.edu
Sun Nov 20 14:04:04 EST 2005


I just added the ability to request child nodes sorted alphabetically
(available from CVS): call each_Descendent('alpha') or
each_Descendent('revalpha'). You can now also supply -order_by to
Bio::TreeIO so that the order can be specified when writing a tree out.
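
A minimal sketch of how these options might be used, assuming a BioPerl
checkout that includes the change above. The three-leaf tree and the
variable names are invented for illustration and are not from the
original thread.

```perl
use strict;
use warnings;
use Bio::TreeIO;

# Illustrative three-leaf tree (hypothetical, not Barry's tree).
my $newick = "(cc:1,aa:2,bb:3);";
open my $in, '<', \$newick or die "cannot open in-memory newick: $!";

my $treeio = Bio::TreeIO->new(-format => 'newick', -fh => $in);
my $tree   = $treeio->next_tree;
my $root   = $tree->get_root_node;

# Children of the root in alphabetical and reverse-alphabetical order.
my @alpha_ids    = map { $_->id } $root->each_Descendent('alpha');
my @revalpha_ids = map { $_->id } $root->each_Descendent('revalpha');
print "alpha:    @alpha_ids\n";
print "revalpha: @revalpha_ids\n";

# Ask the writer to emit child nodes in alphabetical order.
my $out = Bio::TreeIO->new(-format   => 'newick',
                           -fh       => \*STDOUT,
                           -order_by => 'alpha');
$out->write_tree($tree);
```

The root here has only leaf children, so the ordering is simply by leaf
id; how internal nodes are ranked relative to leaves is left to the
library.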

I'm not sure we can guarantee that the order is the same as the read
order right now, given the way the parser is coded.

The creation id is somewhat arbitrary, as it reflects the parse order,
which is not necessarily a left-to-right reading of the nodes.

So what I don't understand is what the Forrester sort order is based on.
If you can describe that, you can write a sort function to achieve it;
each_Descendent already accepts a code reference, so you can pass in
an arbitrary subroutine to do the sort.  You can always ask TreeView
to redo the layout for your tree so that it is alphabetical or
ladderized.  Is this what you mean?
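
To make the idea concrete, here is a minimal plain-Perl sketch, not
BioPerl code. It uses nested array references as a hypothetical stand-in
for Bio::Tree::Node objects (a leaf is a string, an internal node is an
array ref) and shows how the rule used to order children determines the
path strings. The names %ORDER, leaf_paths, leaf_count and min_leaf are
invented for this example, and the "ladderized" rule (smaller clade
first, ties broken by leaf name) is only one possible convention, not
necessarily what Forrester or TreeView do.

```perl
use strict;
use warnings;

# Barry's tree topology, branch lengths dropped.
my $tree = [ [ [ [ [ 'aaaaaa', 'bbbbbb' ], 'cccccc' ], 'dddddd' ],
               [ 'eeeeee', 'ffffff' ] ],
             [ [ 'gggggg', 'hhhhhh' ], 'iiiiii' ] ];

# Number of leaves below (or at) a node.
sub leaf_count {
    my ($node) = @_;
    return 1 unless ref $node;
    my $n = 0;
    $n += leaf_count($_) for @$node;
    return $n;
}

# Alphabetically smallest leaf name below (or at) a node.
sub min_leaf {
    my ($node) = @_;
    return $node unless ref $node;
    my $min;
    for my $child (@$node) {
        my $name = min_leaf($child);
        $min = $name if !defined $min || $name lt $min;
    }
    return $min;
}

# Each ordering takes a list of children and returns them reordered.
my %ORDER = (
    as_read    => sub { @_ },
    alpha      => sub {
        map  { $_->[1] }
        sort { $a->[0] cmp $b->[0] }
        map  { [ min_leaf($_), $_ ] } @_;
    },
    ladderized => sub {
        map  { $_->[2] }
        sort { $a->[0] <=> $b->[0] || $a->[1] cmp $b->[1] }
        map  { [ leaf_count($_), min_leaf($_), $_ ] } @_;
    },
);

# Return [leaf, path] pairs; the path lists child positions from the root.
sub leaf_paths {
    my ($node, $order, @prefix) = @_;
    return [ $node, join('', @prefix) . ':' ] unless ref $node;
    my $i = 0;
    return map { leaf_paths($_, $order, @prefix, ++$i) } $order->(@$node);
}

for my $name (qw(as_read alpha ladderized)) {
    print "$name:\n";
    printf "  %-8s %s\n", @$_ for leaf_paths($tree, $ORDER{$name});
}
```

With a real tree the same reordering could be applied to the list
returned by $node->each_Descendent before numbering the children, so
that the paths sort in whatever order the viewer uses.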

-jason
On Nov 18, 2005, at 12:47 PM, barry.m.dancis at gsk.com wrote:

> Hi--
>
>         I would like to have a string to represent the node path of a
> leaf on a phylogeny tree such that when the paths are sorted, they are
> arranged in the same order as displayed by Forrester or TreeView. The
> code below produces this output:
>
> leaf          creation  path
> iiiiii,         14,     11:
> gggggg,         11,     121:
> hhhhhh,         12,     122:
> aaaaaa,         0,      21111:
> bbbbbb,         1,      21112:
> cccccc,         3,      2112:
> dddddd,         5,      212:
> eeeeee,         7,      221:
> ffffff,         8,      222:
>
> for the tree:
>
> (((((aaaaaa:0.03000,bbbbbb:0.06091):0.04740,cccccc:0.12143): 
> 0.23166,dddddd:0.36034):0.00914,(eeeeee:0.30561,ffffff:0.36105): 
> 0.01494):0.01961,((gggggg:0.30365,hhhhhh:0.33271):0.02358,iiiiii: 
> 0.32490):0.02788);
>
> Notice that the order of the output is the same as the order in the
> dnd file except that the last major branch of 3 nodes is shown first.
>
> If I create the paths using the order sorted by creation id
>
>   @nodes = sort { $a->_creation_id <=> $b->_creation_id; }
> $tree->each_Descendent;
>
> I get:
>
> leaf          creation  path
> aaaaaa,         0,      11111:
> bbbbbb,         1,      11112:
> cccccc,         3,      1112:
> dddddd,         5,      112:
> eeeeee,         7,      121:
> ffffff,         8,      122:
> gggggg,         11,     211:
> hhhhhh,         12,     212:
> iiiiii,         14,     22:
>
> which is now the same as the order in the dnd file.
>
> Unfortunately, Forrester gives the order as:
>
> bbbbbb, 1, 11112:
> aaaaaa, 0, 11111:
> cccccc, 3, 1112:
> dddddd, 5, 112:
> ffffff, 8, 122:
> eeeeee, 7, 121:
> hhhhhh, 12, 212:
> gggggg, 11, 211:
> iiiiii, 14, 22:
>
> and tree view gives the order as:
>
> dddddd, 5, 112:
> cccccc, 3, 1112:
> aaaaaa, 0, 11111:
> bbbbbb, 1, 11112:
> eeeeee, 7, 121:
> ffffff, 8, 122:
> iiiiii, 14, 22:
> gggggg, 11, 211:
> hhhhhh, 12, 212:
>
>
> As expected, the differences in the orders only represent differences
> caused by flipping the order of branches and not some fundamental
> difference in the trees.
> For some other trees, TreeView will give the same order as the path.
> When there are differences between the path and the location in the
> displays, it is difficult to find a leaf on a large tree diagram from
> the node path.
>
> The following almost reproduces the Forrester order (the igh branch
> appears at the top instead of the bottom):
>
> @nodes = $tree->each_Descendent; #unsorted
>         if ($nodes[0]->is_Leaf) {
>           @nodes = reverse @nodes;
>         }
>
> My questions are:
> How do I need to change my sorting so that the order of the paths is
> the same as the order on the display in Forrester and/or TreeView?
> Has anyone else done similar things? Are there bioperl routines to do
> this?
>
> Thanks,
>
> Barry
>
> ======================================================================
> use Bio::TreeIO;
>
> # Assumed value: the sample output above shows no separator between
> # path elements.
> my $NODE_SEPARATOR = '';
>
> sub get_phylo_paths {
>
>   my ($treefile) = @_;
>
>   my $treeio = Bio::TreeIO->new( -format => 'newick',
>                                  -file   => $treefile );
>   my $tree = $treeio->next_tree;          # get the tree
>   my $path = [];
>   get_phylo_path ($tree->get_root_node, $path);
> } # end get_phylo_paths
>
>
> sub get_phylo_path {
>   my ($tree, $ancestor_path) = @_;
>   my @nodes;
>   if ($tree->is_Leaf) {
>     # Include ':' at the end so that the path is treated as a string,
>     # not a number, by Excel, Spotfire, etc.
>     print $tree->id . ', ' . $tree->_creation_id . ', '
>         . join($NODE_SEPARATOR, @$ancestor_path) . ":\n";
>   }
>   else {
>     @nodes = $tree->each_Descendent;   # unsorted
>     my $i = 1;
>     foreach my $node (@nodes) {
>       my @path = @$ancestor_path;
>       # Add a number to the path for this node: either 1 or 2, except
>       # at the root node, where there can be a 3 as well.
>       push @path, $i++;
>       get_phylo_path ($node, \@path);
>     }
>   }
> }

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12



