[Bioperl-l] Strange behavior of $node->height for scientific notation format branch length

Jason Stajich jason.stajich at gmail.com
Tue Apr 10 06:33:00 UTC 2012


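It also looks like the code that calculates height only accepts branch lengths written as plain decimal numbers - see line 64. Anything that fails that pattern, including scientific notation such as 2e-2, is replaced with 1, which is why your second tree reports a root height of 3: the longest path from the root to a leaf has three branches, each counted as 1. I am not sure why this check is in there, but I guess it was a protection from something that was failing in some other situation.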

62: foreach my $subnode ( $self->each_Descendent ) { 
63:        my $bl = $subnode->branch_length;
64:        $bl = 1 unless (defined $bl && $bl =~ /^\-?\d+(\.\d+)?$/);
65:        my $s = $subnode->height + $bl;
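
To see the effect in isolation, here is a minimal sketch that applies the same pattern to a few of the branch lengths from the example trees. The helper name effective_length is made up for illustration and is not part of BioPerl; only the regular expression is taken from line 64.

```perl
use strict;
use warnings;

# The pattern from line 64: an optional minus sign, digits, and an
# optional fractional part. No exponent is allowed.
my $plain_decimal = qr/^\-?\d+(\.\d+)?$/;

# Hypothetical helper that mimics the fallback on line 64: any value
# that is undefined or fails the pattern is replaced with 1.
sub effective_length {
    my ($bl) = @_;
    return ( defined $bl && $bl =~ $plain_decimal ) ? $bl : 1;
}

# Compare the decimal and scientific-notation spellings.
for my $bl (qw(0.02 2e-2 0.025 2.5e-2)) {
    print "$bl => ", effective_length($bl), "\n";
}
```

The decimal strings come back unchanged, while the scientific-notation strings all come back as 1, so every branch in the second tree contributes 1 to the height.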
 

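You can work around this by forcing all your branch lengths to plain decimal notation after you read the tree in (note that "%f" keeps six decimal places by default, so use something like "%.10f" if your branches are very short):
for my $node ( $tree->get_all_nodes ) {
  my $bl = $node->branch_length;
  next unless defined $bl;  # e.g. the root may have no branch length
  $node->branch_length(sprintf("%f", $bl));
}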

We should think about how we might handle scientific notation branch lengths properly in the code in the future if someone wants to take this on.
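
As a starting point for that discussion, one option might be to replace the regular expression with looks_like_number from the core Scalar::Util module. This is only a sketch under that assumption, not what BioPerl does; the function name branch_length_or_default is hypothetical.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical replacement for the check on line 64: keep any value that
# Perl itself can treat as a number, and fall back to 1 otherwise.
sub branch_length_or_default {
    my ($bl) = @_;
    return 1 unless defined $bl && looks_like_number($bl);
    return $bl + 0;    # numify, so "2e-2" becomes 0.02
}

for my $bl ( "0.02", "2e-2", "6e-1", "abc", undef ) {
    my $shown = defined $bl ? $bl : "undef";
    print "$shown => ", branch_length_or_default($bl), "\n";
}
```

One caveat: looks_like_number is more permissive than the current pattern. It also accepts strings such as "Inf" and "NaN", so a real patch would probably want to reject those explicitly.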

Jason

> Hi all,
> 
> I have encountered a strange behavior while calculating the tree height at
> root node.
> 
> If the branch lengths of the tree are in scientific notation, as in trees
> created by MrBayes, it is unable to give correct results.
> 
> For example,
> 
> Tree 1:
> 
> (((A:0.02,B:0.025):0.12,C:0.071):0.34,D:0.6);
> 
> Tree 2:
> 
> (((A:2e-2,B:2.5e-2):1.2e-1,C:7.1e-2):3.4e-1,D:6e-1);
> 
> These two trees are identical except for how the branch lengths are written.
> 
> The Perl script:
> 
> # ============================================================
> 
> #!/usr/bin/perl
> 
> use 5.010;
> use strict;
> use warnings;
> 
> use Bio::TreeIO;
> 
> my $usage = << "EOS";
> Display branch lengths for leaf nodes.
> Usage:
>  t_branchlen.pl <ftree> [<fmt>]
> Params:
>  <ftree>:  Tree file.
>  <fmt>:    Tree format. Optional. Default "newick".
> EOS
> 
> my ($ftre, $fmt) = @ARGV;
> 
> die $usage unless ( defined $ftre );
> 
> $fmt = 'newick' unless ( defined $fmt);
> 
> my $o_treei = Bio::TreeIO->new(
>    -file   => $ftre,
>    -format => $fmt,
> );
> 
> my $o_tree = $o_treei->next_tree;
> 
> my @o_leaves = $o_tree->get_leaf_nodes();
> 
> say join("\t", ("Node", "Branch Length", "Depth"));
> 
> for my $o_node ( @o_leaves ) {
>    say $o_node->id, "\t", $o_node->branch_length, "\t", $o_node->depth;
> }
> 
> my $o_root = $o_tree->get_root_node;
> 
> # say;
> 
> say "Root height:\t", $o_root->height;
> 
> exit 0;
> 
> # ============================================================
> 
> For tree 1, the output is:
> 
> Node    Branch Length    Depth
> A    0.02    0.48
> B    0.025    0.485
> C    0.071    0.411
> D    0.6    0.6
> *Root height:    0.6*
> 
> For tree 2,
> 
> Node    Branch Length    Depth
> A    2e-2    0.48
> B    2.5e-2    0.485
> C    7.1e-2    0.411
> D    6e-1    0.6
> *Root height:    3*
> 
> The interesting thing is, the node depth values are correct, but I have no
> idea how the root height was calculated.
> 
> Are there any ideas to resolve this problem?
> 
> Thanks!
> 
> Haizhou
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org




