[Bioperl-l] obtain a distance matrix from tree

Thomas Sharpton thomas.sharpton at gmail.com
Tue Sep 27 20:02:44 UTC 2011


Hi Ross,

For very large trees, I found it to be more efficient to do this in R
using the ape package. I have a script in my github repo that converts
a tree to a distance matrix in R, at the link below:

https://github.com/sharpton/PhylOTU/blob/master/tree_to_matrix.R

That said, I've also done this in Bioperl using something like the  
following:

use strict;
use warnings;
use Bio::TreeIO;

# -file takes a filename; -fh expects an already-open filehandle
my $treein = Bio::TreeIO->new( -file => "input_tree.nwk", -format => 'newick' );
while( my $tree = $treein->next_tree ){
	my %dist_matrix = ();
	my @leaves = $tree->get_leaf_nodes;
	foreach my $leaf1( @leaves ){
		my $id1 = $leaf1->id;
		foreach my $leaf2( @leaves ){
			my $id2 = $leaf2->id;
			next if $id1 eq $id2;
			# each unordered pair of leaves is stored only once
			next if( defined( $dist_matrix{$id1}->{$id2} ) || defined( $dist_matrix{$id2}->{$id1} ) );
			my $distance = $tree->distance( -nodes => [$leaf1, $leaf2] );
			$dist_matrix{$id1}->{$id2} = $distance;
		}
	}
	# print the distance matrix for this tree here, while %dist_matrix is still in scope
}

This will put the information you need to create either a full or an
upper-triangle distance matrix into the hash %dist_matrix: each pair of
leaves is stored once, keyed by the two leaf ids. I didn't test the
above, so hopefully there are no bugs....
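
For the printing step, something along these lines might work. It is
only a sketch: the sample hash, the leaf ids A/B/C, and the helper names
pair_distance and matrix_lines are all made up for illustration and are
not part of Bioperl. It assumes a hash where each pair of ids is stored
in only one orientation, and writes a full, tab-delimited matrix.

```perl
use strict;
use warnings;

# Hypothetical half-filled hash: each unordered pair of leaf ids
# appears once, in one orientation only.
my %dist_matrix = (
    A => { B => 0.3, C => 0.5 },
    B => { C => 0.4 },
);

# Look up a pair in either orientation; the diagonal is 0.
# Returns undef if the pair is not present at all.
sub pair_distance {
    my ( $dist, $id1, $id2 ) = @_;
    return 0 if $id1 eq $id2;
    for my $pair ( [ $id1, $id2 ], [ $id2, $id1 ] ) {
        my ( $row, $col ) = @$pair;
        return $dist->{$row}{$col}
            if exists $dist->{$row} && exists $dist->{$row}{$col};
    }
    return undef;
}

# Build tab-delimited lines for the full symmetric matrix,
# with a header line of ids and 'NA' for any missing pair.
sub matrix_lines {
    my ( $dist, @ids ) = @_;
    my @lines = ( join( "\t", '', @ids ) );
    for my $id1 (@ids) {
        my @row = map { pair_distance( $dist, $id1, $_ ) // 'NA' } @ids;
        push @lines, join( "\t", $id1, @row );
    }
    return @lines;
}

# Collect every id that appears as a row key or a column key.
my %seen;
for my $id1 ( keys %dist_matrix ) {
    $seen{$id1} = 1;
    $seen{$_} = 1 for keys %{ $dist_matrix{$id1} };
}
my @ids = sort keys %seen;

print "$_\n" for matrix_lines( \%dist_matrix, @ids );
```

For an upper triangle instead, the inner map could be restricted to ids
that sort after $id1.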

Someone else may have a more elegant solution.

Best,
Tom

PS: Sorry if you get this twice.

On Sep 27, 2011, at 7:16 AM, Ross KK Leung wrote:

> After using MEGA to generate a newick tree file (phylogram), I wonder if
> Bioperl has any convenient functions to derive the (n x n) distance (by NJ,
> MP etc) matrix. Thanks for your advice in advance!
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l



