[Bioperl-l] Bio::Species, Bio::Taxonomy::Node overhaul

Sendu Bala bix at sendu.me.uk
Mon Aug 7 08:38:58 UTC 2006


Sendu Bala wrote:
> Hilmar Lapp wrote:
>> 1) It sounds a bit that you changed the behavior of get_lca() such  
>> that users may have to adjust their code? If this is true, then this  
>> needs to be made clear in the 1.6 release as that part will not be  
>> backward compatible. If this is not true, then why did you have to  
>> change the implementation of Bio::Tools::Phylo::PAML to make tests  
>> pass? I.e., to what extent can what broke Bio::Tools::Phylo::PAML  
>> also break someone's script?
> 
> I can say that it /should/ have given the same results, but clearly it 
> didn't. What I had to change in PAML was the way in which PAML found the 
> lca of multiple nodes; it had its own algorithm for that, that used 
> get_lca 2 nodes at a time. Now it just calls get_lca once, supplying all 
> the nodes in one go.
> 
> I don't think I spent any time trying to figure out the problem, I just 
> made the change:
> 
> < while( @nodes_L > 1 ) {
> <     my $lca = $tree->get_lca
> < 	(-nodes => [shift @nodes_L,
> < 		    shift @nodes_L]);
> <     push @nodes_L, $lca;
> < }
> < my $n = shift @nodes_L;
> ---
>  > my $n = @nodes_L < 2 ? shift(@nodes_L) : $tree->get_lca(@nodes_L);
> 
> I'll look into it and see if I can avoid any behaviour change.

Oh yes, I remember now. get_lca() used to consider an input node as a 
possible ancestor of itself, which is how the algorithm in PAML worked.

So there will be a behaviour change - now get_lca really does only get 
the lowest common ancestor of input nodes, which necessarily can't be 
any of the input nodes themselves.

I'd call the old behaviour a bug that has now been fixed. (Though the 
code had a comment to the effect that it was a quite deliberate choice 
on the part of the author.)


Ah, I just realised that the PAML algorithm is on the wiki, so many 
people may have use it:
http://www.bioperl.org/wiki/HOWTO:Trees#Bio::Tree::TreeFunctionsI
The old get_lca behaviour was probably there purely to allow this 
convergence to work. I'll have to edit that page along the lines of 'to 
get the lca of multiple nodes you used to have to do ..., but now you do 
...'. When would I make that edit? After I commit, or when 1.6 comes out?



More information about the Bioperl-l mailing list