[Bioperl-l] Bio::Species, Bio::Taxonomy::Node overhaul
Sendu Bala
bix at sendu.me.uk
Mon Aug 7 13:57:19 UTC 2006
Hilmar Lapp wrote:
>
> On Aug 7, 2006, at 4:38 AM, Sendu Bala wrote:
>
>> So there will be a behaviour change - now get_lca really does only get
>> the lowest common ancestor of input nodes, which necessarily can't be
>> any of the input nodes themselves.
>>
>> I'd call the old behaviour a bug that has now been fixed. (Though the
>> code had a comment to the effect that it was a quite deliberate choice
>> on the part of the author.)
>
> I inclined to call the new behavior a bug. Why would the lca between
> node A and node B not be defined, or be an ancestor node of A instead of
> A itself? [...]
>
> Or am I missing something?
Well, it's the 'lowest common ancestor', isn't it? How can the ancestor
of something be itself?
I'm interested that you think that the LCA has to be defined; the
original implementation makes the same assumption in its comments.
Consider two lineages:
A---B---C---D---E
X---Y---Z---D---F
The old implementation would not only expect that E and F have an lca,
but return the answer D, which is wrong. E and F do not have a common
ancestor; their direct ancestors just happen to have the same
descriptor. (In typical usage it was probably never wrong, since the
descriptors used were script-unique, and unchangeable without using an
internal method.)
Or more obviously:
A--B
X--C
B and C do not have an lca. I think there is the assumption that both
nodes being compared belong to the same properly constructed tree, but
you don't even need a Tree object to use get_lca().
my $lca = Bio::Tree::TreeFunctionsI->get_lca(@nodes);
(Not that I'd suggest anyone do that.)
> Likewise, why would the lca between a node A and its child
> node B not be A but instead an ancestor node of A?
I could certainly be wrong, I just can't find anything authoritative
that explicitly states the correct answer to that either way.
For example
http://66.102.9.104/search?q=cache:5cDyg4Um8GEJ:dept-info.labri.fr/~gavoille/article/AGKR02
Defines the nearest common ancestor (== lowest common ancestor) like:
Let T be a rooted tree. A node x ∈ T is an ancestor of a
node y ∈ T if the path from the root of T to y goes through
x. A node v ∈ T is a common ancestor of x and y if it is
an ancestor of both x and y. The nearest common ancestor,
nca, of two nodes x, y is the common ancestor of x and y
whose distance to x (and to y) is smaller than the distance
to x of any other common ancestor of x and y.
According to that, v cannot be x or y.
More information about the Bioperl-l
mailing list