[Bioperl-l] simpleAlign percentage_identity

Jason Stajich jason@cgt.mc.duke.edu
Fri, 15 Mar 2002 16:17:09 -0500 (EST)


I think that percentage identity was not being calculated correctly in
SimpleAlign.

It seems the calculation of the divisor is wrong - or we need to stop
double counting the total number of identical sites.  I have committed a fix
that brings the result in line with what EMBOSS reports for overall percent
identity (strangely, EMBOSS water 2.2.1 and EMBOSS water < 2.2.1 do not agree
on percent identity for the same data...).  All tests pass, so I'm committing
the change.  Please have a look.  I believe it is a simpler solution, since
we can rely on the length of the alignment to tell us how many possible sites
to compare (if we're including gaps in the percent id calculation).
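
To illustrate the idea of using the alignment length as the divisor, here is
a rough sketch in Python.  It is not the SimpleAlign code itself: the function
name, the gap characters, and the rule that a column counts as identical only
when every row has the same non-gap residue are all assumptions made for the
example.

```python
def percentage_identity(rows, gap_chars="-."):
    """Percent of alignment columns where every row has the same residue.

    Hypothetical helper for illustration only. The divisor is the full
    alignment length, so gapped columns count as possible sites.
    """
    if not rows:
        raise ValueError("alignment has no sequences")
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all rows must have the same aligned length")
    if length == 0:
        return 0.0

    gaps = set(gap_chars)
    identical = 0
    for column in zip(*rows):
        residues = {char.upper() for char in column}
        # Count each identical column exactly once (no double counting),
        # and never treat a column of gaps as a match.
        if len(residues) == 1 and not (residues & gaps):
            identical += 1

    return 100.0 * identical / length


# Two aligned rows of length 9: 7 columns match, 1 has a gap, 1 mismatches.
aln = ["ACGT-ACGT",
       "ACGTTACGA"]
print(percentage_identity(aln))
```

Here 7 of the 9 columns are identical, so the result is 7/9 of 100, and the
gapped column lowers the percentage because it is still counted in the
divisor.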

I have also committed changes that should handle needle output, which does
not include leading/trailing gaps for global alignments, along with the more
correct percentage_identity calculation in SimpleAlign.

Marching on...

-j
-- 
Jason Stajich
Duke University
jason@cgt.mc.duke.edu