[Bioperl-l] simpleAlign percentage_identity
Jason Stajich
jason@cgt.mc.duke.edu
Fri, 15 Mar 2002 16:17:09 -0500 (EST)
I think that percentage identity was not being calculated correctly in
SimpleAlign.
It seems the calculation of the divisor is wrong - or else we need to stop
double counting the total number of identical sites. I have committed a fix
that seems in line with what EMBOSS reports for overall percent identity
(strangely, EMBOSS water 2.2.1 and EMBOSS water < 2.2.1 do not agree on
percent identity for the same data...). All tests pass, so I'm committing the
change. Please have a look. I believe it is a simpler solution, since we
can rely on the length of the alignment to tell us how many possible sites to
compare (if we're including gaps in the percent id calculation).
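
To make the divisor choice concrete, here is a rough sketch of the idea in
Python. It is not the SimpleAlign code itself: the function name, the gap
characters, and the rule that a column counts as identical only when every
row carries the same non-gap residue are all assumptions made for
illustration. The point is only that each identical column is counted once
and the divisor is the full alignment length, gaps included.

```python
def percentage_identity(rows, gap_chars="-."):
    """Percent of alignment columns in which every row has the same residue.

    Hypothetical helper for illustration; `rows` is a list of equal-length
    aligned strings. The divisor is the alignment length, so gapped columns
    count as sites that could have matched but did not.
    """
    if not rows:
        raise ValueError("alignment has no sequences")
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned rows must have the same length")
    if length == 0:
        return 0.0

    identical = 0
    for column in zip(*rows):
        first = column[0].upper()
        if first in gap_chars:
            continue  # a gapped column is never identical
        if all(residue.upper() == first for residue in column):
            identical += 1  # each column is counted once, not once per row

    return 100.0 * identical / length
```

For example, percentage_identity(["ACGT-A", "ACCTTA"]) has 4 identical
columns out of 6, so it gives roughly 66.7. Dividing by the ungapped length
instead would give a different figure for the same data.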
I have also committed changes that should handle needle output where
leading/trailing gaps are not included for global alignments, along with the
more correct percentage_identity calculation in SimpleAlign.
Marching on...
-j
--
Jason Stajich
Duke University
jason@cgt.mc.duke.edu