[Bioperl-l] conservative amino acid change?
Tue, 07 Aug 2001 09:37:56 -0400
Heikki Lehvaslaiho wrote:
> I do not expect there to be dozens of objects here so why not:
There are dozens of different pairwise matrices for amino acids,
and (literally) hundreds of different indices (i.e., non-pairwise
measures) of physicochemical properties. See the AAIndex database:
> How different can the matrices be?
A matrix denoting pairwise similarity might take many forms, but PAM
and BLOSUM are specifically log-odds matrices scaled so as to make
them useful as match-scores in popular sequence alignment algorithms.
Matrices with the same information but a different scale could not be
used directly with these alignment programs, but they might still be
useful whenever one wants arbitrarily scaled weights or penalties for
similarity or difference.
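
To make the scaling point concrete, here is a minimal Python sketch. It
assumes the usual log-odds form, S_ij = scale * log2(q_ij / (p_i * p_j)),
rounded to integers. The two-letter alphabet and the frequencies are
made-up toy values, and log_odds is a hypothetical helper, not a function
from any library.

```python
import math

def log_odds(pair_freq, bg_freq, scale=2.0):
    """Return integer scores: round(scale * log2(q_ij / (p_i * p_j)))."""
    scores = {}
    for (a, b), q in pair_freq.items():
        expected = bg_freq[a] * bg_freq[b]  # frequency expected by chance
        scores[(a, b)] = round(scale * math.log2(q / expected))
    return scores

# Toy values for illustration only (not real amino acid data).
bg = {"A": 0.5, "B": 0.5}
pairs = {("A", "A"): 0.4, ("A", "B"): 0.1, ("B", "A"): 0.1, ("B", "B"): 0.4}

half_bit = log_odds(pairs, bg, scale=2.0)   # scores in half-bit units
third_bit = log_odds(pairs, bg, scale=3.0)  # same information, third-bit units
```

Both results encode the same underlying odds, but the numbers differ, so
gap penalties chosen for one scale would presumably not suit the other.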
> Do we need a class for every type
> of similarity matrix or is it enough to have one class
> (Bio::Matrix::Similarity) with an attribute/method format(PAM|BLOSUM)
> to tell how the values were generated? Go for separate classes only if
> there will be methods in one which are not relevant in the other.
As Aaron Mackey suggested, some matrices are instantaneous rate
matrices. One might wish to have different methods for these than
for the log-odds scoring matrices.
But some methods could be general: any pairwise amino acid matrix
might be used as a matrix of arbitrarily scaled similarity or difference.
The only thing one would need to know is whether the value is higher
for similar amino acids (thus a similarity matrix) or lower (thus
a difference or distance matrix).
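
As a rough sketch of that idea in Python (not a proposal for the actual
Bioperl interface), a general matrix object would need little more than
its values and one orientation flag. The class name PairwiseMatrix, the
flag higher_is_similar, and the numbers are all hypothetical.

```python
class PairwiseMatrix:
    """Hypothetical container: pairwise values plus an orientation flag."""

    def __init__(self, values, higher_is_similar):
        self.values = values                        # dict keyed by (residue, residue)
        self.higher_is_similar = higher_is_similar  # True: similarity; False: distance

    def more_similar(self, pair_a, pair_b):
        """Return True if pair_a is rated as more similar than pair_b."""
        a, b = self.values[pair_a], self.values[pair_b]
        return a > b if self.higher_is_similar else a < b

# Toy numbers for illustration only.
sim = PairwiseMatrix({("L", "I"): 2, ("L", "D"): -4}, higher_is_similar=True)
dist = PairwiseMatrix({("L", "I"): 0.1, ("L", "D"): 0.9}, higher_is_similar=False)
```

With the flag set, the same comparison works for either kind of matrix:
sim.more_similar(("L", "I"), ("L", "D")) and the equivalent call on dist
both rate the L/I pair as the more similar one.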
I'm afraid that for anything more complicated, you might have to
make the matrices carry their own methods. For example, a matrix of
differences would include a method (e.g., S_ij = 1 - D_ij)
to compute a similarity matrix. But converting a given matrix
into a form that is optimized for use as alignment match-scores
is apparently something of a black art. What do you foresee as
the most common applications of pairwise matrices of similarity,
difference, rates, weights, and so on?
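
For the simple S_ij = 1 - D_ij case mentioned above, a conversion method
might look like the following Python sketch. It assumes the distances are
already normalized to the range [0, 1]. The function name
distance_to_similarity and the values are hypothetical, and this says
nothing about producing scores suitable for alignment.

```python
def distance_to_similarity(dist):
    """Convert normalized distances D_ij in [0, 1] to S_ij = 1 - D_ij."""
    for pair, d in dist.items():
        if not 0.0 <= d <= 1.0:
            raise ValueError(f"distance for {pair} is outside [0, 1]: {d}")
    return {pair: 1.0 - d for pair, d in dist.items()}

# Toy distances for illustration only.
toy_dist = {("L", "I"): 0.25, ("L", "D"): 0.75}
toy_sim = distance_to_similarity(toy_dist)
```

The result is still arbitrarily scaled; it only flips the orientation, so
that higher values mean more similar residues.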