[Bioperl-l] Re: Amino Acid Codes

Aaron J. Mackey <amackey@virginia.edu>
Mon, 26 Aug 2002 09:17:31 -0400 (EDT)


Hi Guy,

It's a good question.  The IUPAC ambiguity symbols B and Z stem from the
days when Aspartate (D) and Asparagine (N) could not be distinguished
chemically during an Edman-degradation protein sequencing experiment.
The same applies to Z, which covers Glutamate (E) and Glutamine (Q).
There are no equivalent IUPAC codes for MS "isomers":

http://www.chem.qmul.ac.uk/iupac/AminoAcid/A2021.html
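
To make the meaning of B and Z concrete, here is a minimal sketch (in
Python, purely for illustration; the names AMBIGUITY and expand are my own
and are not part of BioPerl or any other package) that lists every concrete
sequence an ambiguous one could stand for:

```python
from itertools import product

# The two IUPAC amino acid ambiguity codes discussed above.
# The dictionary name and layout are illustrative assumptions.
AMBIGUITY = {"B": "DN", "Z": "EQ"}


def expand(seq):
    """Return every concrete sequence that an ambiguous sequence could mean."""
    # Each position contributes either its own letter or its alternatives.
    choices = [AMBIGUITY.get(residue, residue) for residue in seq]
    return ["".join(combo) for combo in product(*choices)]


print(expand("ABZ"))  # ['ADE', 'ADQ', 'ANE', 'ANQ']
```

The number of expansions doubles with every ambiguous position, so this is
only practical for short peptides.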

However, many programs designed to work with sequence data from MS
experiments have built-in "awareness" of this issue, so you don't have to
encode the "isomers" at all; you just pick one arbitrarily (examples of
such software include MS-BLAST, CIDentify, and our own FASTS sequence
similarity search programs).  Alternatively, pattern search algorithms
accept "regular expressions" such as ACD[IL]MLK, which allows either Ile
or Leu in the fourth position.
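
As a rough sketch of both options (Python again, for illustration only; the
helper collapse_isobaric is hypothetical and is not how MS-BLAST, CIDentify
or FASTS handle this internally), you can either match with a character
class or always write one residue of the pair:

```python
import re

# Character class [IL] accepts either Ile or Leu in the fourth position.
pattern = re.compile(r"ACD[IL]MLK")


def collapse_isobaric(seq):
    """Pick one representative of the I/L pair by writing every I as L.

    Hypothetical helper: an arbitrary but consistent choice, so that two
    peptides differing only in I versus L compare as equal.
    """
    return seq.replace("I", "L")


for peptide in ("ACDIMLK", "ACDLMLK", "ACDVMLK"):
    # fullmatch requires the whole peptide to fit the pattern.
    print(peptide, bool(pattern.fullmatch(peptide)))

print(collapse_isobaric("ACDIMLK") == collapse_isobaric("ACDLMLK"))  # True
```

The loop prints True for the first two peptides and False for the third,
which has Val in the fourth position.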

I hope that helps,

-Aaron

On Mon, 26 Aug 2002, Guy Hulbert wrote:

> Hi.
>
> I found your email address on this page:
>   http://doc.bioperl.org/releases/
>     bioperl-1.0.2/Bio/Tools/IUPAC.html
> I'm interested in a minor detail on sequence alphabets
> and the bioperl lists do not appear to be the right
> forum for this question.
>
> I am trying to extract amino acid sequences from mass
> spectrometer data.  One difficulty is that Leucine and
> Isoleucine are ambiguous by mass.  Another is that Lysine
> and Glutamine are almost so.  I wonder if there is a
> standard alphabet that expresses this ambiguity.
>
> In Bio::Tools::IUPAC the hash %IUP has:
>    B => [qw(D N)]
>    Z => [qw(E Q)]
> I do not understand the reasons for these as I am not
> a biologist.  I could always check the references you
> list for the IUPAC-IUP AMINO ACID SYMBOLS above this
> table.
>
> What I will do is something equivalent to:
>    J => [qw(I L)]
>    O => [qw(K Q)]
> the question is whether there are already standard
> symbols which express this ambiguity.
>
> Sincerely,
>
> --GH.
> (Guy Hulbert).
>

-- 
 Aaron J Mackey
 Pearson Laboratory
 University of Virginia
 (434) 924-2821
 amackey@virginia.edu