[EMBOSS] transeq and ambiguous codons

Peter Rice pmr at ebi.ac.uk
Fri Jul 10 09:30:52 UTC 2009


Peter C. wrote:
> OK, leaving TRR aside for the moment (I'm not sure I'd have done it that
> way, but I think I follow your logic), I have some more problem cases for
> you to consider (all using the default standard NCBI table 1).
> 
> Most of these are 'unambiguous ambiguous codons' as you put it, and
> I would agree using X when a more specific letter is possible isn't ideal
> but isn't actually wrong. The "ATS" and related codons (see below)
> however are simply wrong.

They do look wrong. The "X when it could pick a residue" ones I knew of.

The others need a closer look. The plan is to work through all possible 
codons and all the NCBI genetic codes as soon as the release is out.

It should be a simple patch to ajtranslate.c when I'm done.
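
To give an idea of what "work through all possible codons and all the NCBI genetic codes" involves, here is a rough sketch in Python (not the EMBOSS C code, and not the planned patch). It expands every codon over the 15-letter IUPAC nucleotide alphabet and records which residues each one can encode. The names IUPAC, NCBI_TABLES, codon_map and survey are made up for illustration, and only NCBI tables 1 and 2 are included as examples.

```python
from itertools import product

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Residue strings in NCBI order: bases cycle T, C, A, G, first base slowest.
# Only two tables are listed here (assumption: enough for illustration).
NCBI_TABLES = {
    1: "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    2: "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG",
}


def codon_map(aas):
    """Map each of the 64 unambiguous codons to its residue."""
    codons = ("".join(c) for c in product("TCAG", repeat=3))
    return dict(zip(codons, aas))


def survey(table_id):
    """Return {codon: set of residues it can encode} for all 15**3 codons."""
    plain = codon_map(NCBI_TABLES[table_id])
    result = {}
    for codon in map("".join, product(IUPAC, repeat=3)):
        expansions = product(*(IUPAC[base] for base in codon))
        result[codon] = {plain["".join(e)] for e in expansions}
    return result


for table_id in NCBI_TABLES:
    residues = survey(table_id)
    unique = sum(1 for r in residues.values() if len(r) == 1)
    print(f"table {table_id}: {unique} of {len(residues)} codons give one residue")
```

The same ambiguous codon can come out differently depending on the table. TGR, for example, covers TGA and TGG, which is stop or W in table 1 but W both times in table 2.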

> --------------------------------------------------------------------------------------
> 
> Now for another debatable one, RAT means AAT or GAT which code
> for N and D. So, you could use B (Asx) here rather than the broader X.
> 
> Similarly, you don't use J to mean leucine (L) or isoleucine (I), and
> opt for X (again, this is justifiable), e.g. WTA.

Hmmm ... B and Z are ambiguity codes from amino acid analysers, where all the 
amide bonds are broken, and that includes N->D and Q->E. We used to have one 
of those in the lab. Similarly, J comes from mass spec, where I and L have the 
same molecular weight. I don't consider them appropriate for translation.

So I plan to go for a unique amino acid wherever the ambiguity codes allow one, 
and X otherwise.
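
A minimal sketch of that rule in Python, assuming the standard table 1 (this is an illustration, not the code in ajtranslate.c): expand the codon into every unambiguous codon it covers, and report a residue only if they all agree. The names IUPAC, STANDARD and translate_codon are hypothetical. Reporting "*" when every expansion is a stop codon is my assumption here, not something settled in this thread.

```python
from itertools import product

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# NCBI table 1 (standard code); codons run in T, C, A, G order.
STANDARD = dict(zip(
    map("".join, product("TCAG", repeat=3)),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))


def translate_codon(codon):
    """Return the single residue a codon encodes, or X if it is not unique.

    B, Z and J are deliberately never returned.
    """
    expansions = product(*(IUPAC[base] for base in codon.upper()))
    residues = {STANDARD["".join(e)] for e in expansions}
    return residues.pop() if len(residues) == 1 else "X"


for codon in ("RAT", "WTA", "ATS", "YTR", "MGR", "TAR"):
    print(codon, translate_codon(codon))
```

With this rule RAT (N or D), WTA (I or L) and ATS (I or M) all come out as X, while YTR and MGR come out as L and R because every expansion agrees.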

What do our users think?

regards,

Peter


