[EMBOSS] antigenic + valid ambiguity (aa residue) codes (BUG?)

Fernan Aguero fernan at iib.unsam.edu.ar
Mon Jun 4 14:42:44 UTC 2007


Hi!

We're running antigenic on a number of sequences
which contain some ambiguous residues.

It seems that antigenic doesn't accept the '*', 'B', 'U', 'Z' and
'X' characters in protein sequences.

This is weird because it leaves us with no way to
represent 'unknown' residues. 'X' is the standard code for
'any amino acid', while 'B' and 'Z' are used as ambiguity
codes by some programs to mean aspartate/asparagine and
glutamate/glutamine, respectively.

It's also weird because antigenic silently accepts
a sequence in which we replaced one amino acid within an
antigenic epitope with an 'O' (a non-existent amino acid
code). But it strips the 'O' from the sequence, shortening
the sequence by one residue and thus shifting all epitope
positions downstream of it.
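
To make the shift concrete, here is a minimal Python sketch (not EMBOSS
code) of what silently dropping a character does to coordinates. The
function name strip_with_map, the toy sequence and the choice of 'O' as
the dropped character are only illustrative assumptions; the sketch
keeps a map from positions in the stripped sequence back to the
original 1-based positions.

```python
def strip_with_map(seq, dropped="O"):
    """Remove every character listed in `dropped` (hypothetical helper).

    Returns the stripped sequence and a list where entry i holds the
    original 1-based position of stripped residue i (0-based index).
    """
    kept = []
    origin = []
    for pos, ch in enumerate(seq, start=1):
        if ch.upper() in dropped:
            continue  # the character is silently lost, as described above
        kept.append(ch)
        origin.append(pos)
    return "".join(kept), origin


# Toy sequence with an 'O' at original position 5 (made-up example data).
seq = "MKTAOLLVG"
stripped, origin = strip_with_map(seq)

# Every residue after the dropped 'O' is reported one position too early
# unless the coordinates are mapped back through `origin`.
for new_pos, old_pos in enumerate(origin, start=1):
    if new_pos != old_pos:
        print(f"stripped position {new_pos} is original position {old_pos}")
```

With a map like this, epitope coordinates reported on the shortened
sequence could be translated back to the sequence we actually submitted.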

It's also weird because when we replace the 'O' with another
non-existent amino acid code ('J'), antigenic chokes:
'Sequence is not a protein'.

Does this happen with other programs that use protein
sequences as input?

I guess this is a bug ... the behaviour should be consistent:
either accept all valid amino acid codes or none (while
still leaving room for 'X').

Thanks in advance,

Fernan


