[EMBOSS] getorf includes unspecified amino acids as part of the ORF sequence
Peter Rice
pmr at ebi.ac.uk
Tue Jan 12 14:15:28 UTC 2010
Hi Avi,
> The input is a simple fasta file with only A,C,T,G letters and
> nothing else, so I wouldn't expect any Xs. In addition, even if there
> would be Ns (and there are no Ns) the program cannot know if such Ns
> do not include stopcodons so it should not consider them as part of an ORF.
>> >00001_3 [803 - 1120]
>> LARLRFVVLGNSFIASAKGWSTPYGPTTFGPFRSCIYPRVFRSTRVRKAMATRIGSNRVN
>> ILIRCTXXXXXXXXXXXXXXXXXXXXXXXXXNPYLGWWCYIFCIFR
That suggests the Xs have all come from stop codons.
There are other possibilities, including a badly formatted input file
(perhaps two sequences and descriptions read as one).
We do need to see the input file to know where those Xs are from.
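
In the meantime, a quick check of the input file may help narrow it down. What follows is only a rough sketch in Python, not part of EMBOSS: the function name scan_fasta and the example records are made up for illustration. It counts every character other than A, C, G and T in each record, so a description line that has lost its leading ">" (and has been read as sequence) shows up as a burst of unexpected letters.

```python
from collections import Counter


def scan_fasta(lines):
    """Map each record ID to a Counter of its non-ACGT characters.

    Hypothetical helper for illustration; 'lines' is any iterable of
    text lines, such as an open file handle.
    """
    report = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # The record ID is the first word after '>'.
            current = line[1:].split()[0] if len(line) > 1 else "(unnamed)"
            report[current] = Counter()
        else:
            if current is None:
                # Sequence data before any header line.
                current = "(no header)"
                report[current] = Counter()
            # Count anything that is not a plain nucleotide letter.
            report[current].update(
                ch for ch in line.upper() if ch not in "ACGT"
            )
    return report


if __name__ == "__main__":
    # Made-up records: the last text line is a description that has
    # lost its '>' and is therefore treated as part of seq2.
    example = [
        ">seq1 clean record",
        "ACGTACGTAC",
        ">seq2 record followed by a broken header",
        "ACGTACGTAC",
        "seq3 description missing its leading > sign",
        "ACGT",
    ]
    for name, odd in scan_fasta(example).items():
        print(name, dict(odd) if odd else "only A, C, G, T")
```

To run it on a real file, pass an open file handle instead of the example list, for instance scan_fasta(open("input.fasta")), where input.fasta stands for whatever the file is called. A record that reports anything other than "only A, C, G, T" is worth looking at by hand.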
Peter Rice