[EMBOSS] getorf includes unspecified amino acids as part of the ORF sequence

Peter biopython at maubp.freeserve.co.uk
Mon Jan 11 15:53:02 UTC 2010


On Mon, Jan 11, 2010 at 2:26 PM, Fungazid <fungazid at yahoo.com> wrote:
>
> Hello people,
>
> I just installed EMBOSS on Ubuntu Linux (using the Synaptic package manager). I am using the getorf program, and it gives me output lines like these:
>
>>00001_3 [803 - 1120]
> LARLRFVVLGNSFIASAKGWSTPYGPTTFGPFRSCIYPRVFRSTRVRKAMATRIGSNRVN
> ILIRCTXXXXXXXXXXXXXXXXXXXXXXXXXNPYLGWWCYIFCIFR
>
> I don't like the Xs, as they represent unspecified amino acids. Is there an input parameter that tells the program to report only the regions before and after the Xs?
>
> In addition (and maybe this is beyond the scope of this mailing list), what is the biological meaning of such Xs?

What was the input sequence like? Was there a stretch of NNNNN perhaps?
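
If that turns out to be the case, the Xs would simply be codons that cannot be translated because the nucleotides are unknown (for example, a gap between contigs padded with Ns), rather than anything biological in the protein itself. As a rough post-processing sketch in Python (not a getorf option; the function names split_on_unknown and find_n_runs and the min_len/min_run parameters are made up for illustration), you could check the input for runs of N and split the reported peptide at the Xs:

```python
import re

def split_on_unknown(protein, min_len=1):
    """Return the pieces of `protein` that lie between runs of X.

    Pieces shorter than `min_len` residues are dropped.
    (Hypothetical helper, not part of EMBOSS.)
    """
    return [seg for seg in re.split(r"X+", protein) if len(seg) >= min_len]

def find_n_runs(dna, min_run=3):
    """Return 0-based, half-open (start, end) spans of runs of N in `dna`.

    Only runs of at least `min_run` bases are reported.
    (Hypothetical helper, not part of EMBOSS.)
    """
    pattern = re.compile(r"N{%d,}" % min_run, re.IGNORECASE)
    return [m.span() for m in pattern.finditer(dna)]

# The peptide from the getorf output quoted above, joined into one string.
orf = ("LARLRFVVLGNSFIASAKGWSTPYGPTTFGPFRSCIYPRVFRSTRVRKAMATRIGSNRVN"
       "ILIRCTXXXXXXXXXXXXXXXXXXXXXXXXXNPYLGWWCYIFCIFR")

pieces = split_on_unknown(orf)
for piece in pieces:
    print(piece)

# A toy nucleotide string (invented for this example) with one run of Ns.
print(find_n_runs("ACGTNNNNNNACGT"))
```

Note that the pieces after a split are no longer complete ORFs in the getorf sense, so the coordinates in the header line would need adjusting if you depend on them.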

Peter


