[EMBOSS] getorf includes unspecified amino acids as part of the ORF sequence

Peter Rice pmr at ebi.ac.uk
Wed Jan 13 17:36:08 UTC 2010


Hi Avi,

> I made a mistake and took the repeat-masked contigs instead of the
> original contigs, and they indeed had Ns. Sorry for the mess (still,
> I am looking for an option where Ns are not included in the ORF).

Just too late for the next EMBOSS release (in preparation), but a good 
suggestion for July.

We should look at adding options to all the translation programs for 
repeat-masked inputs. This probably means treating each unmasked (non-N) 
region as a separate sequence, with options either to include an ORF 
running up to the Ns or to stop at the last stop codon, and the same for 
the start of an ORF. This is similar to how the start and end of the 
whole sequence are handled.
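
To make the idea concrete, here is a rough Python sketch. It is not EMBOSS 
code and does not reflect any existing getorf option. The function names 
(unmasked_regions, region_orfs) and the open_ends switch are hypothetical, 
and it assumes stop-to-stop ORFs on the forward strand with the standard 
stop codons only. Frames are counted from the start of each unmasked region.

```python
import re

# Assumption: standard genetic code stop codons, forward strand only.
STOPS = {"TAA", "TAG", "TGA"}


def unmasked_regions(seq):
    """Yield (start, bases) for each maximal run of non-N bases."""
    for match in re.finditer(r"[^Nn]+", seq):
        yield match.start(), match.group().upper()


def region_orfs(region, offset=0, frame=0, open_ends=True):
    """Return (start, end) pairs for stop-to-stop ORFs in one frame.

    Coordinates are 0-based, end-exclusive, in the original sequence.
    With open_ends=True, an ORF may run up to the edge of the region
    (that is, up to the Ns). With open_ends=False, only ORFs with a
    stop codon on both sides are reported.
    """
    orfs = []
    seg_start = frame
    bounded_left = False  # becomes True once a stop codon has been seen
    for i in range(frame, len(region) - 2, 3):
        if region[i:i + 3] in STOPS:
            if i > seg_start and (bounded_left or open_ends):
                orfs.append((offset + seg_start, offset + i))
            seg_start = i + 3
            bounded_left = True
    # Position just past the last complete codon in this frame.
    end = frame + ((len(region) - frame) // 3) * 3
    if open_ends and end > seg_start:
        orfs.append((offset + seg_start, offset + end))
    return orfs


def masked_orfs(seq, open_ends=True):
    """Collect ORFs from every unmasked region in all three frames."""
    found = []
    for offset, region in unmasked_regions(seq):
        for frame in range(3):
            found.extend(region_orfs(region, offset, frame, open_ends))
    return sorted(found)
```

Calling masked_orfs(seq, open_ends=False) would correspond to the "stop at 
the last stop codon" behaviour, and open_ends=True to letting an ORF run up 
to the Ns, so no reported ORF ever contains an N.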

Hope that will help

Peter


