[EMBOSS] getorf includes unspecified amino acids as part of the ORF sequence
Peter Rice
pmr at ebi.ac.uk
Wed Jan 13 17:36:08 UTC 2010
Hi Avi,
> I made a mistake and took the repeat-masked contigs instead of the
> original contigs, and they indeed had Ns. Sorry for the mess (still,
> I am looking for an option where Ns are not included in the ORF).
Just too late for the next EMBOSS release (in preparation), but a good
suggestion for July.
We should look at adding options to all the translation programs for
repeat-masked inputs. This probably means treating each unmasked (non-N)
region as a separate sequence, with options to include an ORF running up
to the Ns or to stop at the last stop codon, and the same for the start
of an ORF. This is similar to handling the start and end of the whole
sequence.
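
In the meantime, the idea can be tried outside EMBOSS. The following is only
a rough Python sketch of the approach described above, not EMBOSS code and
not how getorf works internally. The function names (unmasked_regions,
orfs_in_region, find_orfs) and the open_ends switch are made up for
illustration. It assumes the standard stop codons, looks at the forward
strand only, and reports stop-to-stop regions; the frame is counted from the
start of each unmasked region, not from the start of the whole contig.

```python
import re

# Assumption: standard genetic code stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def unmasked_regions(seq):
    """Yield (offset, subsequence) for each maximal run containing no N."""
    for m in re.finditer(r"[^Nn]+", seq):
        yield m.start(), m.group().upper()


def orfs_in_region(region, offset, min_len=30, open_ends=True):
    """Yield (start, end, frame) for stop-to-stop ORFs in the 3 forward frames.

    Coordinates are 0-based, end-exclusive, relative to the full sequence,
    and exclude the stop codons themselves.
    open_ends=True also keeps ORFs that run up to the region boundary (the
    Ns or the sequence end); open_ends=False keeps only ORFs with a stop
    codon on both sides.
    """
    for frame in range(3):
        n_codons = max(0, (len(region) - frame) // 3)
        last = frame + 3 * n_codons  # end of the last complete codon
        orf_start = frame
        bounded_left = False  # True once a stop codon has been seen
        for i in range(frame, last, 3):
            if region[i:i + 3] in STOP_CODONS:
                if i - orf_start >= min_len and (open_ends or bounded_left):
                    yield offset + orf_start, offset + i, frame
                orf_start = i + 3
                bounded_left = True
        # ORF still open when the region ends (runs into the Ns)
        if open_ends and last - orf_start >= min_len:
            yield offset + orf_start, offset + last, frame


def find_orfs(seq, min_len=30, open_ends=True):
    """Treat each unmasked region as a separate sequence and collect ORFs."""
    result = []
    for offset, region in unmasked_regions(seq):
        result.extend(orfs_in_region(region, offset, min_len, open_ends))
    return result
```

Calling find_orfs(contig, open_ends=False) keeps only ORFs that stop at a
stop codon on both sides, so nothing reported touches the Ns;
open_ends=True lets an ORF run up to the edge of the masked block, as at
the ends of a whole sequence.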
Hope that will help
Peter