[EMBOSS] Re: trimming mRNA to coding region, en route to dicodon analysis

david.bauer at bayer.com david.bauer at bayer.com
Wed Mar 16 08:13:25 UTC 2011


Hi,

I have sometimes thought it would be nice if getorf had an option to
return only the longest ORF it finds.
Here is what I use to get the coding part of an mRNA sequence when there
is no feature table information. It lists the forward-strand ORFs with
their lengths, picks the longest, and prints a seqret command for its
coordinates:

> getorf test.fa -find 1 -norev | infoseq -only -len -desc -filter |
  sort -nr | head -1 | tr -d "[]" |
  awk '{print "seqret test.fa -sb "$2" -send "$4}'
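
For anyone adapting this, here is a minimal sketch of the selection step on its own, so it can be checked without EMBOSS installed. It assumes input lines shaped like "LENGTH [BEGIN - END] description", which is the layout the pipeline above expects. The function name longest_orf_region and the sample lines are made up for illustration and are not real getorf or infoseq output.

```shell
#!/bin/sh
# Pick the longest ORF from lines shaped like "LENGTH [BEGIN - END] ...".
# longest_orf_region is a hypothetical helper name, not an EMBOSS program.
longest_orf_region() {
    # Sort by the leading length (largest first), keep the top line,
    # strip the brackets, then print the begin and end fields.
    sort -nr | head -1 | tr -d '[]' | awk '{print $2, $4}'
}

# Made-up sample lines standing in for real infoseq output (assumption).
sample='120 [35 - 394] example mRNA
45 [10 - 144] example mRNA
310 [52 - 981] example mRNA'

region=$(printf '%s\n' "$sample" | longest_orf_region)
echo "$region"
```

The two numbers printed are the begin and end positions that the awk step above would paste into the seqret command.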

HTH,
David.

emboss-bounces at lists.open-bio.org wrote on 15/03/2011 18:27:00:

> Starting with a large mRNA FASTA repository,
> what is the route to generating a derivative with the untranslated
> leader and post-stop-codon segments trimmed off the mRNAs?
> 
> This is en route to a dicodon usage analysis, already written as a Bash
> script calling Perl modules.
> If anyone is interested in such applications, let me know.
> pdc (parse dicodons) works fine, taking about a second per pre-trimmed
> mRNA as it retrieves from a FASTA repository.
> But before providing it to a novice community, some foolproofing
> had best be done.
> 
> Marvin.Stodolsky at gmail.com


