[EMBOSS] nt-multi-fastA-file

Peter Rice pmr at ebi.ac.uk
Wed Apr 12 10:20:46 UTC 2006


Christiane Nerz wrote:
> Hi all,
> 
> I put the gb-file of a whole genome in Artemis.
> Is there a way to export a multi-FastA-file with the bases of 
> all ORFs? Example:
> 
>  >ORF_1
> ATGTGTTCGTT....
>  >ORF_2
> ATGTTCCCGACCA...
>  >ORF_3
> ATGCCGCAT...
> 
> I know how to get all bases, but only as one complete sequence.
> (That genome is not published yet, so there is no multi-FastA-file 
> available at NCBI or EMBL)

Yes, the coderet program will do this.

Unfortunately, by default coderet writes CDS, mRNA and translations all 
to one file (to be fixed for the next release). You can ask for just the 
CDS by switching the other two off with extra command line options:

coderet -nomrna -notranslation

Give it the name of your GenBank file as input.
The output will then contain only the coding sequences.

With -nocds instead of -notranslation (that is, coderet -nomrna -nocds) 
you will get the protein sequences.
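
If you want to script this, the two flag combinations can be wrapped in a 
small helper. This is only a sketch: the function name coderet_command and 
the file name genome.gb are made up for illustration, and it only prints 
the command line rather than running it, so it assumes nothing about how 
your EMBOSS installation prompts for or names the output file.

```python
import shlex

# Flags exactly as given in this message; everything else is illustrative.
_SUPPRESS = {
    "cds": ["-nomrna", "-notranslation"],  # keep only the coding sequences
    "protein": ["-nomrna", "-nocds"],      # keep only the translations
}


def coderet_command(input_file, want="cds"):
    """Return the coderet argument list for the requested output type.

    input_file is the GenBank file; want is "cds" or "protein".
    (Hypothetical helper, not part of EMBOSS.)
    """
    if want not in _SUPPRESS:
        raise ValueError(f"want must be one of {sorted(_SUPPRESS)}")
    return ["coderet", *_SUPPRESS[want], input_file]


# Show the command line for the CDS case (genome.gb is a placeholder name).
print(shlex.join(coderet_command("genome.gb")))
```

The returned list can be handed to subprocess.run() unchanged once you are 
happy with the command it prints.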

If you have any problems parsing the GenBank file let me know.

regards,

Peter Rice
