[EMBOSS] Coderet

Henrikki Almusa henrikki.almusa at helsinki.fi
Fri Jan 23 06:36:54 UTC 2004


On Friday 23 January 2004 07:42, Sean.Maceachern at dpi.vic.gov.au wrote:
> Hello,
>
> I am trying to use coderet to extract CDS from some GenBank flat files. I
> am running into a problem regarding the descriptor line in the output FASTA
> files.
>
> eg)
>
> >nm_000367_cds_1
> ATGGATGGTACAAGAACTTCACTTGACATTGAAGAGTACTCGGATACTGAGGTACAGAAA
> AACCAAGTACTAACTCTGGAAGAATGGCAAGACAAGTGGGTGAACGGCAAGACTGCTTTT
>
> I was hoping someone would be able to tell me how I can change the
> descriptor line from the generic output above (nm_000367_cds_1) to include
> the GI : ID from the flat file? I also think it would be a good idea if the
> ID could be followed by a definition line to make the output more closely
> resemble the output from NCBI.

You can change the output sequence format with the -osformat option (it is
available in all EMBOSS programs that output sequences). The right format is
probably "ncbi". If it isn't, see the Sequence formats page under User
Documentation on the EMBOSS web site, which lists all available formats.
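
For reference, the NCBI-style header the question asks for usually looks like
">gi|<GI>|<db>|<accession>| <definition>". The Python sketch below is not
EMBOSS code and is only one way to relabel a record by hand if the built-in
formats do not give exactly what is wanted. The helper names (ncbi_defline,
relabel_fasta), the GI number and the definition text are all hypothetical
placeholders, not values from the flat file.

```python
def ncbi_defline(gi, db, accession, description=""):
    """Build an NCBI-style FASTA header: >gi|<gi>|<db>|<accession>| <description>."""
    header = f">gi|{gi}|{db}|{accession}|"
    if description:
        header += f" {description}"
    return header


def relabel_fasta(fasta_text, header):
    """Replace the header line of a single-record FASTA string."""
    lines = fasta_text.strip().splitlines()
    if not lines or not lines[0].startswith(">"):
        raise ValueError("input does not look like a FASTA record")
    return "\n".join([header] + lines[1:]) + "\n"


# Placeholder values for illustration only (assumptions, not real identifiers).
record = (
    ">nm_000367_cds_1\n"
    "ATGGATGGTACAAGAACTTCACTTGACATTGAAGAGTACTCGGATACTGAGGTACAGAAA\n"
)
new_header = ncbi_defline(
    gi="0000000",
    db="ref",
    accession="NM_000367",
    description="example definition line",
)
print(relabel_fasta(record, new_header), end="")
```

In practice the GI and definition would have to be read from the GI and
DEFINITION fields of the GenBank entry; that parsing step is left out here.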

Here to help, 
-- 
Henrikki Almusa


