[EMBOSS] Codon usage file improvements

Peter Rice pmr at ebi.ac.uk
Wed Mar 30 15:50:10 UTC 2005


A quick check before I make changes to the EMBOSS codon usage files.

EMBOSS includes a set of codon usage files in the emboss/data/CODONS directory, 
which are installed with the package. The cutgextract program allows 
administrators to install codon usage tables for all species in the cutg 
database.

The files provided with EMBOSS were from Mike Cherry's 1992 "codonusage" 
database and from a 1994 release of the "transterm" database.

The file names are usually in the format "Egss" where "g" is the first letter 
of the genus, and "ss" is the first two letters of the species. As in SwissProt, 
there are a few exceptions (Eyeast for example). The filenames came from these 
databases originally - they were not made up by the EMBOSS team (we only 
added the "E" at the start).
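
As a rough illustration of the usual "Egss" pattern, here is a minimal Python 
sketch. The function name egss_stem is hypothetical and is not part of EMBOSS. 
Lowercasing the letters is an assumption, and exceptions such as Eyeast are 
not handled. The ".cut" extension is the one used in emboss/data/CODONS.

```python
def egss_stem(genus: str, species: str) -> str:
    """Build an 'Egss'-style stem: 'E' + genus initial + first two species letters."""
    genus = genus.strip()
    species = species.strip()
    if not genus or len(species) < 2:
        raise ValueError("need a genus and a species of at least two letters")
    # Lowercasing is an assumption; the pattern is only described as "Egss".
    return "E" + genus[0].lower() + species[:2].lower()


if __name__ == "__main__":
    # Escherichia coli -> Eeco.cut
    print(egss_stem("Escherichia", "coli") + ".cut")
```

Names that do not follow the pattern would need a lookup table rather than 
this rule.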

I am modifying cutgextract to include more information in the output files, 
and will update the files in this directory where possible, using data from 
the latest cutg.

Are there any files in emboss/data/CODONS that are particular favourites and 
need to be preserved?

Are there any non-standard .cut filenames in emboss/data/CODONS that will 
cause problems if they are removed?

(Administrators will need to remove the old files by hand when they 
install a new version of EMBOSS - we cannot easily remove them automatically.)

I will also allow a choice of codon usage file formats - are there any special 
formats that are useful for exchange with other packages?

regards,

Peter Rice




