Genetic codes and other repeated ACD lists
Peter Rice
pmr at ebi.ac.uk
Thu Apr 7 16:44:14 UTC 2005
I have found a way to save writing and maintaining lists like these in ACD files:
list: table [
additional: "Y"
default: "0"
minimum: "1"
maximum: "1"
header: "Genetic codes"
values: "0:Standard; 1:Standard (with alternative initiation
codons); 2:Vertebrate Mitochondrial; 3:Yeast Mitochondrial;
4:Mold, Protozoan, Coelenterate Mitochondrial and
Mycoplasma/Spiroplasma; 5:Invertebrate Mitochondrial; 6:Ciliate
Macronuclear and Dasycladacean; 9:Echinoderm Mitochondrial;
10:Euplotid Nuclear; 11:Bacterial; 12:Alternative Yeast Nuclear;
13:Ascidian Mitochondrial; 14:Flatworm Mitochondrial;
15:Blepharisma Macronuclear; 16:Chlorophycean Mitochondrial;
21:Trematode Mitochondrial; 22:Scenedesmus obliquus;
23:Thraustochytrium Mitochondrial"
delimiter: ";"
codedelimiter: ":"
information: "Code to use"
knowntype: "genetic code"
]
Using the "knowntype" attribute it is possible to delete the value attribute,
and to define a standard list using a "resource" definition in the
emboss.default (or .embossrc) file like this:
RESOURCE genetic_code [ type: "list" value: "0:Standard;11:Bacterial" ]
(for just 2 genetic codes)
or
RESOURCE genetic_code [ type: "list" value: "@EGC.index" ]
(for a list of all the genetic codes - this will read a datafile EGC.index
which is new in CVS).
Other resource definitions could be commands to execute.
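To make the lookup concrete, here is a rough Python sketch of how a list with no value of its own could be resolved from a resource. It is not EMBOSS code: the function names and the RESOURCES dictionary are made up for illustration, and the mapping from the knowntype "genetic code" to the resource name "genetic_code" (spaces to underscores) is an assumption based on the example above. The "@EGC.index" form, which reads a data file, is not covered.

```python
# Illustrative sketch only; none of these names are EMBOSS functions.

def parse_list_value(value, delimiter=";", codedelimiter=":"):
    """Split 'code:description;code:description' into (code, description) pairs."""
    pairs = []
    for item in value.split(delimiter):
        item = item.strip()
        if not item:
            continue  # skip empty entries, e.g. after a trailing delimiter
        code, _, description = item.partition(codedelimiter)
        pairs.append((code.strip(), description.strip()))
    return pairs


def resource_name(knowntype):
    # Assumption: the knowntype "genetic code" maps to resource "genetic_code".
    return knowntype.replace(" ", "_")


# Stand-in for what would be read from emboss.default or .embossrc.
RESOURCES = {
    "genetic_code": {"type": "list", "value": "0:Standard;11:Bacterial"},
}


def values_for(knowntype, resources=RESOURCES):
    """Return the (code, description) pairs for a knowntype, or None if undefined."""
    resource = resources.get(resource_name(knowntype))
    if resource is None or resource["type"] != "list":
        return None
    return parse_list_value(resource["value"])


choices = values_for("genetic code")
print(choices)
```

With the two-code resource above, the list offers only codes 0 and 11. A knowntype with no matching resource returns None, so the caller can fall back to whatever the ACD file says.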
I have not yet decided whether to allow a value of "@EGC.index" in the ACD
file itself. It could be a nice short cut, but I like using a "knowntype" to
control the results.
There are some problems to solve:
1. the resource is tested in too many places - it should replace the "value"
attribute when it is first used. Not hard to do.
2. there should be a clean way to define a default value for each knowntype -
for example calling an ajTrn function to resolve the "genetic code" knowntype
to a value. Functions can be defined for list knowntypes in ajacd.c.
3. anyone parsing the ACD file will wonder where the value has gone - perhaps
acdpretty can be made to fill in missing values when an environment variable
is set. Would that be acceptable to those who need it?
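As a rough illustration of problem 1 (and of the kind of filling-in that problem 3 asks for), the Python sketch below copies the resource into the list's attributes the first time the values are needed, so later code only ever looks at the attribute. It is an assumption about how this could work, not how ajacd.c does it; the function name and the dictionaries are hypothetical.

```python
# Illustrative sketch only; not the ajacd.c implementation.

def effective_values(attrs, resources):
    """Return the list's values, copying them from the resource on first use."""
    if "values" not in attrs:
        # Assumption: the resource name is the knowntype with spaces as underscores.
        name = attrs["knowntype"].replace(" ", "_")
        resource = resources.get(name)
        if resource is not None:
            # Store the resolved value so the resource is tested only once.
            attrs["values"] = resource["value"]
    return attrs.get("values")


# Stand-ins for a parsed ACD list and the resources from emboss.default.
table = {"knowntype": "genetic code", "default": "0"}
resources = {"genetic_code": {"type": "list", "value": "0:Standard;11:Bacterial"}}

first = effective_values(table, resources)
resources.clear()  # the second call no longer depends on the resource
second = effective_values(table, resources)
print(first == second)
```

After the first call the attribute set looks as if the value had been written in the ACD file, which is also what a parser of acdpretty output would want to see.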
Future uses for this:
1. standard list of genetic codes with descriptions
2. standard reading frame names
3. list of known codon usage files, matrices, etc. by specifying "?" as the value
4. a list of blast databases for a blastall wrapper :-)
5. replacing "string" qualifiers which have a knowntype with a selection that
can display and test the list of acceptable values in ACD, to avoid a run-time
failure
Comments please ....
Peter