Genetic codes and other repeated ACD lists
Dr J.C. Ison
jison at hgmp.mrc.ac.uk
Fri Apr 8 10:34:51 UTC 2005
Hi Peter
Comments below.
Cheers
Jon
Peter Rice wrote:
>
> I have found a way to save writing and maintaining lists like these in ACD files:
>
> list: table [
> additional: "Y"
> default: "0"
> minimum: "1"
> maximum: "1"
> header: "Genetic codes"
> values: "0:Standard; 1:Standard (with alternative initiation
> codons); 2:Vertebrate Mitochondrial; 3:Yeast Mitochondrial;
> 4:Mold, Protozoan, Coelenterate Mitochondrial and
> Mycoplasma/Spiroplasma; 5:Invertebrate Mitochondrial; 6:Ciliate
> Macronuclear and Dasycladacean; 9:Echinoderm Mitochondrial;
> 10:Euplotid Nuclear; 11:Bacterial; 12:Alternative Yeast Nuclear;
> 13:Ascidian Mitochondrial; 14:Flatworm Mitochondrial;
> 15:Blepharisma Macronuclear; 16:Chlorophycean Mitochondrial;
> 21:Trematode Mitochondrial; 22:Scenedesmus obliquus;
> 23:Thraustochytrium Mitochondrial"
> delimiter: ";"
> codedelimiter: ":"
> information: "Code to use"
> knowntype: "genetic code"
> ]
>
> Using the "knowntype" attribute it is possible to delete the value attribute,
> and to define a standard list using a "resource" definition in the
> emboss.default (or .embossrc) file like this:
>
> RESOURCE genetic_code [ type: "list" value: "0:Standard;11:Bacterial" ]
>
> (for just 2 genetic codes)
>
> or
>
> RESOURCE genetic_code [ type: "list" value: "@EGC.index" ]
>
> (for a list of all the genetic codes - this will read a datafile EGC.index
> which is new in CVS).
>
> Other resource definitions could be commands to execute.
It'd be cleaner, more flexible and easier to maintain, and if it's not a
requirement now it will probably become one in the future. I've two progs
that would benefit from it now.
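For anyone following along, here is a rough sketch (in Python, not the actual ajacd.c code) of how a resource value like the ones above might be turned into code/description pairs using the ";" delimiter and ":" codedelimiter. The function names and the read_datafile callback are made up for illustration; I am not assuming anything about the real format of EGC.index, so the callback simply hands back ready-made pairs.

```python
def parse_list_values(values, delimiter=";", codedelimiter=":"):
    """Split an ACD-style list string into (code, description) pairs."""
    pairs = []
    for item in values.split(delimiter):
        # Collapse whitespace left over from values wrapped across lines.
        item = " ".join(item.split())
        if not item:
            continue
        # Split on the first codedelimiter only; the rest is description.
        code, _, description = item.partition(codedelimiter)
        pairs.append((code.strip(), description.strip()))
    return pairs


def resolve_resource(value, read_datafile):
    """Resolve a resource value to (code, description) pairs.

    Hypothetical convention: a value starting with "@" names a datafile,
    which the caller-supplied read_datafile(name) turns into pairs.
    Anything else is treated as an inline delimited list.
    """
    if value.startswith("@"):
        return read_datafile(value[1:])
    return parse_list_values(value)


if __name__ == "__main__":
    print(resolve_resource("0:Standard;11:Bacterial", read_datafile=None))
```

The point is only that the inline list and the "@file" form can come out looking identical to whatever consumes them.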
> I have not yet decided whether to allow a value of "@EGC.index" in the ACD
> file itself. It could be a nice short cut, but I like using a "knowntype" to
> control the results.
Allowing that in the ACD file could be confusing, because the punter might
think EGC existed, e.g. as a data item, in the file itself and be puzzled
when they can't find it.
> There are some problems to solve:
>
> 1. the resource is tested in too many places - it should replace the "value"
> attribute when it is first used. Not hard to do.
>
> 2. there should be a clean way to define a default value for each knowntype -
> for example calling an ajTrn function to resolve the "genetic code" knowntype
> to a value. Functions can be defined for list knowntypes in ajacd.c
Couldn't the default be specified in the same place / file as the values themselves?
Presumably the default value would be needed before run-time proper and could
be retrieved at the same time as the values are.
>
> 3. anyone parsing the ACD file will wonder where the value has gone - perhaps
> acdpretty can be made to fill in missing values with an environment variable
> set. Would that be acceptable to those who need it?
I think it would be nice to support both "standard" lists (i.e. ones *with* a "values"
attribute) and the new style. Perhaps something like:
values: "@knowntype"
to indicate to use the knowntype to get the values, *or*
values: "0: Standard ... etc" as before.
Then the values attribute would always be there, with the ACD developer having
the option to specify a standard list of values or to get the values from the
knowntype.
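To make the suggestion concrete, a minimal sketch in Python (not EMBOSS code) of the lookup I have in mind. The function name, the dict of resources, and the rule that the knowntype "genetic code" maps to the resource name "genetic_code" by swapping spaces for underscores are all my assumptions, based only on the example definitions quoted above.

```python
def effective_values(values, knowntype, resources):
    """Return the list string an ACD list should use.

    values    -- the "values" attribute as written in the ACD file
    knowntype -- the "knowntype" attribute, e.g. "genetic code"
    resources -- dict of resource name -> list string (hypothetical
                 stand-in for RESOURCE definitions in emboss.default)
    """
    if values.strip() == "@knowntype":
        # Assumed naming rule: spaces in the knowntype become underscores.
        key = knowntype.replace(" ", "_")
        if key not in resources:
            raise KeyError("no resource defined for knowntype %r" % knowntype)
        return resources[key]
    # Old-style list: the values are spelled out in the ACD file itself.
    return values


if __name__ == "__main__":
    res = {"genetic_code": "0:Standard;11:Bacterial"}
    print(effective_values("@knowntype", "genetic code", res))
    print(effective_values("0:Standard;1:Other", "genetic code", res))
```

Either way the "values" attribute is present in the ACD file, so anyone parsing it can see at a glance where the list comes from.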
> Future uses for this:
>
> 1. standard list of genetic codes with descriptions
>
> 2. standard reading frame names
>
> 3. list of known codon usage files, matrices, etc. by specifying "?" as the value
>
> 4. a list of blast databases for a blastall wrapper :-)
>
> 5. replacing "string" qualifiers which have a knowntype with a selection that
> can display and test the list of acceptable values in ACD, to avoid a run-time
> failure
>
> Comments please ....
>
> Peter
--
Jon C. Ison, PhD
Proteomics Applications Group
MRC Rosalind Franklin Centre for Genomics Research
Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, UK
Tel: +44 1223 494500 Fax: +44 1223 494512
E-mail: jison at rfcgr.mrc.ac.uk Web: http://www.rfcgr.mrc.ac.uk