Genetic codes and other repeated ACD
pmr at ebi.ac.uk
Wed Apr 13 14:30:05 UTC 2005
Guy Bottu wrote:
> - Currently emboss.defaults does not contain items that are absolutely
> needed. We think it is better not to change that philosophy by putting
> e.g. the geneticcodes in it. It could however be an idea to put in
> emboss.defaults a list of databanks in BLAST format, for the sake of BLAST
They will not be absolutely needed. There will be a default - a list of
values, a file with a list of values, or a script that finds everything.
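To illustrate the three kinds of default mentioned above (an inline list, a file containing a list, or a script that finds everything), here is a minimal Python sketch. It is not EMBOSS code: the function name `resolve_values`, the `(kind, payload)` convention, and the one-value-per-line file format are all assumptions made for this example.

```python
import subprocess
from pathlib import Path


def resolve_values(spec):
    """Return allowed values from one of three hypothetical sources.

    spec is a (kind, payload) pair (a convention invented for this sketch):
      ("list", "0,1,2")         values given inline, comma separated
      ("file", "/path/to/file") one value per line, '#' starts a comment
      ("script", ["cmd", ...])  command whose stdout has one value per line
    """
    kind, payload = spec
    if kind == "list":
        lines = payload.split(",")
    elif kind == "file":
        lines = Path(payload).read_text().splitlines()
    elif kind == "script":
        result = subprocess.run(payload, capture_output=True, text=True, check=True)
        lines = result.stdout.splitlines()
    else:
        raise ValueError(f"unknown source kind: {kind!r}")

    values = []
    for line in lines:
        # Drop trailing comments and surrounding whitespace; skip empty entries.
        value = line.split("#", 1)[0].strip()
        if value:
            values.append(value)
    return values
```

Because all three sources reduce to the same list of strings, a caller does not need to know whether the values came from the built-in default or from a local override.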
> - For items like reading frames and maybe geneticcodes, that appear over
> and over again in several ACD files, yet are not user or installation
> customizable, the best proposal among those made in this discussion list
> seems to me to have it defined in one central file, for the purpose of the
> software development, but to "acdpretty" it into the ACD files before
> they are distributed, for the sake of GUI functioning.
This will be the default ... but the distributed files will *not* have the
values filled in (if we fill the values in, the automatic list will not work
when users add new options :-).
You will need to run acdpretty yourself. That way, if you add extra options
locally you will get them in the acdpretty file. There is nothing to stop you
copying that file on top of the original acd file.
> - There is the case of items where users can choose to use their own data
> instead of the EMBOSS distribution data, like symbol comparison matrices
> and codon usage tables (would genetic codes fall into this category?).
> Till now there was each time a new ACD object type defined, like matrix
> and cfile. Is shifting to the use of "knowntype" a good idea? I do not
> know, but let's stay consistent.
The same will happen for these ... but matrix files are complicated. For
programs that read nucleotide and protein, the list will have to include all
> - There is the issue of the program embossdata, useful for the advanced
> user and a possible tool for displaying choice lists in GUIs. Currently,
> when we run it at the BEN site with just the parameter -showall it produces a
> monstrously long list, because all the databanks (including CUTG) have been
> downloaded and "extracted". Maybe let it by default display only the data
> files in the main data directory ? Note that e.g. the list of PRINTS files
> is anyway not very interesting, since you cannot do anything with them as
> such. Could it be modified so that you can easily get a list of the
> alternative data files used by a particular program (or could a library
> routine called by the program itself do that) ?
I have modified embossdata so that it always prompts for a filename (the
default, no filename, still lists all files).
Options to select the other directories are interesting because (1) you get
less output and (2) we will have a new internal default for the list of
directories used by embossdata!
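The behaviour described above (an optional filename filter applied over a configurable list of directories) can be sketched as follows. This is only an illustration in Python, not how embossdata is implemented: the function name `list_data_files` and its parameters are hypothetical, and it looks only at the top level of each directory.

```python
from pathlib import Path


def list_data_files(directories, filename=None):
    """List files in each directory; if filename is given, keep exact matches only.

    With filename=None every file is listed, mirroring the "no file still
    lists all files" default described in the message.
    """
    found = []
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue  # silently skip directories that do not exist
        for path in sorted(root.iterdir()):
            if path.is_file() and (filename is None or path.name == filename):
                found.append(path)
    return found
```

Restricting `directories` to the main data directory is what shortens the output; the filename filter then narrows it further.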
Hope that makes things clearer, and thanks for the comments.