Genetic codes and other repeated ACD

Peter Rice pmr at ebi.ac.uk
Wed Apr 13 14:30:05 UTC 2005


Guy Bottu wrote:

> - Currently emboss.defaults does not contain items that are absolutely 
> needed. We think it is better not to change that philosophy by putting 
> e.g. the geneticcodes in it. It could however be an idea to put in 
> emboss.defaults a list of databanks in BLAST format, for the sake of BLAST 
> wrappers.

They will not be absolutely needed. There will always be a default: a list of 
values, a file with a list of values, or a script that finds everything.
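
As a rough illustration of those three kinds of default (this is not EMBOSS 
code; the function name, the argument names and the order of precedence are 
all assumptions made for the sketch), the lookup could behave like this:

```python
import subprocess
from pathlib import Path


def _clean(lines):
    """Drop blank lines and '#' comment lines, stripping whitespace."""
    return [s for s in (ln.strip() for ln in lines) if s and not s.startswith("#")]


def resolve_values(inline=None, values_file=None, finder_cmd=None):
    """Return the allowed values from the first source that is configured.

    Hypothetical helper: the precedence (inline list, then file, then
    script) is an assumption, not documented EMBOSS behaviour.
    """
    if inline is not None:
        # Case 1: a literal list of values.
        return list(inline)
    if values_file is not None:
        # Case 2: a file with one value per line.
        return _clean(Path(values_file).read_text().splitlines())
    if finder_cmd is not None:
        # Case 3: a script that prints one value per line.
        result = subprocess.run(finder_cmd, capture_output=True, text=True, check=True)
        return _clean(result.stdout.splitlines())
    raise ValueError("no source of values configured")
```

Whichever source is configured, the caller gets a plain list back, so a site 
can switch from a fixed list to a file or a script without changing anything 
else.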

> - For items like reading frames and maybe geneticcodes, that appear over 
> and over again in several ACD files, yet are not user or installation 
> customizable, the best proposal among those made in this discussion list 
> seems to me to have it defined in one central file, for the purpose of the 
> software development, but to "acdpretty" it into the ACD files before 
> they are distributed, for the sake of GUI functioning.

This will be the default ... but the distributed files will *not* have the 
values filled in (if we fill the values in, the automatic list will not work 
when users add new options :-).

You will need to run acdpretty yourself. That way, if you add extra options 
locally you will get them in the acdpretty file. There is nothing to stop you 
copying that file on top of the original acd file.
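
A small sketch of why the distributed files must stay unfilled (plain Python 
with an ACD entry modelled as a dict; the function and key names are 
hypothetical and this is not how acdpretty is implemented):

```python
def expand_values(entry, central, local_extra=()):
    """Return the values to write out for one ACD-like entry.

    Hypothetical model: 'entry' is a dict, 'central' is the centrally
    defined list, 'local_extra' holds options added at one site.
    """
    if entry.get("values"):
        # Values already filled in: the automatic list is bypassed,
        # so options added locally never show up.
        return list(entry["values"])
    merged = list(central)
    for value in local_extra:
        if value not in merged:
            merged.append(value)
    return merged
```

With an unfilled entry the local additions are appended to the central list; 
with a pre-filled entry they are silently ignored, which is the problem 
described above.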

> - There is the case of items where users can choose to use their own data 
> instead of the EMBOSS distribution data, like symbol comparison matrices 
> and codon usage tables (would genetic codes fall into this category?). 
> Until now a new ACD object type was defined each time, like matrix 
> and cfile. Is shifting to the use of "knowntype" a good idea? I do not 
> know, but let's stay consistent.

The same will happen for these ... but matrix files are complicated. For 
programs that read both nucleotide and protein sequences, the list will have 
to include all matrix files.

> - There is the issue of the program embossdata, useful for the advanced 
> user and a possible tool for displaying choice lists in GUIs. Currently, 
> when we run it at the BEN site with just the parameter -showall it produces a 
> monstrously long list, because all the databanks (including CUTG) have been 
> downloaded and "extracted". Maybe let it display by default only the data 
> files in the main data directory? Note that e.g. the list of PRINTS files 
> is not very interesting anyway, since you cannot do anything with them as 
> such. Could it be modified so that you can easily get a list of the 
> alternative data files used by a particular program (or could a library 
> routine called by the program itself do that)?

I have modified embossdata to always prompt for a filename (the default, no 
filename, still lists all files).

Options to select the other directories are interesting because (1) you get 
less output and (2) we will have a new internal default for the list of 
directories used by embossdata!
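
To make the effect of selecting directories concrete, here is a minimal 
sketch in Python (it does not reproduce embossdata itself; the function name 
and its arguments are assumptions for illustration only):

```python
import fnmatch
import os


def list_data_files(directories, pattern="*"):
    """List files matching 'pattern' in each given directory.

    Hypothetical helper: it does not descend into subdirectories, so
    passing only the main data directory leaves out extracted databanks
    that live below it.
    """
    found = []
    for directory in directories:
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            if os.path.isfile(path) and fnmatch.fnmatch(name, pattern):
                found.append(path)
    return found
```

Passing a single directory and a filename pattern gives a short list; adding 
more directories to the list widens it again.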

Hope that makes things clearer, and thanks for the comments.

Peter




