fasta and ACD not always required parameters

gbottu at ben.vub.ac.be gbottu at ben.vub.ac.be
Thu Aug 29 13:21:58 UTC 2002


> The data type is seqall. This will need a value. No value (empty default) 
> means it tries to read nothing.
> 
> It is a seqall (or sequence or seqset) that needs a value, the 'required' 
> part is not important ... except of course if it is 'required' the user 
> will be prompted.

Yes, you're right. The parameter "required" has nothing to do with it. The 
problem is that there are cases where you sometimes need to read in a 
sequence and sometimes not, depending on the circumstances, and there is 
obviously no good default for that. The parameter "nullok" already works for 
"infile" and "outfile". Maybe it would be a good idea to make "nullok" (or 
"missing" or whatever) work for any data type. 
 
> The big problem here is trying to fit all the FASTA programs into one ACD 
> file, instead of using 9 files.
> 
> I did consider some time back extending ACD syntax to cover launching 
> external applications with an ACD interface. There is an outline syntax 
> definition including additional validation for (for example) blast gap 
> penalties. The tricky part is testing the input is valid where it is a 
> strange database (blast for example) before launching the application.
> 
> Can anyone help with defining requirements for blast/fasta/etc.

Making all fastA or BLAST options work in one ACD file rather than in 9 (fastA) 
or 5 (BLAST) is not a problem; I have done it and it works well. The problem is 
with the input of the search set. Since I did not try to "embossify" the 
original program but only wrote a wrapper application in EMBOSS, I had to 
consider 3 cases:
1) standard search set: the program uses an existing databank installed by BEN 
(in some cases a fastA "library" file or a BLAST .nal/.pal file is used).
2) user defined search set: the wrapper reads in a "seqall" and generates a 
temporary databank in fastA or BLAST format in /tmp. In principle this could 
always be used, but you understand we wouldn't like to do it for embl:*
3) user provided databank in fastA or BLAST format: used directly by the 
program. The wrapper does some testing before launching the program. For fastA 
the databank is read in as ftp::xxx and then typed with 
ajSeqTypeNuc/ajSeqTypeProt; for BLAST the test is very crude, only the 
existence of three files with the appropriate extensions is checked.
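
To make the three cases concrete, here is a minimal sketch in Python. It is 
not the actual wrapper, which is written against EMBOSS; all function and 
parameter names are hypothetical. The BLAST file extensions are an assumption 
based on what formatdb usually writes (.nhr/.nin/.nsq for nucleotide, 
.phr/.pin/.psq for protein), and the fastA typing step of case 3 is left out.

```python
import os
import tempfile

# Assumption: the three files formatdb writes for each kind of BLAST databank.
BLAST_EXTENSIONS = {
    "nucleotide": (".nhr", ".nin", ".nsq"),
    "protein": (".phr", ".pin", ".psq"),
}


def blast_databank_exists(basename, moltype):
    """Crude test: do all three files with the right extension exist?"""
    return all(os.path.isfile(basename + ext)
               for ext in BLAST_EXTENSIONS[moltype])


def write_temp_fasta(records, directory=None):
    """Case 2: write (name, sequence) pairs to a temporary fastA file."""
    fd, path = tempfile.mkstemp(suffix=".fasta", dir=directory)
    with os.fdopen(fd, "w") as handle:
        for name, sequence in records:
            handle.write(">%s\n%s\n" % (name, sequence))
    return path


def choose_search_set(standard=None, user_records=None, user_databank=None,
                      moltype="protein"):
    """Return (kind, location) for exactly one of the three cases."""
    given = [x for x in (standard, user_records, user_databank)
             if x is not None]
    if len(given) != 1:
        raise ValueError("give exactly one kind of search set")
    if standard is not None:
        # Case 1: an installed databank, used as is.
        return ("standard", standard)
    if user_records is not None:
        # Case 2: build a temporary databank from the user's sequences.
        return ("temporary", write_temp_fasta(user_records))
    # Case 3: a user provided BLAST databank, tested before launching.
    if not blast_databank_exists(user_databank, moltype):
        raise ValueError("no BLAST databank found at %s" % user_databank)
    return ("user", user_databank)
```

Note that the sketch has to refuse the call when no search set is given at 
all; that is the same situation in which an optional "seqall" would need 
something like "nullok".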

I admit that all this is tinkering rather than an elegant solution. For an 
in-depth solution the programs themselves would have to be changed and/or 
several new features added to EMBOSS. In the meantime, just extending the 
"nullok" parameter to all data types would make me happy.

	Sincerely,
	Guy Bottu





More information about the emboss-dev mailing list