fasta and ACD not always required parameters
gbottu at ben.vub.ac.be
Thu Aug 29 13:21:58 UTC 2002
> The data type is seqall. This will need a value. No value (empty default)
> means it tries to read nothing.
> It is a seqall (or sequence or seqset) that needs a value, the 'required'
> part is not important ... except of course if it is 'required' the user
> will be prompted.
Yes, you're right. The parameter "required" has nothing to do with it. The
problem is that there are cases where you sometimes need to read in a sequence
and sometimes not, depending on the circumstances, and there is obviously no
good default for those. The parameter "nullok" works for "infile" and
"outfile". It might be a good idea to make "nullok" (or "missing" or whatever)
work for any data type.
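To make the request concrete, here is a minimal sketch of the behaviour I have
in mind, written in Python rather than in ACD or AJAX code. The function name
resolve_value and its signature are hypothetical and only illustrate the idea:
an empty value is accepted when "nullok" is set and rejected otherwise.

```python
def resolve_value(value, nullok=False):
    """Hypothetical helper mimicking a 'nullok' that works for any data type."""
    # Treat None and blank strings as "no value given".
    if value is None or value.strip() == "":
        if nullok:
            # The application simply skips reading this input.
            return None
        raise ValueError("a value is required for this parameter")
    return value
```

With something like this, a wrapper could declare its "seqall" with "nullok" and
read the sequences only when a value was actually supplied.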
> The big problem here is trying to fit all the FASTA programs into one ACD
> instead of using 9 files.
> I did consider some time back extending ACD syntax to cover launching
> external applications with an ACD interface. There is an outline syntax
> definition including additional validation for (for example) blast gap
> penalties. The tricky part is testing the input is valid where it is a
> strange database (blast for example) before launching the application.
> Can anyone help with defining requirements for blast/fasta/etc.
Making all fastA or BLAST options work in one ACD file rather than in 9 or 5
respectively is not a problem; I did that and it works well. The problem is with
the input of the search set. Since I did not try to "embossify" the original
programs but only wrote a wrapper application in EMBOSS, I had to consider 3
cases:
1) standard search set : the program uses an existing databank installed by BEN
(in some cases a fastA "library" file or a BLAST .nal/.pal file is used)
2) user defined search set : the wrapper reads in a "seqall" and generates in
/tmp a temporary databank in fastA or BLAST format. This could in principle be
used always, but you understand we wouldn't like to do it for embl:*
3) user provided databank in fastA or BLAST format : used directly by the
program. The wrapper does some testing before launching the program. For fastA
the databank is read in as ftp::xxx and then typed with
ajSeqTypeNuc/ajSeqTypeProt; for BLAST the test is very crude: only the existence
of three files with the appropriate extensions is checked.
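As a rough illustration of the crude BLAST test in case 3, the check could look
like the sketch below. It is in Python for brevity and is not the wrapper's
actual code. The function name blast_db_type is hypothetical, and the extension
sets are an assumption based on what NCBI formatdb normally writes (.nhr/.nin/.nsq
for nucleotide, .phr/.pin/.psq for protein).

```python
import os

# Assumed formatdb extensions; the three files must all be present.
BLAST_EXTENSIONS = {
    "nucleotide": (".nhr", ".nin", ".nsq"),
    "protein": (".phr", ".pin", ".psq"),
}


def blast_db_type(basename):
    """Return 'nucleotide' or 'protein' if all three files exist, else None."""
    for db_type, extensions in BLAST_EXTENSIONS.items():
        if all(os.path.isfile(basename + ext) for ext in extensions):
            return db_type
    return None
```

This only proves that the files exist, not that their contents are a valid
databank, which is why I call the test crude.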
I admit that all this is tinkering rather than an elegant solution. For an
in-depth solution the programs themselves should be changed and/or several new
features added to EMBOSS. In the meantime, just extending the "nullok" parameter
to all data types would make me happy.