EMBOSS program groups

James Bonfield jkb at mrc-lmb.cam.ac.uk
Fri Apr 27 16:52:44 UTC 2001


Hi all,

I'm looking at restructuring the program groups so that there are fewer
groups, or at least so that some groups become subgroups (cascading menus in
my interface).

There seems to be a huge amount of redundancy, with the average program being
in 1.7 groups (ish ;-)). Some entire groups overlap heavily, such as Motifs
and Pattern matching. What are people's thoughts on this? I'm willing to
submit my changes back to the EMBOSS team, but how many people are likely to
be affected by this? We need this ourselves as the menus are simply too long
at present, and it's virtually impossible to find anything.
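
For anyone wanting to check the numbers on their own installation, here is a
minimal sketch of how the redundancy could be measured. The program names and
the group assignments below are made-up placeholders, not the real EMBOSS
data, and the function names are my own; the real mapping would have to be
extracted from the installed program definitions.

```python
from itertools import combinations

# Hypothetical program -> groups mapping (placeholder data, not real EMBOSS).
groups_of = {
    "prog_a": {"Motifs", "Pattern matching"},
    "prog_b": {"Motifs", "Pattern matching"},
    "prog_c": {"Motifs"},
    "prog_d": {"Alignment"},
}

def mean_groups_per_program(groups_of):
    """Average number of groups each program is listed in."""
    return sum(len(groups) for groups in groups_of.values()) / len(groups_of)

def group_overlaps(groups_of):
    """Jaccard overlap (shared programs / all programs) for each group pair."""
    members = {}
    for prog, groups in groups_of.items():
        for group in groups:
            members.setdefault(group, set()).add(prog)
    overlaps = {}
    for g1, g2 in combinations(sorted(members), 2):
        shared = members[g1] & members[g2]
        combined = members[g1] | members[g2]
        overlaps[(g1, g2)] = len(shared) / len(combined)
    return overlaps

if __name__ == "__main__":
    print(mean_groups_per_program(groups_of))
    for pair, score in group_overlaps(groups_of).items():
        print(pair, round(score, 2))
```

Pairs of groups with an overlap close to 1 would be the obvious candidates for
merging or for turning one into a subgroup of the other.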

On a different note, I see that there's also a huge redundancy in
programs. How should users choose between them? E.g., as a novice user of
EMBOSS, how would I know to use stretcher over needle? And similarly for water
vs matcher.

The documentation for needle implies that needle is for short sequences and
stretcher is for longer sequences. However, stretcher uses the Myers and Miller
algorithm, which claims to be a full Needleman-Wunsch alignment algorithm
anyway (but with linear-space memory requirements). Similarly for matcher vs
water: both claim to be rigorous algorithms, but matcher uses less memory.
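
To illustrate why a linear-space method can still be rigorous, here is a
minimal sketch that computes the global alignment score twice: once with the
full dynamic programming matrix and once keeping only two rows. This is not
the needle or stretcher code. It assumes a simple linear gap penalty and
made-up scoring values, and it only computes the score; recovering the
alignment itself in linear space needs the divide-and-conquer step of the
Myers and Miller approach, which is omitted here.

```python
def nw_score_full(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score using the full (len(a)+1) x (len(b)+1) matrix."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        H[i][0] = i * gap          # leading gaps in b
    for j in range(1, cols):
        H[0][j] = j * gap          # leading gaps in a
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,       # gap in b
                          H[i][j - 1] + gap)       # gap in a
    return H[-1][-1]

def nw_score_linear(a, b, match=1, mismatch=-1, gap=-2):
    """Same recurrence, but only the previous and current rows are stored."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[-1]

if __name__ == "__main__":
    s1, s2 = "GATTACA", "GCATGCU"
    print(nw_score_full(s1, s2), nw_score_linear(s1, s2))
```

Both functions evaluate the same recurrence, so the scores are identical by
construction; only the memory use differs (quadratic versus linear in the
sequence lengths).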

It seems that so far the reason for redundancy is simply that multiple authors
have submitted programs to do identical tasks, with different names. Would it
be treading on people's toes too much to suggest that some of this redundancy
should be removed? In the cases outlined above, assuming the results really
are comparable, then it seems clear that the more memory-hungry versions
should go.

James

-- 
James Bonfield (jkb at mrc-lmb.cam.ac.uk)   Tel: 01223 402499   Fax: 01223 213556
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/
