[Bioperl-l] Re: [Biojava-l] BioInformatics toolbox.

Catherine Letondal letondal@pasteur.fr
Thu, 11 Apr 2002 22:30:29 +0200


Alex Rolfe wrote:
> 
> A number of people have pointed out that several GUI's exist for
> connecting components into pipelines (and I'll add my own- the
> biojava-lims code that I've been working on) and that the existing
> bio{perl,java} classes could probably be extended or wrapped to fit into
> these frameworks.  But I don't think that the combination would yield a
> viable system to let users create their own programs.

I completely agree with this statement, but I would add that, IMHO, the main reason
users (by which I mean, for instance, biologists who are not interested in computer programs)
do not do this is often not that it would be too "difficult", at least not to the point
of being really impossible. Rather, they do not have in mind to build a piece
of software: they just want to get their work done (for time reasons).

So one possible idea to help them compose useful tools could be to reuse what they have
actually done with the software. This can be a simple macro recording feature, or more
complex techniques such as "Programming by Demonstration" or "Programming by Example".
You can find a good description of these techniques in these books:
Your Wish is My Command: Giving Users the Power to Instruct their Software
(http://lieber.www.media.mit.edu/people/lieber/Your-Wish/)
or: Watch What I Do: Programming by Demonstration
(http://lieber.www.media.mit.edu/people/lieber/PBE/)
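
A minimal sketch of the macro recording idea, in Python. Everything here is
hypothetical and for illustration only: the MacroRecorder class and the
reverse/complement helpers are assumptions, not part of bioperl or biojava.
Each action the user performs is run immediately and remembered, so the same
sequence of actions can later be replayed on new data:

```python
class MacroRecorder:
    """Runs actions on behalf of the user and remembers them for replay."""

    def __init__(self):
        self.steps = []  # recorded (function, extra_args) pairs, in order

    def do(self, func, value, *args):
        """Apply func to value right now, and record the step."""
        self.steps.append((func, args))
        return func(value, *args)

    def replay(self, value):
        """Re-apply every recorded step, feeding each result to the next."""
        for func, args in self.steps:
            value = func(value, *args)
        return value


# Hypothetical user actions on a DNA sequence held as a plain string.
def reverse(seq):
    return seq[::-1]


def complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))


rec = MacroRecorder()
s = rec.do(str.upper, "atgc")   # the user just does the work...
s = rec.do(reverse, s)
s = rec.do(complement, s)

other = rec.replay("ggatcc")    # ...and the tool can redo it on new input
```

The user never "writes a program" here: the program is a by-product of the
work already done, which is the point made above.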

It's the idea of "end-user programming", which (IMHO) means programming while
using, rather than just so-called "easy programming".

> When you extend bio{perl,java} classes to get components for these
> GUI's, you'd end up with 2 types : data and actions.  Data components
> (like java beans) would be able to describe their properties.  Action
> components would need to describe the format/requirements for their
> inputs and outputs.  Action components would get their inputs from 
> - the outputs of other action components
> - user inputs
> - parameters you specify for the program.
> 
> The first problem is how to format the outputs of one action as inputs
> for another.  The bio*'s solve this by providing standard interfaces
> that everything uses.
> 
> The second, and I think harder, problem is that you end up with too many
> types of data objects running around and too many types of actions.
> 
> Consider a pipeline where you start with genbank accession numbers,
> fetch the sequences, blast the sequences against a local database, and
> do something with the sequences based on the output.  The first part is
> easy to specify.  Input is a list of strings, output is a Sequence
> object.  The sequence object goes to the blast component.  But having a
> GUI specify how to process the blast output is hard because there are
> lots of possibilities.  Trying to specify what should happen through a
> GUI seems like it would either be very confusing (eg a long list of
> options) or very limiting (a short list of options).  

PBE techniques in fact enable much more than composition: programming constructs
such as loops (by detecting repetitions) or conditionals, regular expression
specification, text styles, graphics, web browsing, ... (see references above or
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/garnet/www/pbd-group/papers/voop.ps.Z
or http://lieber.www.media.mit.edu/people/lieber/Your-Wish/13-Blackwell.pdf).
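
As a toy illustration of inferring a loop by detecting repetitions, the
Python sketch below looks for the shortest block of recorded actions that,
repeated, reproduces the whole trace. The infer_loop function and the action
names are assumptions made up for this example. Real PBE systems go further
(for instance, they generalize arguments that change from one iteration to the
next); this only handles exact repetition:

```python
def infer_loop(trace):
    """Return (body, count) such that body repeated count times equals trace.

    The shortest such body is returned; a trace with no repetition comes
    back unchanged with a count of 1.
    """
    n = len(trace)
    for size in range(1, n + 1):
        if n % size == 0 and trace == trace[:size] * (n // size):
            return trace[:size], n // size
    return trace, 1  # only reached for an empty trace


# Hypothetical trace of what a user did by hand, three times in a row.
trace = ["select next hit", "copy score", "paste into table"] * 3
body, count = infer_loop(trace)
# body holds the three distinct actions, count is 3
```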

But here, I think that some scripting should also be possible for the user,
but *within the user interface*, enabling the user to enter small pieces of code
where they are needed, and only there, in a way similar to what is done in spreadsheets
or in HyperCard. Languages like Python have been built with this in mind
(http://www.python.org/doc/essays/cp4e.html).
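
A sketch of what "small pieces of code, only where needed" could look like,
applied to the blast output case quoted above. The make_filter function, the
hit records and their field names (id, evalue, identity) are all hypothetical;
no real blast parser is involved. The user types a one-line expression into a
field of the interface, much like a spreadsheet formula, and the tool applies
it to each hit:

```python
def make_filter(expression):
    """Turn a user-typed expression into a predicate over hit records.

    Field names of the hit are visible as plain variables in the expression.
    Note: emptying __builtins__ is NOT a security sandbox; this assumes the
    expression comes from the user of a local tool.
    """
    code = compile(expression, "<user input>", "eval")

    def keep(hit):
        return bool(eval(code, {"__builtins__": {}}, dict(hit)))

    return keep


# Hypothetical blast hits, reduced to a few fields.
hits = [
    {"id": "hit1", "evalue": 1e-30, "identity": 98.0},
    {"id": "hit2", "evalue": 0.5, "identity": 41.0},
    {"id": "hit3", "evalue": 1e-8, "identity": 75.5},
]

keep = make_filter("evalue < 1e-5 and identity > 70")
selected = [h["id"] for h in hits if keep(h)]
```

Instead of a long list of GUI options, the single free-form field covers the
many possible ways of processing the output.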

Very few tools offer this kind of end-user programming feature (or even just macros!),
although a lot of research has been done in this domain, and although it is really needed by
users.


> I think the best way to start on a toolbox that users could use is to
> build a toolbox for programmers that provides useful components and a
> GUI.  Hopefully you have to write less and less code as time goes by to
> the point where users could design their own process without any coding.
 
--
Catherine Letondal -- Pasteur Institute Computing Center