[Biojava-l] library for running blast and formatdb

Patrick McConnell MCCon012@mc.duke.edu
Tue, 14 Jan 2003 08:32:36 -0500


What I have written provides two essential base classes: Program and
Parameters.  The Program class launches an external program and captures
its output.  I should also add hooks for handling the input and output as
streams, as an alternative to capturing everything in memory.  The
Parameters class builds the command-line arguments from the fields of the
extending class using reflection, with some flexibility in what the flags
and delimiters look like.  There has been discussion of changing the
implementation to use Jakarta's CLI library, and I think a hybrid of the
two would be appropriate.
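
To illustrate the reflection idea, here is a minimal sketch in Java.  It is
not the code from the library: the class names ReflectiveParameters and
BlastallParameters, the flagPrefix() and delimiter() hooks, and the use of
public fields are all assumptions made for the example.  The field names
p, d and i mirror blastall's program, database and query-file flags.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: turns the public fields of a subclass
// into command-line arguments via reflection.
abstract class ReflectiveParameters {

    // Text placed before each field name to form a flag (assumed hook).
    protected String flagPrefix() {
        return "-";
    }

    // Text between a flag and its value; null means the flag and the
    // value are passed as two separate arguments (assumed hook).
    protected String delimiter() {
        return null;
    }

    List<String> toArguments() {
        List<String> args = new ArrayList<>();
        for (Field field : getClass().getFields()) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue; // only instance fields describe parameters
            }
            Object value;
            try {
                value = field.get(this);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            if (value == null) {
                continue; // unset parameters are left off the command line
            }
            String flag = flagPrefix() + field.getName();
            if (delimiter() == null) {
                args.add(flag);
                args.add(String.valueOf(value));
            } else {
                args.add(flag + delimiter() + value);
            }
        }
        return args;
    }
}

// Hypothetical subclass for blastall; each public field is one flag.
class BlastallParameters extends ReflectiveParameters {
    public String p = "blastn"; // program name
    public String d;            // database
    public String i;            // query file
}
```

With d set to "nt" and i left unset, toArguments() returns the pairs
"-p blastn" and "-d nt".  The order of the pairs follows the order in which
reflection reports the fields, which Java does not guarantee.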

I have written Program and Parameters implementations for NCBI's blastall
and formatdb programs.  Now, after chatting with Jason Stajich here at
Duke, I am working on a flexible queueing system for Programs.  This code
isn't complete yet, though.
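
For a sense of what launching a program and capturing its output involves,
the sketch below drains standard out and standard error on separate
threads, so the child process cannot stall on a full pipe.  It is only an
illustration and not the Program class itself: the names StreamCapture,
SimpleProgram and ProgramResult are invented, and it relies on
ProcessBuilder from the standard library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper: reads one stream to the end on its own thread.
class StreamCapture extends Thread {
    private final InputStream in;
    private final StringBuilder text = new StringBuilder();
    private volatile IOException failure;

    StreamCapture(InputStream in) {
        this.in = in;
    }

    @Override
    public void run() {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
        } catch (IOException e) {
            failure = e;
        }
    }

    // Waits until the stream is exhausted, then returns the captured text.
    String result() {
        try {
            join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while reading", e);
        }
        if (failure != null) {
            throw new UncheckedIOException(failure);
        }
        return text.toString();
    }
}

// Hypothetical value object holding what the program produced.
class ProgramResult {
    final int exitCode;
    final String stdout;
    final String stderr;

    ProgramResult(int exitCode, String stdout, String stderr) {
        this.exitCode = exitCode;
        this.stdout = stdout;
        this.stderr = stderr;
    }
}

// Hypothetical launcher: starts the command and captures both streams.
class SimpleProgram {
    static ProgramResult run(List<String> command)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).start();
        process.getOutputStream().close(); // nothing is sent to stdin
        StreamCapture out = new StreamCapture(process.getInputStream());
        StreamCapture err = new StreamCapture(process.getErrorStream());
        out.start();
        err.start();
        int exitCode = process.waitFor();
        return new ProgramResult(exitCode, out.result(), err.result());
    }
}
```

Everything is held in memory here; for large blast reports the same reader
threads could instead hand each line to a caller-supplied stream.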

So, if everyone likes this framework for launching programs, I'd be glad to
donate it to BioJava.  If people don't like it, I'll change it based on
suggestions.  Anyone who is interested, please check out:
http://www.dbsr.duke.edu/software/blast . My code is fully documented, and
I have added a couple of examples that demonstrate the ease of launching
blast.

As to the XML description of program parameters, I think that is a good
idea, and it could be a factory method in my Parameters class.  The method
would take in the XML somehow (File or Stream or whatever) and return a
Parameters object.  But I know that some people would prefer to define the
Parameters internally in code instead of externally in a file.  So, we
should not limit ourselves to a single approach.
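
A factory method of that kind might look roughly like the sketch below.
No XML format has been agreed on, so the <parameter flag="..." value="..."/>
layout, the XmlParameters class and the fromXml() name are all invented for
the example; only the standard javax.xml DOM parser is used.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical Parameters variant whose values come from an XML description.
class XmlParameters {
    private final List<String> arguments = new ArrayList<>();

    private XmlParameters() {
    }

    // Factory method: reads every <parameter flag="..." value="..."/>
    // element (an assumed layout) in document order.
    static XmlParameters fromXml(InputStream xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(xml);
            NodeList nodes = doc.getElementsByTagName("parameter");
            XmlParameters params = new XmlParameters();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                params.arguments.add("-" + element.getAttribute("flag"));
                params.arguments.add(element.getAttribute("value"));
            }
            return params;
        } catch (Exception e) {
            throw new IllegalArgumentException(
                    "Could not read parameter XML", e);
        }
    }

    // Returns a copy so callers cannot modify the parsed arguments.
    List<String> toArguments() {
        return new ArrayList<>(arguments);
    }
}
```

People who prefer to set parameters in code would simply never call the
factory method, so both approaches can live side by side.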

Thanks!

-Patrick





"Schreiber, Mark" <mark.schreiber@agresearch.co.nz>@biojava.org on
01/13/2003 03:08:08 PM

Sent by:    biojava-l-admin@biojava.org


To:    "Patrick McConnell" <MCCon012@mc.duke.edu>
cc:    <biojava-l@biojava.org>

Subject:    RE: [Biojava-l] library for running blast and formatdb

One thing sorely missing from BioJava is the ability to launch and
capture the results of common bioinformatics programs. I know Java isn't
the best at this but it's not that bad. It's also needed if you want to
develop pipeline type applications.

Would it be possible to get some kind of over-arching, interface-based
API so that services can be made available with similar interfaces?

Possibly a Service or Program interface, a Parameter list or map, and some
kind of result stream?

Just my $0.02

- Mark

> -----Original Message-----
> From: Patrick McConnell [mailto:MCCon012@mc.duke.edu]
> Sent: Tuesday, 14 January 2003 4:15 a.m.
> To: biojava-l@biojava.org
> Subject: Re: [Biojava-l] library for running blast and formatdb
>
>
>
>
> >I suppose it's a matter of another external dependency vs. reinvented
> >utility code in biojava . . .  Would it make sense to merge the better
> >qualities of the two?
>
> The CLI project looks quite flexible and robust, but that comes at the
> cost of some complexity.  This is in contrast to the simplicity of
> creating parameters via reflection.  I think the two methods could be
> effectively combined so that we gain the simplicity of reflection with
> the flexibility of CLI.  The base parameters class can use CLI to build
> its parameters.  As an option, it can build CLI options via reflection
> for simplicity.  When users extend the base class, they can use the
> flexibility of CLI if they need it; otherwise they can use reflection
> for quick and dirty parameter parsing.  The base class could even extend
> the Options class, so we are really working with a hybrid of the two.
> What does everyone think?
>
> -Patrick
>
>
>
>
>
>
> "Michael L. Heuer" <heuermh@acm.org>@shell3.shore.net> on
> 01/10/2003 05:18:52 PM
>
> Sent by:    Michael Heuer <heuermh@shell3.shore.net>
>
>
> To:    Patrick McConnell <MCCon012@mc.duke.edu>
> cc:    biojava-l@biojava.org
>
> Subject:    Re: [Biojava-l] library for running blast and formatdb
>
>
> On Fri, 10 Jan 2003, Patrick McConnell wrote:
>
> > In the process, I developed some useful and flexible base classes for
> > formatting parameters and running programs.  Parameters are
> > automatically converted to an argument array via reflection and
> > reading of standard out and standard error in separate threads is
> > handled automatically.
>
> The base classes are nice, but I prefer the design of
>
> > http://jakarta.apache.org/commons/cli
>
> a lot better for handling parameters.
>
> I suppose it's a matter of another external dependency vs.
> reinvented utility code in biojava . . .  Would it make sense
> to merge the better qualities of the two?
>
> I also have a few simple classes for oneoff scripts with
> command line & logging facade support that I use all the time, see
>
> > http://www.shore.net/~heuermh/oneoff.tar.gz
>
> but they don't have any extra support for external programs.
>
>    michael
>
> >
> > Check it out if you are interested:
> > http://www.dbsr.duke.edu/software/blast/default.htm .  The full
> > source, javadocs, and binary class files are available.  Also, if this
> > seems appropriate for BioJava, I have no problem donating it to the
> > cause.  I think that at least the base classes, or some modification
> > of them, would be useful to others.
> >
> > Please email me with suggestions/comments,
> >
> > -Patrick McConnell
> > Duke Bioinformatics Shared Resource
> > mccon012@mc.duke.edu
> >
> >
> > _______________________________________________
> > Biojava-l mailing list  -  Biojava-l@biojava.org
> > http://biojava.org/mailman/listinfo/biojava-l
> >