[Biojava-l] Job / Task Scheduler for Biojava (Webservice)

Schreiber, Mark mark.schreiber at agresearch.co.nz
Tue Nov 11 16:46:00 EST 2003

Hi Ralf,

I think this is an interesting proposal. To do this properly, I definitely think you would need to back it with J2EE technology and blend in some biojava where appropriate. We have been doing quite a bit of work recently to make biojava play better with J2EE, especially on the serialization side of things. These updates will be available in biojava 1.3.1, which will be out soon.
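
To illustrate why serialization matters here: objects passed by value between JVMs in a J2EE setup have to implement java.io.Serializable. The following is only a rough sketch of that round trip using plain JDK streams. SequenceRecord is a hypothetical class made up for the example and is not part of the biojava API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical value class for illustration; not part of the biojava API. */
final class SequenceRecord implements Serializable {
    private static final long serialVersionUID = 1L;

    final String name;
    final String residues;

    SequenceRecord(String name, String residues) {
        this.name = name;
        this.residues = residues;
    }

    /** Writes the object to a byte array, as happens before it is sent to another JVM. */
    static byte[] toBytes(SequenceRecord record) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(record);
        }
        return buffer.toByteArray();
    }

    /** Rebuilds an independent copy of the object from its serialized bytes. */
    static SequenceRecord fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (SequenceRecord) in.readObject();
        }
    }
}
```

A class that holds a non-serializable field would fail in writeObject with a NotSerializableException, which is the kind of problem that only shows up once objects start crossing JVM boundaries.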

I've made some more comments below.

> Starting with L.Stein's commentary in Nature "Creating a 
> Bioinformatics Nation" and by reading the available material 
> on the OmniGene Project one might have guessed that Java 
> would be an ideal Platform for a new generation of data and 
> task integrating Middleware Software.

I think you're right. You would need something like J2EE to make it bulletproof if you envision multiple transactions with multiple clients, especially if any of them have write access to your data. This is probably beyond Perl. You could use .NET, but then you are tied to one OS and you won't be able to easily use bioperl or biojava.

> However the OmniGene effort has been transferred into the 
> non/public corporate space and even before there was no 
> widespread adoption of this platform (judged by the 
> sourceforge traffic, the lack of citations..)

Are you sure it's no longer open source? I'm surprised.

> Recently I discovered the BioPipe project and its 
> accompanying publication in Genome Research. The project is 
> mature, tightly integrated with Bioperl and almost completely
> fulfills the above stated requirements.
> However BioPipe is based on Perl and now I wonder if Java 
> would not be more advantageous as a platform of this kind.

BioPipe is a protocol definition. The core engine is written in Perl/BioPerl. It may be possible to write a BioPipe engine in Java, although I've thought about this and I wonder if the BioPipe schema may be a bit Perl-centric. Even so, if you do build an enterprise bioinformatics system based on Java, a worthy goal would be a module that can process and execute BioPipe protocols.

> I will try to list the advantages of JAVA and Perl in this 
> application below and hope for your comments:
> (1) Compared to Perl Java has advanced Object Orientation
> support which allows for more transparent and modular 
> architectures. Development tools like Eclipse/Omondo-UML even
> increase this advantage.

True, if you do it right.

> (2) Component Transaction Monitors like the Application Server JBOSS
> (j2ee,ejb) are an ideal platform for the Management of 
> multiple user / multiple task scenarios. The j2ee-technology 
> is successfully used in many similar applications in other 
> industries. Advanced client applications could really benefit 
> from Object Remoting provided by the J2ee Platform.

Very true. I think to do it any other way would be to reinvent the wheel and cause several major headaches. This would be the strongest argument for using Java.

> (3) Based on my limited knowledge the Java Platform appears to have a
> much tighter (more failsafe?) incorporation of XML 
> (XML-Schema - class binding with JAXB) and Webservice 
> Technologies (SOAP) (Apache Tomcat/AXIS).

Also true, but unfortunately the code gets a bit bloated. Compare an Axis SOAP application to a Perl or Python one. Fortunately a lot of this code is boilerplate that is easily autogenerated and doesn't need much maintaining.

> (4) There are several workflow design and management tools 
> even with graphic editors. Integration of these j2ee based
> projects might allow big advantages to this part. 

I don't have much experience here, so I can't comment.

> I see 2 major disadvantages for Java:
> (1) bioinformatics tools are typically command line tools. 
> The Perl on Unix platform is the best way to invoke such 
> tools from a program. Java's platform independence appears to 
> be the source for its weakness in this field.

True, but BioJava has introduced the org.biojava.utils.ExecRunner classes to execute other applications, and they seem to perform very well. Currently they are in biojava-live, though I think they could be moved into biojava 1.3.1.
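
For comparison, here is a minimal sketch of launching a command-line tool from plain Java. It uses the JDK's ProcessBuilder rather than ExecRunner, whose API is not shown in this thread. The ToolRunner and Result names are made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical helper for illustration; this is not the biojava ExecRunner. */
final class ToolRunner {

    /** Exit code plus everything the tool wrote to stdout and stderr. */
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    /** Runs the command (program followed by its arguments) and waits for it to finish. */
    static Result run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so a single reader drains both streams.
        builder.redirectErrorStream(true);
        Process process = builder.start();
        // The tool gets no input, so close its stdin right away.
        process.getOutputStream().close();

        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            // Read until the tool closes its output; this must happen before waitFor().
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        int exitCode = process.waitFor();
        return new Result(exitCode, output.toString());
    }
}
```

Passing the program and its arguments as a list sidesteps shell quoting differences between platforms, and draining the output before calling waitFor() keeps the child process from blocking on a full pipe buffer.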

> (2) the bioperl project has a far bigger codebase, and more 
> contributors than any JAVA Bioinformatics efforts like 
> Biojava and Omnigene.

True, though biojava is growing.

> I wonder if Java will ever become a significant technology 
> for public / open source bioinformatics projects? It seems 
> like the existing headstart perl based projects now have 
> outweighs any advantages the Java Technology offers.

Who knows. Almost everyone who comes through a university computer or bioinformatics program will be taught Java and possibly Perl. Java is much more attractive for industry, and there have been some useful additions to biojava from industry sources. Perl has had the advantage of being a text processing language that lends itself to bioinformatics. I'm in awe of the people who use Perl for large scale projects. Seems like a nightmare to me.

- Mark
