[Biojava-l] some comments and wish list

Matthew Pocock mrp@sanger.ac.uk
Mon, 29 Jan 2001 18:18:00 +0000


Hi Bruce,

Thanks for your comments. I think that the main reason that BioJava lags 
behind BioPerl in features is that BioPerl was/is developed while being 
actively used to implement many real-life sequence analysis pipelines, 
whereas much of BioJava was developed by Thomas and me to address the 
code-reuse issues that came up while we were working on our PhDs. 
BioJava will never be as quick to use, robust or as feature-complete as 
BioPerl until developers use it daily for genome analysis pipelines. On 
the other hand, it is not fleshed out enough to do this job at the 
moment, so the brave pioneers will have a body of code to write - catch-22.

The flip side of this is that we have been able to explore elegant 
objects without the constraint of having to work 24/7, so we are now 
in a position to add lots of functionality to a relatively clean 
framework (which doesn't have much inherent lava-flow design). I think 
we trust our sequence object model to work & scale to the genomic level and 
into comparative genomics without a lot of heart-ache - we have been 
stress-testing by loading in the human genome this week - my 128 MB PC 
seems to handle this fine.

We are not stingy with read/write CVS access, so if you have code to 
add, or wish to help maintain existing code, we are more than happy to 
have you. A GAME parser would be great, as would any search program 
parsers (either using the SAX-based parsers or the search result 
objects). We really are just the sum of our contributors.

All the best,

Matthew