[Biojava-l] accessing blast and genbank

Thomas Down td2 at sanger.ac.uk
Mon Feb 10 11:34:54 EST 2003


On Mon, Feb 10, 2003 at 10:48:53AM +0000, David Huen wrote:
>
> > I have read the mailing list and found the link to the "Blast Java
> > Library" by Patrick McConnell. Can anyone tell me if this is the right
> > way to go? As I understand, an external program is called (the original
> > blast software?). Is there a reason why whatever the blast program does
> > cannot be implemented in Java?
> >
> There is no reason why Blast cannot be implemented in Java if you really
> must have it.  But BLASTN, which I have looked into in detail, is a
> finely crafted piece of C code that uses comparison and pointer tricks
> for speed, and it is highly unlikely you could match it in Java.
> Similarly, for efficiency, you couldn't run it over the current
> SymbolLists; you would need some packed implementation so that multiple
> symbols can be compared with a single test.  In effect, to get anywhere
> near the C performance, you'll need to write C-style Java and treat
> sequences as bit-patterns, etc.
> 
> So it could be done but would anyone really want such an animal considering 
> that two perfectly fine and constantly maintained implementations exist 
> already in C?  Especially considering it will be slower and also be yet 
> another tool for us to maintain?  Perhaps someone might pick up this task 
> if sufficiently convincing arguments were raised why we might want this.

Yes, I have to agree with this -- I'd have to see pretty
compelling reasons to write yet another implementation of that
basic algorithm.  It's not too hard to launch blast processes
from Java with Runtime.exec.
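
As a rough illustration of launching blast from Java, here is a minimal
sketch.  It uses ProcessBuilder (the later, more convenient wrapper around
the same mechanism as Runtime.exec) rather than Runtime.exec itself.  The
class name BlastRunner and its methods are hypothetical, and the command
line assumes the legacy NCBI `blastall` executable is on the PATH; adjust
the arguments for whatever blast installation you actually have.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of BioJava.
class BlastRunner {

    // Build an argument list for the legacy NCBI blastall program.
    // Assumption: "blastall" is installed and on the PATH.
    static List<String> buildCommand(String program, String database, String queryFile) {
        return Arrays.asList("blastall", "-p", program, "-d", database, "-i", queryFile);
    }

    // Run an external command and return its output line by line.
    static List<String> run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so a full stderr pipe cannot block the child.
        builder.redirectErrorStream(true);
        Process process = builder.start();

        List<String> lines = new ArrayList<>();
        try (BufferedReader in =
                 new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("command exited with status " + exitCode);
        }
        return lines;
    }
}
```

Something like BlastRunner.run(BlastRunner.buildCommand("blastn", "mydb",
"query.fa")) would then hand back the raw report text, which still has to
be parsed separately.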

What I would add is that BioJava includes code for both exact
dynamic programming and fast word-matching algorithms.  Based
on these two, it's possible to build quite a wide range of
`vaguely blast-like' search methods.  I certainly wouldn't
recommend building a general-purpose blast clone, but if you
need something a bit different, and are more worried about
getting it up and running quickly than extracting the last
few percent of performance, BioJava with the DP and SSAHA
packages might be a good choice.
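
To make the "word matching plus dynamic programming" idea concrete, here
is a small sketch in plain Java.  It does not use the BioJava DP or SSAHA
APIs; the class WordSeedSearch and all of its methods are hypothetical
stand-ins, and the scoring scheme (linear gap penalty, simple
match/mismatch scores) is an assumption chosen for brevity.  Words of
length k are hashed from the subject, query words are looked up to find
the diagonal with the most hits, and a Smith-Waterman score does the exact
local comparison.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not BioJava code.
class WordSeedSearch {

    // Index every overlapping word of length k in the subject by start position.
    static Map<String, List<Integer>> indexWords(String subject, int k) {
        Map<String, List<Integer>> index = new HashMap<>();
        for (int i = 0; i + k <= subject.length(); i++) {
            index.computeIfAbsent(subject.substring(i, i + k), w -> new ArrayList<>()).add(i);
        }
        return index;
    }

    // Count word hits per diagonal (subject position minus query position)
    // and return the diagonal with the most hits, or Integer.MIN_VALUE if none.
    static int bestDiagonal(Map<String, List<Integer>> index, String query, int k) {
        Map<Integer, Integer> hitsPerDiagonal = new HashMap<>();
        for (int q = 0; q + k <= query.length(); q++) {
            List<Integer> hits = index.get(query.substring(q, q + k));
            if (hits == null) {
                continue;
            }
            for (int s : hits) {
                hitsPerDiagonal.merge(s - q, 1, Integer::sum);
            }
        }
        int best = Integer.MIN_VALUE;
        int bestCount = 0;
        for (Map.Entry<Integer, Integer> e : hitsPerDiagonal.entrySet()) {
            if (e.getValue() > bestCount) {
                bestCount = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    // Smith-Waterman local alignment score with a linear gap penalty.
    // mismatch and gap are expected to be negative.
    static int localScore(String a, String b, int match, int mismatch, int gap) {
        int[][] h = new int[a.length() + 1][b.length() + 1];
        int best = 0;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int diag = h[i - 1][j - 1]
                         + (a.charAt(i - 1) == b.charAt(j - 1) ? match : mismatch);
                int up = h[i - 1][j] + gap;
                int left = h[i][j - 1] + gap;
                h[i][j] = Math.max(0, Math.max(diag, Math.max(up, left)));
                best = Math.max(best, h[i][j]);
            }
        }
        return best;
    }
}
```

In a real search you would only run the DP step on a window around the
best diagonal rather than on the whole subject; that restriction is what
makes the word-matching stage pay off.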

     Thomas.


