[Biojava-l] Blast?

Simon Brocklehurst simon.brocklehurst@CambridgeAntibody.com
Wed, 28 Feb 2001 17:50:30 +0000

Hi Mathieu,

"Wiepert, Mathieu" wrote:

> I currently have some .csh scripts that fork off as many blasts as there are
> sequences or sequence files in a list.

I don't think you have to do this - you can simply throw all the sequences at
blast in one go i.e. one blast process.
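
As a rough sketch of what "all the sequences in one go" could look like on the
input side, the snippet below joins several sequence texts into one
multi-sequence query that a single blast process could read. It assumes the
sequences are FASTA-formatted text, which is not stated above, and QueryMerger
is a hypothetical helper, not part of Blast or Biojava.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Blast or Biojava): joins several
// FASTA-formatted texts into one multi-sequence query.
class QueryMerger {

    // Concatenates the inputs, skipping empty ones and making sure each
    // record ends with exactly one newline so the records do not run together.
    static String merge(List<String> fastaTexts) {
        StringBuilder merged = new StringBuilder();
        for (String text : fastaTexts) {
            String trimmed = text.trim();
            if (trimmed.isEmpty()) {
                continue; // nothing to add for an empty input
            }
            merged.append(trimmed).append('\n');
        }
        return merged.toString();
    }

    public static void main(String[] args) {
        // Made-up example sequences, for illustration only.
        String query = merge(Arrays.asList(">seq1\nMKTAYIAK\n", ">seq2\nGSHMLEDP"));
        System.out.print(query);
    }
}
```

The merged text would then be written to one file and used as the query for a
single run, instead of forking one process per sequence.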

>  Anybody doing something like this?  Any better solutions, anyone change the
> blast itself?  Might be better to change the source of the data rather than
> parse the output, but not sure how feasible that is either.

Biojava has support for making it easy to use the output from Blast (including
multiple results from one blast process).  If you want to know more, have a look


From the JavaDoc,

A facade class allowing for direct SAX2-like parsing of the native output from
Blast-like bioinformatics software. Because the parser is SAX2 compliant,
application writers can simply pass XML ContentHandlers to the parser in order
to receive notification of SAX2 events.

The SAX2 events produced are as if the input to the parser was an XML file
validating against the biojava BlastLikeDataSetCollection DTD. There is no
requirement for an intermediate conversion of native output to XML format. An
application of the parsing framework, however, is to create XML format files
from native output files.

The biojava Blast-like parsing framework is designed to use minimal memory, so
that in principle, extremely large native outputs can be parsed and XML
ContentHandlers can listen only for small amounts of information.

and there is a basic tutorial at:


If you're interested in using this, I'm happy to help some more if need be...
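
To give a flavour of the "listen only for small amounts of information" idea
from the JavaDoc, here is a minimal sketch of a SAX2 ContentHandler that counts
one kind of element and ignores everything else. So that it runs without
Biojava, the event source here is the standard JDK XML parser reading a small
string; with Biojava, the same handler would instead be passed to the Blast-like
parser. The element names "Results", "Hit" and "Other" and the ElementCounter
class are made up for illustration and are not taken from the biojava DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Counts start-element events for one element name; all other events are
// ignored, so nothing from the document is kept in memory.
class ElementCounter extends DefaultHandler {
    private final String target;
    private int count = 0;

    ElementCounter(String target) {
        this.target = target;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (target.equals(qName)) {
            count++;
        }
    }

    int getCount() {
        return count;
    }

    // Stand-in event source (an assumption for this sketch): the JDK XML
    // parser reading a string, rather than a parser reading Blast output.
    static int count(String xml, String elementName) {
        try {
            ElementCounter handler = new ElementCounter(elementName);
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.getCount();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical element names, for illustration only.
        String xml = "<Results><Hit/><Hit/><Other/></Results>";
        System.out.println(count(xml, "Hit"));
    }
}
```

Because the handler only sees events as they stream past, the same pattern
should work however large the underlying output is.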

Simon M. Brocklehurst, Ph.D.
Head of Bioinformatics & Advanced IS
Cambridge Antibody Technology
The Science Park, Melbourn, Cambridgeshire, UK