[Biojava-l] blast xml parser

xling xling@tularik.com
Thu, 7 Jun 2001 22:58:57 -0700


Hi,

I just came back from the JavaOne conference in San Francisco.  One of the
talks was about xml-to-Java binding.

Sun has just released jaxb (http://java.sun.com/xml/jaxb/index.html), which
aims to provide xml-to-Java binding.

This makes me think of the biojava blast parser.  I have to admit that there
was a significant learning curve for me to get comfortable with the biojava
SAX parser.  Even after stepping through the parser implementation code and
understanding exactly how it works, I still found it of little help for
actual blast parsing (extracting alignments, start and end positions, etc.)
unless further work is put in.  Compared to bioperl, the biojava demo code
is just a "proof of concept" rather than a library that can be of real use.
Correct me if I am wrong.  So far biojava has not provided utilities for xml
binding, because it starts not from xml but from the raw blast result and
uses the SAX parser as a general tool to do the parsing.  The object binding
step after parsing is missing.  I am not sure whether anyone on the mailing
list has really put the biojava SAX blast parser into real practice.  If you
have, please share your experience with me.


In the past, this may have made sense.  But now the ncbi blast utility can
produce results in xml format.  I think biojava really should embrace this
and bind the xml to the current biojava objects, or use the jaxb package to
instantiate intermediate objects.  For parsing purposes, the intermediate
objects may be good enough.  I am thinking about trying this when I have
some spare time to take the jaxb stuff for a spin.
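
To make the "intermediate objects" idea concrete, here is a minimal sketch of
binding ncbi blast xml to plain Java objects.  It uses only the SAX classes
that ship with the JDK, not biojava or jaxb.  The class names BlastXmlSketch,
Hsp and HspHandler are hypothetical, and the sample document in main() is
made up for illustration; the element names (Hit_id, Hsp, Hsp_query-from and
so on) are assumed to follow the ncbi BlastOutput xml format.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class BlastXmlSketch {

    /** Hypothetical intermediate object: one HSP plus the id of its hit. */
    static class Hsp {
        String hitId;
        int queryFrom, queryTo, hitFrom, hitTo;
        double evalue;
    }

    /** Collects element text and fills in Hsp objects as elements close. */
    static class HspHandler extends DefaultHandler {
        final List<Hsp> hsps = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private String currentHitId;
        private Hsp current;

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            text.setLength(0);
            if (qName.equals("Hsp")) {
                current = new Hsp();
                current.hitId = currentHitId;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times per element, so accumulate.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            String value = text.toString().trim();
            text.setLength(0);
            if (qName.equals("Hit_id")) {
                currentHitId = value;
            } else if (current != null) {
                switch (qName) {
                    case "Hsp_query-from": current.queryFrom = Integer.parseInt(value); break;
                    case "Hsp_query-to":   current.queryTo = Integer.parseInt(value); break;
                    case "Hsp_hit-from":   current.hitFrom = Integer.parseInt(value); break;
                    case "Hsp_hit-to":     current.hitTo = Integer.parseInt(value); break;
                    case "Hsp_evalue":     current.evalue = Double.parseDouble(value); break;
                    case "Hsp":            hsps.add(current); current = null; break;
                    default:               break; // other fields are ignored in this sketch
                }
            }
        }
    }

    /** Parses a blast xml string and returns the HSPs in document order. */
    static List<Hsp> parse(String xml) throws Exception {
        HspHandler handler = new HspHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.hsps;
    }

    public static void main(String[] args) throws Exception {
        // Made-up fragment, trimmed to the elements the handler reads.
        String xml =
            "<Hit><Hit_id>example_hit</Hit_id><Hit_hsps><Hsp>"
          + "<Hsp_evalue>1e-5</Hsp_evalue>"
          + "<Hsp_query-from>3</Hsp_query-from><Hsp_query-to>40</Hsp_query-to>"
          + "<Hsp_hit-from>10</Hsp_hit-from><Hsp_hit-to>47</Hsp_hit-to>"
          + "</Hsp></Hit_hsps></Hit>";
        for (Hsp h : parse(xml)) {
            System.out.println(h.hitId + " query " + h.queryFrom + ".." + h.queryTo
                    + " hit " + h.hitFrom + ".." + h.hitTo + " evalue " + h.evalue);
        }
    }
}
```

A binding tool such as jaxb would aim to generate classes like Hsp from the
schema rather than have them written by hand.  Real ncbi output also carries
a DOCTYPE declaration, which this sketch makes no special provision for.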

Please comment on this.  As far as I am concerned, I/O parsing of
bioinformatics results, including blast and similar tools, is critical.


Bruce Ling, Ph.D.
Tularik, Inc.
http://www.tularik.com