[Biojava-l] SAX parser demo

Russell Smithies russell.smithies at xtra.co.nz
Wed Jun 25 17:46:05 EDT 2003


Looks good, but it doesn't do what I need, though I don't think it was ever
going to :-(

The BLAST XML data has loads of info in it (I guess that's the reason for the
format), but I want to be able to get at individual tags, not just hits.  For
example, some of the stats data (Statistics_entropy, Statistics_eff-space,
etc.) or other hit data (Hsp_align-len, Hsp_pattern-from, etc.), rather than
just the hit ID and e-value, might be useful.
I guess I'll have to implement some new bits (starting from
SimpleSeqSimilaritySearchSubHit?), but I'm not exactly sure where.
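
To make the request concrete, here is a minimal sketch of pulling named tags
out of BLAST XML with the plain JDK SAX API.  It does not use the BioJava
parsing framework or BlastXMLParserFacade; the class name BlastTagCollector
and the sample values are made up for illustration.  Real BLAST XML also
carries a DOCTYPE, so a real run would probably need the DTD available
locally or an entity resolver, which this sketch leaves out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of BioJava): collects the text content of
// every element whose name is in a caller-supplied set.
class BlastTagCollector extends DefaultHandler {
    private final Set<String> wanted;
    private final Map<String, List<String>> values = new LinkedHashMap<>();
    private StringBuilder text; // non-null only while inside a wanted element

    BlastTagCollector(String... tagNames) {
        wanted = new HashSet<>(Arrays.asList(tagNames));
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (wanted.contains(qName)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver one text node in several chunks, so append.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (text != null && wanted.contains(qName)) {
            values.computeIfAbsent(qName, k -> new ArrayList<>()).add(text.toString().trim());
            text = null;
        }
    }

    Map<String, List<String>> values() {
        return values;
    }

    // Parses an XML string and returns tag name -> values in document order.
    static Map<String, List<String>> collect(String xml, String... tagNames) throws Exception {
        BlastTagCollector handler = new BlastTagCollector(tagNames);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values();
    }

    public static void main(String[] args) throws Exception {
        // Made-up fragment using element names from the BLAST XML format.
        String xml = "<BlastOutput>"
                + "<Hsp><Hsp_align-len>120</Hsp_align-len></Hsp>"
                + "<Hsp><Hsp_align-len>87</Hsp_align-len></Hsp>"
                + "<Statistics><Statistics_entropy>0.14</Statistics_entropy></Statistics>"
                + "</BlastOutput>";
        System.out.println(collect(xml, "Hsp_align-len", "Statistics_entropy"));
    }
}
```

This only gathers flat text values; it does not tie an Hsp_align-len back to
the hit it belongs to, which is the part that would presumably need new
sub-hit classes.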

Any ideas?

Thanks,
Russell

----- Original Message -----
From: "David Huen" <david.huen at ntlworld.com>
To: "Russell Smithies" <russell.smithies at xtra.co.nz>; "Biojava-L at Biojava.Org" <biojava-l at biojava.org>
Cc: <jinchen at ufl.edu>
Sent: Wednesday, June 25, 2003 2:28 PM
Subject: Re: [Biojava-l] SAX parser demo


> Hi,
> OK, I have uploaded a demo to CVS.  It is at biojava-live/demos/blastxml.
> It's just a plain ripoff of Mark Schreiber's demo in Biojava In Anger
> ported to use the BlastXML parser.  You will need to do a "cvs update -d"
> to create the new directories for the demos and for the DTD directory.
>
> I have added a facade to the BlastXML parsing framework.  The facade is
> called BlastXMLParserFacade and is used identically to the way the existing
> BlastLikeSAXParser is used with blast text output.  I think this will make
> it easier for users all round, since both have the same interface.  You can
> look in that class to see how the BJ parsing framework is actually set up.
>
> I won't have more time available to work on this for a bit, but bug reports
> are welcome for eventual fixes.  As previously mentioned, running multiple
> sequence queries against a database with NCBI BLAST concatenates all the
> BLAST XML outputs into one almighty file that is not well-formed XML
> (multiple <?xml ...?> and <!DOCTYPE> declarations, for example).
> Parsing those requires a hack I have previously described, but it is ugly,
> ugly, ugly.  Maybe the latest NCBI version has fixed this problem,
> but I haven't looked.
>
> Best wishes,
> David Huen
> P.S. It is really really bedtime, guys.....
> P.P.S. There is an ugly entity resolver hack I will need to clean up later
> too.
>
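
On the concatenated-output problem quoted above, one possible workaround is
to cut the file into separate documents at each XML declaration and parse
them one at a time.  The sketch below is not necessarily the hack David
described earlier; the class name ConcatenatedXmlSplitter is hypothetical,
and it assumes every report starts with "<?xml" and that the whole file fits
in memory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits text holding several XML documents back to
// back into one string per document, using the "<?xml" declaration as the
// boundary.  Assumes "<?xml" appears only at the start of each document.
class ConcatenatedXmlSplitter {
    private static final String MARKER = "<?xml";

    static List<String> split(String text) {
        List<String> docs = new ArrayList<>();
        int start = text.indexOf(MARKER);
        while (start >= 0) {
            // Look for the next declaration after the current one.
            int next = text.indexOf(MARKER, start + MARKER.length());
            String doc = (next >= 0) ? text.substring(start, next) : text.substring(start);
            docs.add(doc.trim());
            start = next;
        }
        return docs;
    }

    public static void main(String[] args) {
        // Made-up input standing in for two concatenated BLAST XML reports.
        String joined = "<?xml version=\"1.0\"?>\n<BlastOutput>first</BlastOutput>\n"
                + "<?xml version=\"1.0\"?>\n<BlastOutput>second</BlastOutput>\n";
        for (String doc : split(joined)) {
            System.out.println("--- document ---");
            System.out.println(doc);
        }
    }
}
```

Each returned string could then be handed to an XML parser separately.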
