[Biojava-l] SAX parser demo
Russell Smithies
russell.smithies at xtra.co.nz
Wed Jun 25 17:46:05 EDT 2003
Looks good, but it doesn't do what I need, and I don't think it was ever
going to :-(
The BLAST XML data has loads of info in it (I guess that's the reason for the
format), but I want to be able to get at individual tags, not just hits. For
example, some of the stats data (Statistics_entropy, Statistics_eff-space,
etc.) or other hit data (Hsp_align-len, Hsp_pattern-from, etc.), rather than
just the hit ID and e-value, might be useful.
I guess I'll have to implement some new bits (starting from
SimpleSeqSimilaritySearchSubHit?), but I'm not exactly sure where.
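
To make the request concrete, here is a minimal sketch of the kind of access
I mean. It uses only the plain JDK SAX API, not the BioJava parsing
framework, so it says nothing about where this would hook into BioJava. The
class name BlastTagCollector and its collect() method are made up for
illustration, and the empty-DTD entity resolver is an assumption so that the
parser does not go looking for the NCBI DTD.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: collects the text of chosen leaf elements in BLAST XML. */
class BlastTagCollector extends DefaultHandler {
    private final Set<String> wanted;
    private final List<String[]> found = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean capturing = false;

    BlastTagCollector(Set<String> wanted) {
        this.wanted = wanted;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Only leaf elements such as Hsp_align-len are handled; a nested
        // child element would reset the capture.
        capturing = wanted.contains(qName);
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one text node in several chunks, so append.
        if (capturing) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (capturing && wanted.contains(qName)) {
            found.add(new String[] { qName, text.toString().trim() });
        }
        capturing = false;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // Assumption: hand back an empty DTD instead of fetching the real one.
        return new InputSource(new StringReader(""));
    }

    /** Returns {tagName, text} pairs in document order. */
    static List<String[]> collect(String xml, String... tags) {
        BlastTagCollector handler = new BlastTagCollector(new HashSet<>(Arrays.asList(tags)));
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse BLAST XML", e);
        }
        return handler.found;
    }
}
```

Calling BlastTagCollector.collect(xml, "Statistics_entropy", "Hsp_align-len")
would give one {name, value} pair per matching element. It does not tie the
values back to their hit or HSP, which is the part I would still want from
the BioJava objects.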
Any ideas?
thanx
Russell
----- Original Message -----
From: "David Huen" <david.huen at ntlworld.com>
To: "Russell Smithies" <russell.smithies at xtra.co.nz>;
    "Biojava-L at Biojava.Org" <biojava-l at biojava.org>
Cc: <jinchen at ufl.edu>
Sent: Wednesday, June 25, 2003 2:28 PM
Subject: Re: [Biojava-l] SAX parser demo
> Hi,
> OK, I have uploaded a demo to CVS. It is at biojava-live/demos/blastxml.
> It's just a plain ripoff of Mark Schreiber's demo in Biojava In Anger
> ported to use the BlastXML parser. You will need to do a "cvs update -d"
> to create the new directories for the demos and for the DTD directory.
>
> I have added a facade to the BlastXML parsing framework. The facade is
> called BlastXMLParserFacade and is used identically to the way the
> existing BlastLikeSAXParser is used with blast text output. I think this
> will make it easier for users all round, since both have the same
> interface. You can look in that class to see how the BJ parsing framework
> is actually set up.
>
> I won't have more time available to work on this for a bit, but bug reports
> are welcome for eventual fixes. As previously mentioned, running multiple
> sequence queries on a database with NCBI blast results in the concatenation
> of all the Blast XML outputs into one almighty file that is not
> well-formed XML (multiple <?xml?> and <!DOCTYPE> declarations, for
> example). Parsing those requires a hack I have previously described, but
> it is ugly, ugly, ugly. Maybe the latest NCBI version has fixed this
> problem, but I haven't looked.
>
> Best wishes,
> David Huen
> P.S. It is really really bedtime, guys.....
> P.P.S. There is an ugly entity resolver hack I will need to clean up later
> too.
>
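
On the concatenated-output problem mentioned above: the hack David refers to
is not shown in this thread, so the following is only one possible
workaround, not his. The idea is to cut the file into separate documents at
each XML declaration and then parse each piece on its own. The class name
ConcatenatedXmlSplitter is made up, and the sketch assumes the whole file
fits in memory and that every report starts with "<?xml version".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits concatenated XML reports into single documents. */
class ConcatenatedXmlSplitter {
    // Assumption: each report begins with an XML declaration of this form.
    private static final String MARKER = "<?xml version";

    static List<String> split(String concatenated) {
        List<String> docs = new ArrayList<>();
        int start = concatenated.indexOf(MARKER);
        while (start >= 0) {
            // Look for the next declaration after the current one.
            int next = concatenated.indexOf(MARKER, start + MARKER.length());
            String doc = (next < 0)
                    ? concatenated.substring(start)
                    : concatenated.substring(start, next);
            docs.add(doc.trim());
            start = next;
        }
        // Input without any declaration yields an empty list.
        return docs;
    }
}
```

Each string in the returned list could then be handed to an XML parser
separately. For very large result files a streaming version would be needed
instead.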