[Biojava-l] SAX parser demo

Schreiber, Mark mark.schreiber at agresearch.co.nz
Wed Jun 25 20:06:46 EDT 2003


Hi -
 
It depends on how tightly you want to bind it to BioJava. If you don't need BioJava objects, just write a SAX handler that listens for the elements you want. If you do want BioJava objects, I would suggest modifying the parser to put the information into an Annotation object.
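
A minimal sketch of the first option (a plain SAX listener with no BioJava objects), assuming only the JDK's own SAX classes. The class name BlastTagListener and its collect method are hypothetical names made up for illustration and are not part of BioJava. The resolveEntity override is also an assumption: it hands the parser an empty DTD so that a DOCTYPE line in the input does not trigger a fetch.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not a BioJava class): collects the text of chosen leaf tags.
public class BlastTagListener extends DefaultHandler {
    private final Set<String> wanted;
    private final Map<String, List<String>> values = new LinkedHashMap<>();
    private StringBuilder text; // non-null only while inside a wanted element

    public BlastTagListener(Set<String> wanted) {
        this.wanted = wanted;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (wanted.contains(qName)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one element's text in several chunks, so append.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (text != null && wanted.contains(qName)) {
            values.computeIfAbsent(qName, k -> new ArrayList<>()).add(text.toString().trim());
            text = null;
        }
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // Assumption: supply an empty DTD instead of resolving the real one.
        return new InputSource(new StringReader(""));
    }

    // Parses the input and returns tag name -> values, in order of first appearance.
    public static Map<String, List<String>> collect(InputSource in, String... tags) throws Exception {
        BlastTagListener handler = new BlastTagListener(new HashSet<>(Arrays.asList(tags)));
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Hsp><Hsp_align-len>120</Hsp_align-len><Hsp_evalue>1e-5</Hsp_evalue></Hsp>";
        // prints {Hsp_align-len=[120]}
        System.out.println(collect(new InputSource(new StringReader(xml)), "Hsp_align-len"));
    }
}
```

The same call can be given any of the other tag names (Statistics_entropy, Hsp_pattern-from and so on). Binding the collected values to BioJava objects would be a separate step on top of this.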
 
- Mark
 

	-----Original Message----- 
	From: Russell Smithies [mailto:russell.smithies at xtra.co.nz] 
	Sent: Wed 25/06/2003 4:46 p.m. 
	To: smh1008 at cus.cam.ac.uk; Biojava-L at Biojava. Org 
	Cc: 
	Subject: Re: [Biojava-l] SAX parser demo
	
	

	Looks good, but it doesn't do what I need, though I don't think it was ever
	going to :-(
	
	The BLAST XML data has loads of info in it (I guess that's the reason for the
	format), but I want to be able to get at individual tags, not just hits.  For
	example, some of the stats data (Statistics_entropy, Statistics_eff-space,
	etc.) or other hit data (Hsp_align-len, Hsp_pattern-from, etc.) might be
	useful, instead of just hitID and e-value.
	I guess I'll have to implement some new bits (from
	SimpleSeqSimilaritySearchSubHit?), but I'm not exactly sure where.
	
	any ideas?
	
	thanx
	Russell
	
	----- Original Message -----
	From: "David Huen" <david.huen at ntlworld.com>
	To: "Russell Smithies" <russell.smithies at xtra.co.nz>; "Biojava-L at Biojava. Org" <biojava-l at biojava.org>
	Cc: <jinchen at ufl.edu>
	Sent: Wednesday, June 25, 2003 2:28 PM
	Subject: Re: [Biojava-l] SAX parser demo
	
	
	> Hi,
	> OK, I have uploaded a demo to CVS.  It is at biojava-live/demos/blastxml.
	> It's just a plain ripoff of Mark Schreiber's demo in Biojava In Anger
	> ported to use the BlastXML parser.  You will need to do a "cvs update -d"
	> to create the new directories for the demos and for the DTD directory.
	>
	> I have added a facade to the BlastXML parsing framework.  The facade is
	> called BlastXMLParserFacade and is used identically to the way the
	> existing BlastLikeSAXParser is used with blast text output.  I think this
	> will make it easier for users all round, since both have the same
	> interface.  You can
	> look in that class to see how the BJ parsing framework is actually set up.
	>
	> I won't have more time available to work on this for a bit but bug reports
	> are welcome for eventual fixes.  As previously mentioned, running multiple
	> sequence queries on a database with NCBI blast results in the
	> concatenation of all the Blast XML outputs, resulting in an almighty,
	> completely non-XML-compliant file (multiple <?xml ...?> and <!DOCTYPE>
	> declarations, for example).
	> Parsing those requires a hack I have previously described but it is ugly,
	> ugly, ugly.  Maybe the latest NCBI version might have fixed this problem
	> but I haven't looked.
	>
	> Best wishes,
	> David Huen
	> P.S. It is really really bedtime, guys.....
	> P.P.S There is an ugly entity resolver hack I will need to clean up later
	> too.
	>
	
	
	




