[Biojava-l] SAX parser demo

jinchen at ufl.edu jinchen at ufl.edu
Sun Jun 29 15:00:14 EDT 2003


Sorry to bring this topic up again. There is one sample in my zip file. My own
simple XML parser works on my sample XML. However, I get this error when
I try to use your parser:

staxenv org.biojava.bio.program.sax.blastxml.BlastXMLParser@1cd2e5f
org.xml.sax.SAXParseException: The markup declarations contained or pointed to
by the document type declaration must be well-formed.
        at org.apache.xerces.framework.XMLParser.reportError(XMLParser.java:1060)
        at org.apache.xerces.framework.XMLDTDScanner.reportFatalXMLError(XMLDTDScanner.java:651)
        at org.apache.xerces.framework.XMLDTDScanner.scanDecls(XMLDTDScanner.java:1523)
        at org.apache.xerces.framework.XMLDocumentScanner.scanDoctypeDecl(XMLDocumentScanner.java:2199)
        at org.apache.xerces.framework.XMLDocumentScanner.access$0(XMLDocumentScanner.java:2152)
        at org.apache.xerces.framework.XMLDocumentScanner$PrologDispatcher.dispatch(XMLDocumentScanner.java:883)
        at org.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:381)
        at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:952)
        at org.biojava.bio.program.sax.blastxml.BlastXMLParserFacade.parse(BlastXMLParserFacade.java:167)
        at BlastParser3.main(BlastParser3.java:47)

Could you let me know why?
Thanks,
Jin
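
A note on the exception above: the trace shows the failure inside
scanDoctypeDecl/scanDecls, that is, while the parser is reading the DTD named
in the DOCTYPE line, not the BLAST results themselves. One common cause (an
assumption here, not confirmed for this file) is that the DTD cannot be found
or that something other than a DTD is returned in its place. The quoted
message below mentions a DTD directory created by "cvs update -d", so that is
worth checking first. The sketch below uses only the standard SAX API to show
how an EntityResolver can hand the parser an empty DTD so that the external
one is never fetched. The class and method names (DtdSkipDemo, countElements)
are hypothetical, the DOCTYPE line is only an example of what NCBI BLAST XML
output typically carries, and whether BlastXMLParserFacade lets a caller
install a resolver like this is not something this thread establishes.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class DtdSkipDemo {

    // Counts start tags so we can see that the parse ran to completion.
    static class Counter extends DefaultHandler {
        int elements = 0;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            elements++;
        }
    }

    // Parses the given XML text and returns the number of elements seen.
    static int countElements(String xml) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        Counter counter = new Counter();
        reader.setContentHandler(counter);
        // Supply an empty DTD instead of letting the parser load the
        // external one named in the DOCTYPE declaration.
        reader.setEntityResolver(
            (publicId, systemId) -> new InputSource(new StringReader("")));
        reader.parse(new InputSource(new StringReader(xml)));
        return counter.elements;
    }

    public static void main(String[] args) throws Exception {
        // Illustrative input only; the DOCTYPE line is an assumption.
        String xml = "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE BlastOutput PUBLIC \"-//NCBI//NCBI BlastOutput/EN\""
            + " \"NCBI_BlastOutput.dtd\">\n"
            + "<BlastOutput><BlastOutput_program>blastn"
            + "</BlastOutput_program></BlastOutput>";
        System.out.println(countElements(xml)); // prints 2
    }
}
```

If a file parses this way but fails with the real DTD in place, the DTD
lookup is the likely culprit rather than the document body.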

Quoting David Huen <david.huen at ntlworld.com>:

> Hi,
> OK, I have uploaded a demo to CVS.  It is at biojava-live/demos/blastxml.  
> It's just a plain ripoff of Mark Schreiber's demo in Biojava In Anger 
> ported to use the BlastXML parser.  You will need to do a "cvs update -d" 
> to create the new directories for the demos and for the DTD directory.
> 
> I have added a facade to the BlastXML parsing framework.  The facade is 
> called BlastXMLParserFacade and is used identically to the way the existing 
> BlastLikeSAXParser is used with blast text output.  I think this will make 
> it easier for users all round: that both have the same interface.  You can 
> look in that class to see how the BJ parsing framework is actually set up.
> 
> I won't have more time available to work on this for a bit but bug reports 
> are welcome for eventual fixes.  As previously mentioned, running multiple 
> sequence queries on a database with NCBI blast results in the concatenation 
> of all the Blast XML outputs resulting in an almighty completely non-XML 
> compliant file (multiple <?xml ...?> and <!DOCTYPE> declarations, for example).  
> Parsing those requires a hack I have previously described but it is ugly, 
> ugly, ugly.  Maybe the latest NCBI version might have fixed this problem 
> but I haven't looked.
> 
> Best wishes,
> David Huen
> P.S. It is really really bedtime, guys.....
> P.P.S There is an ugly entity resolver hack I will need to clean up later 
> too.
> 
> 
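
The concatenation problem described in the quoted message (several complete
XML documents back to back in one file) can be illustrated with a small
splitter. This is not the hack David refers to, which is not shown in this
message; it is only a minimal sketch, assuming the whole file fits in memory
and that the text "<?xml" appears only at the start of each document. The
class and method names (ConcatenatedXmlSplitter, split) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ConcatenatedXmlSplitter {

    // Cuts the text at every "<?xml" marker and returns one string per
    // document. Anything before the first marker is ignored.
    static List<String> split(String text) {
        List<String> docs = new ArrayList<>();
        String marker = "<?xml";
        int start = text.indexOf(marker);
        while (start >= 0) {
            int next = text.indexOf(marker, start + marker.length());
            String doc = (next < 0)
                ? text.substring(start)
                : text.substring(start, next);
            docs.add(doc.trim());
            start = next;
        }
        return docs;
    }

    public static void main(String[] args) {
        String joined = "<?xml version=\"1.0\"?><a/>\n"
                      + "<?xml version=\"1.0\"?><b/>\n";
        List<String> docs = split(joined);
        System.out.println(docs.size()); // prints 2
    }
}
```

Each piece can then be handed to a parser on its own, for example wrapped as
new InputSource(new StringReader(piece)).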



