[Biojava-l] A problem parsing Blast XML output (blastN vs. blastP)

Benoit VARVENNE varvenne at genoway.com
Fri Nov 24 16:19:27 UTC 2006


Hello,

I'm parsing BLAST results with BioJava 1.5 and a BlastXMLParserFacade, using
the code at the end of this mail.

I tried this with a blastN query and had no trouble.
However, when I did exactly the same thing with a blastP query, I got the
exception quoted at the end of this mail.

I've checked, and the two input files (blastn/blastp) seem to have the same
structure (except that one is for proteins, so the data differ). Both files
are attached for anyone familiar with this format.

Can someone help me? I don't understand why it works in one case and not in
the other.

Thanks a lot,
Cheers,

Benoît.


-------------------
My code:
-----
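import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.biojava.bio.program.sax.blastxml.BlastXMLParserFacade;
import org.biojava.bio.program.ssbind.BlastLikeSearchBuilder;
import org.biojava.bio.program.ssbind.SeqSimilarityAdapter;
import org.biojava.bio.search.SearchContentHandler;
import org.biojava.bio.seq.db.DummySequenceDB;
import org.biojava.bio.seq.db.DummySequenceDBInstallation;
import org.xml.sax.InputSource;

// blastFile is the XML file produced by my BLAST run
InputStream is = new FileInputStream(blastFile);

// make the parser for BLAST XML output
BlastXMLParserFacade parser = new BlastXMLParserFacade();
// make the SAX event adapter that will pass events to a handler
SeqSimilarityAdapter adapter = new SeqSimilarityAdapter();

// set the parser's SAX event adapter
parser.setContentHandler(adapter);

// the list to hold the SeqSimilaritySearchResults
List results = new ArrayList();

// create the SearchContentHandler that will build
// SeqSimilaritySearchResults in the results list
SearchContentHandler builder = new BlastLikeSearchBuilder(results,
    new DummySequenceDB("queries"), new DummySequenceDBInstallation());

// register builder with adapter
adapter.setSearchContentHandler(builder);

parser.parse(new InputSource(is)); // the exception is thrown here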

-------------------


-------------------
The exception:
----
  org.xml.sax.SAXParseException: Une déclaration XML peut commencer des entités uniquement.
  [roughly, in English: "An XML declaration may only begin entities."]
        at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3376)
        at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3364)
        at org.apache.crimson.parser.Parser2.maybePI(Parser2.java:1140)
        at org.apache.crimson.parser.Parser2.maybeMisc(Parser2.java:1266)
        at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:671)
        at org.apache.crimson.parser.Parser2.parse(Parser2.java:337)
        at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:448)
        at org.biojava.bio.program.sax.blastxml.BlastXMLParserFacade.parse(BlastXMLParserFacade.java:180)
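
A note on reading this exception: an XML parser generally reports this kind of error when an XML declaration (<?xml ...?>) appears anywhere other than the very start of the input, for example after leading whitespace or a second time further down the file. Whether that is what happens with the blastP file here is only an assumption; the sketch below is one possible way to check. The class and method names (XmlDeclCheck, declarationOffsets) are hypothetical and not part of BioJava.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BioJava: lists where "<?xml " occurs in a file.
public class XmlDeclCheck {

    // Returns every offset at which the text "<?xml " starts.
    // The trailing space avoids matching instructions such as <?xml-stylesheet ...?>.
    public static List<Integer> declarationOffsets(String text) {
        List<Integer> offsets = new ArrayList<>();
        int from = 0;
        while (true) {
            int idx = text.indexOf("<?xml ", from);
            if (idx < 0) {
                break;
            }
            offsets.add(idx);
            from = idx + 1;
        }
        return offsets;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java XmlDeclCheck <blast-output.xml>");
            return;
        }
        // ISO-8859-1 maps each byte to one char, so offsets are byte offsets.
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        String text = new String(bytes, StandardCharsets.ISO_8859_1);

        List<Integer> offsets = declarationOffsets(text);
        System.out.println("XML declarations found: " + offsets.size());
        for (int offset : offsets) {
            System.out.println("  at byte offset " + offset);
        }
        if (offsets.size() != 1 || offsets.get(0) != 0) {
            System.out.println("Expected exactly one declaration, at offset 0.");
        }
    }
}
```

Running it on both the blastN and the blastP file and comparing the reported offsets would show whether the two files really start the same way.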




