[Biojava-l] A problem parsing Blast XML output (blastN vs. blastP)

Benoit VARVENNE varvenne at genoway.com
Fri Nov 24 16:19:27 UTC 2006


I'm parsing BLAST results using BioJava 1.5 and a BlastXMLParserFacade, with
the code given at the end of this mail.

I tried this with a blastn query and had no trouble.
However, when I did exactly the same thing with a blastp query, I got the
exception quoted at the end of this mail.

I have checked, and the two input files (blastn/blastp) seem to have the same
structure (except that one is for proteins, so the data differ). (Please find
them attached if you are familiar with this format.)

Can someone help me? I don't understand why it works in one case and not in
the other.

Thanks a lot,


My code:
      InputStream is = new FileInputStream(blastFile);
      // blastFile is the XML file, output of my blast

      //make a BlastXMLParserFacade
      BlastXMLParserFacade parser = new BlastXMLParserFacade();
      //make the SAX event adapter that will pass events to a Handler.
      SeqSimilarityAdapter adapter = new SeqSimilarityAdapter();

      //set the parser's SAX event adapter

      //The list to hold the SeqSimilaritySearchResults
      List results = new ArrayList();

      //create the SearchContentHandler that will build
      //in the results List
      SearchContentHandler builder = new BlastLikeSearchBuilder(results,
          new DummySequenceDB("queries"), new

      //register builder with adapter

      parser.parse(new InputSource(is)); // the exception is thrown here


The exception (the French message reads "XML declaration may only begin
entities"):
  org.xml.sax.SAXParseException: Une déclaration XML peut commencer des entités uniquement.
        at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3376)
        at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3364)
        at org.apache.crimson.parser.Parser2.maybePI(Parser2.java:1140)
        at org.apache.crimson.parser.Parser2.maybeMisc(Parser2.java:1266)
        at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:671)
        at org.apache.crimson.parser.Parser2.parse(Parser2.java:337)
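
A parser usually reports this kind of error when an "<?xml ...?>" declaration
appears anywhere other than the very first bytes of the input, for example
after a blank line or where several XML documents have been concatenated into
one file. Whether that is the case for the blastp file here is an assumption,
not something confirmed in this mail. A minimal sketch to check it, using only
the JDK, follows; the class and method names (XmlDeclCheck,
declarationOffsets) are made up for illustration, and it assumes an
ASCII-compatible encoding such as UTF-8.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BioJava: lists where "<?xml " occurs in a file.
class XmlDeclCheck {

    /** Returns the byte offsets at which the marker "<?xml " starts in the stream. */
    static List<Long> declarationOffsets(InputStream in) throws IOException {
        byte[] marker = "<?xml ".getBytes(StandardCharsets.US_ASCII);
        List<Long> offsets = new ArrayList<Long>();
        int matched = 0; // number of marker bytes matched so far
        long pos = 0;    // offset of the byte currently being examined
        int b;
        while ((b = in.read()) != -1) {
            if (b == marker[matched]) {
                matched++;
                if (matched == marker.length) {
                    offsets.add(pos - marker.length + 1);
                    matched = 0;
                }
            } else {
                // '<' occurs only at the start of the marker, so a mismatch
                // can restart the match at most one byte in.
                matched = (b == marker[0]) ? 1 : 0;
            }
            pos++;
        }
        return offsets;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the BLAST XML output file to inspect.
        try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]))) {
            System.out.println(declarationOffsets(in));
        }
    }
}
```

If the list contains anything other than a single offset 0 (or 3, when the
file starts with a UTF-8 byte order mark), the file has a declaration in a
position that a strict parser will reject.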
