[Biojava-l] Parsing MegaBLAST output files?

mark.schreiber at group.novartis.com
Mon Nov 22 19:45:38 EST 2004


Hello -

MegaBLAST is not officially supported. That doesn't mean it won't work; it
just means we don't know whether it will. If its output isn't too different
from normal BLAST output, it probably will.

The BlastLikeSAXParser has two modes, lazy and strict. If you call
setModeLazy() before parsing, it won't care whether it recognises the
format as one that is tried and tested; it will attempt to parse the file
anyway. You should carefully check a few results, though, to make sure the
parse is correct. If things work, let us know so we can add MegaBLAST to
the list of trusted programs.
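
Something along these lines should do it. This is an untested sketch that
reuses the setup from your message and only adds the setModeLazy() call; the
class name LazyMegaBlastParse and the file-path argument are placeholders of
mine, not part of BioJava:

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.biojava.bio.program.sax.BlastLikeSAXParser;
import org.biojava.bio.program.ssbind.BlastLikeSearchBuilder;
import org.biojava.bio.program.ssbind.SeqSimilarityAdapter;
import org.biojava.bio.search.SearchContentHandler;
import org.biojava.bio.seq.db.DummySequenceDB;
import org.biojava.bio.seq.db.DummySequenceDBInstallation;
import org.xml.sax.InputSource;

// Hypothetical wrapper class; only the BioJava calls inside are real API.
public class LazyMegaBlastParse {

    // Parses a BLAST-like report in lazy mode and returns the search results.
    public static List parse(String path) throws Exception {
        InputStream is = new FileInputStream(path);
        try {
            BlastLikeSAXParser parser = new BlastLikeSAXParser();
            // Lazy mode: attempt the parse even if the program/version
            // in the report is not one the framework has been tested on.
            parser.setModeLazy();

            SeqSimilarityAdapter adapter = new SeqSimilarityAdapter();
            parser.setContentHandler(adapter);

            List results = new ArrayList();
            SearchContentHandler builder =
                new BlastLikeSearchBuilder(results,
                                           new DummySequenceDB("queries"),
                                           new DummySequenceDBInstallation());
            adapter.setSearchContentHandler(builder);

            parser.parse(new InputSource(is));
            return results;
        } finally {
            is.close();
        }
    }

    public static void main(String[] args) throws Exception {
        List results = parse(args[0]);
        System.out.println(results.size() + " result(s) parsed");
    }
}
```

Compare the number of results and a handful of hits against the raw
MegaBLAST file by eye before trusting the output.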

Hope this helps,

Mark





James Diggans <jdiggans at excelsiortech.com>
Sent by: biojava-l-bounces at portal.open-bio.org
11/22/2004 02:38 PM

 
        To:     BioJava <biojava-l at biojava.org>
        cc:     (bcc: Mark Schreiber/GP/Novartis)
        Subject:        [Biojava-l] Parsing MegaBLAST output files?


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



All, I'm attempting to use BioJava to parse the output from NCBI's
commandline MegaBLAST and receiving an error:

'Could not recognise the format of this file as one supported by the
framework.'

in a SAXException thrown by BlastLikeSAXParser. An old post to the
mailing list:

http://www.biojava.org/pipermail/biojava-dev/2002-October/000150.html

seems to indicate that this was fixed long ago via this commit to CVS:

http://cvs.biojava.org/cgi-bin/viewcvs/viewcvs.cgi/biojava-live/src/org/biojava/bio/program/ssbind/HeaderStAXHandler.java.diff?r1=1.3&r2=1.4&cvsroot=biojava

The MegaBLAST file I'm trying to parse is clean and my attempt at a
parse consists of (largely pulled from the recipe from BioJava in Anger):

- ------------------
InputStream is = new FileInputStream(blastResult);

BlastLikeSAXParser parser = new BlastLikeSAXParser();
SeqSimilarityAdapter adapter = new SeqSimilarityAdapter();
parser.setContentHandler(adapter);

alignmentResults = new ArrayList();
SearchContentHandler builder =
        new BlastLikeSearchBuilder(alignmentResults,
                                   new DummySequenceDB("queries"),
                                   new DummySequenceDBInstallation());

adapter.setSearchContentHandler(builder);

parser.parse(new InputSource(is));
- ------------------

Any ideas on why I'm getting the SAXException? Thanks ...
- -j

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3-nr1 (Windows XP)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBoYlc75jgGJzUhNkRAu8zAJ9gTNoPouk4/29EDpWKcQVx5EB34gCg2MkD
DndldC3zi3bD2QKWgqMNOxs=
=TS47
-----END PGP SIGNATURE-----
_______________________________________________
Biojava-l mailing list  -  Biojava-l at biojava.org
http://biojava.org/mailman/listinfo/biojava-l




