[Biojava-dev] Re: [Biojava-l] Parsing MegaBLAST output files?

mark.schreiber at group.novartis.com mark.schreiber at group.novartis.com
Tue Nov 23 20:43:06 EST 2004


Hello All -

This sounds like a genuine bug in BlastLikeSAXParser that only shows up
with MegaBLAST output when lazy parsing is turned on. Could someone patch
this? (It could take me a long time to get around to it.)

- Mark

----- Forwarded by Mark Schreiber/GP/Novartis on 11/24/2004 09:41 AM -----


James Diggans <jdiggans at excelsiortech.com>
11/23/2004 10:12 PM

 
        To:     Mark Schreiber/GP/Novartis at PH
        cc: 
        Subject:        Re: [Biojava-l] Parsing MegaBLAST output files?




Yes, I'm using the biojava-1.4pre3 jar file. Looking into things a bit
more in CVS, you're correct that lazy mode does accept MegaBLAST, but
version checking is done in *two* places and the second one is failing
for me. In BlastLikeSAXParser.java, the relevant section is:

-------------
tValidFormat = oVersion.assignProgramAndVersion(poLine);

if (!oVersion.isSupported()) {
    throw (new SAXException("Program "
            + oVersion.getProgramString()
            + " Version "
            + oVersion.getVersionString()
            + " is not supported by the biojava blast-like "
            + "parsing framework"));
}
-------------

In my case, BlastLikeVersionSupport.java sets tValidFormat to *false*,
while oVersion.isSupported() returns true because of the lazy flag. So
the SAXException thrown above is *not* the one I'm seeing. The one I'm
seeing is thrown much later, after the BLAST file has been completely
parsed (line 181 in BlastLikeSAXParser.java), by a check that relies on
the value of tValidFormat, which the BlastLikeVersionSupport instance
set to false.
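
The interaction between the two checks can be sketched in a few lines of
standalone Java. This is only an illustration of the behaviour described
above, not the biojava source: the classes LazyCheckSketch and
VersionSupport, the isLazy() accessor, the set of "recognised" program
names, the sample header line and the string results (used here in place
of the SAXExceptions) are all assumptions made for the example. The second
method shows one possible way the late check could be made to respect the
lazy flag.

```java
// Hypothetical stand-ins for illustration only; not the actual biojava classes.
public class LazyCheckSketch {

    /** Simplified stand-in for the version-support object (oVersion above). */
    static class VersionSupport {
        private final boolean lazy;
        private boolean recognised;

        VersionSupport(boolean lazy) {
            this.lazy = lazy;
        }

        /** Reports whether the program is recognised; ignores the lazy flag. */
        boolean assignProgramAndVersion(String headerLine) {
            // Assumption: only these two program names count as "recognised" here.
            recognised = headerLine.startsWith("BLASTN")
                    || headerLine.startsWith("BLASTP");
            return recognised;
        }

        /** Honours the lazy flag, so an unrecognised program passes in lazy mode. */
        boolean isSupported() {
            return lazy || recognised;
        }

        boolean isLazy() {
            return lazy;
        }
    }

    /** Mirrors the reported behaviour: the late check looks only at tValidFormat. */
    public static String parseAsReported(String headerLine, boolean lazy) {
        VersionSupport oVersion = new VersionSupport(lazy);
        boolean tValidFormat = oVersion.assignProgramAndVersion(headerLine);
        if (!oVersion.isSupported()) {
            return "rejected-early"; // first check, relaxed by the lazy flag
        }
        // ... the whole file would be parsed here ...
        if (!tValidFormat) {
            return "rejected-late"; // second check, no provision for laziness
        }
        return "accepted";
    }

    /** One possible fix: let the lazy flag relax the second check as well. */
    public static String parseWithLazyAwareCheck(String headerLine, boolean lazy) {
        VersionSupport oVersion = new VersionSupport(lazy);
        boolean tValidFormat =
                oVersion.assignProgramAndVersion(headerLine) || oVersion.isLazy();
        if (!oVersion.isSupported()) {
            return "rejected-early";
        }
        if (!tValidFormat) {
            return "rejected-late";
        }
        return "accepted";
    }

    public static void main(String[] args) {
        String header = "MEGABLAST 2.2.9"; // made-up header line for the example
        System.out.println(parseAsReported(header, true));
        System.out.println(parseWithLazyAwareCheck(header, true));
    }
}
```

In this sketch, lazy mode gets an unrecognised program past the first
check but not the second, which matches the late failure described above.
With the flag folded into tValidFormat, both checks agree.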

So it seems there are *two* format checks and, in the case of MegaBLAST,
it's the second that fails. That second check uses the return value of:

                 oVersion.assignProgramAndVersion(poLine);

which has no provision for laziness. Hope that sheds a little more light;
my thanks for your attention. I've pasted the stack trace below.
-j

45859 [main] ERROR [my package].MegaBLASTGenerator  - SAX event mangled: Could not recognise the format of this file as one supported by the framework.
org.xml.sax.SAXException: Could not recognise the format of this file as one supported by the framework.
        at org.biojava.bio.program.sax.BlastLikeSAXParser.parse(BlastLikeSAXParser.java:182)
        at [my package].MegaBLASTGenerator.parseResults(MegaBLASTGenerator.java:140)
        at [my package].MegaBLASTGenerator.executeAlignment(MegaBLASTGenerator.java:103)
        at [my package].MegaBLASTGeneratorTest.testSmallMegaBLAST(MegaBLASTGeneratorTest.java:44)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at junit.framework.TestCase.runTest(TestCase.java:154)
        at junit.framework.TestCase.runBare(TestCase.java:127)
        at junit.framework.TestResult$1.protect(TestResult.java:106)
        at junit.framework.TestResult.runProtected(TestResult.java:124)
        at junit.framework.TestResult.run(TestResult.java:109)
        at junit.framework.TestCase.run(TestCase.java:118)
        at junit.framework.TestSuite.runTest(TestSuite.java:208)
        at junit.framework.TestSuite.run(TestSuite.java:203)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:421)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:305)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:186)


mark.schreiber at group.novartis.com wrote:
| That sounds like an old bug I thought was fixed. Are you using a recent
| snapshot? If not, you could get one from the biojava website or CVS.
| However, it may still be looking for some kind of header line, or at
| least expecting one even if it ignores it. You can quite happily spoof
| the header by just adding one.
|
| If that doesn't work can you send the stack trace for the exception?
|
| - Mark
|



