[Biojava-l] Parsing GenBank files in Threads
Hoebeke Mark
Mark.Hoebeke at jouy.inra.fr
Fri Apr 2 01:41:33 EST 2004
Hi Matthew,
I just finished some further investigation, strengthening my feeling
that using SeqIOTools.readGenbank() might not be thread-safe.
The strongest evidence is that the errors appear less frequently on
uniprocessor machines than on multiprocessor ones.
As you requested, below is a snippet of the exception stack trace, with
the bio.* related part delimited by ===========. This pattern repeats
itself for different GenBank files, except for the actual value of the
corrupt(?) index.
Note that the problem goes away when the method that calls
Sequence.seqString() is declared static synchronized, but that takes all
the fun out of the pipeline ;)
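To make the workaround concrete, here is a minimal sketch of the shape it takes. The names SeqStringGuard, extract and SeqSource are hypothetical; SeqSource merely stands in for org.biojava.bio.seq.Sequence so that the snippet compiles without BioJava on the classpath. It is not the actual pipeline code.

```java
// Minimal sketch of the "static synchronized" workaround.
// All names here are hypothetical and not taken from the pipeline source.
final class SeqStringGuard {

    /** Hypothetical stand-in for org.biojava.bio.seq.Sequence; only seqString() is needed. */
    interface SeqSource {
        String seqString();
    }

    private SeqStringGuard() {
        // static helper only, no instances
    }

    /**
     * A static synchronized method locks on SeqStringGuard.class, so at most
     * one thread in the JVM can be inside seqString() through this helper
     * at any given time.
     */
    static synchronized String extract(SeqSource seq) {
        return seq.seqString();
    }
}
```

Because the lock is held on the class itself, every worker thread queues up behind a single extraction, which is why this fixes the symptom but serializes that stage of the pipeline.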
If needed, I can hand you the complete source file but I thought I'd
better not spam biojava-l with it.
Thanks for your support.
Mark
[java] org.quartz.JobExecutionException: java.lang.Exception: Unable to extract sequence from entry BA000019 [See nested exception: java.lang.Exception: Unable to extract sequence from entry BA000019]
[java] at pipeline.jobs.EntryFeeder.execute(EntryFeeder.java:241)
[java] at org.quartz.core.JobRunShell.run(JobRunShell.java:178)
[java] at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:487)
[java] * Nested Exception (Underlying Cause) ---------------
[java] java.lang.Exception: Unable to extract sequence from entry BA000019
[java] at pipeline.jobs.EntryFeeder.feedEntry(EntryFeeder.java:199)
[java] at pipeline.jobs.EntryFeeder.execute(EntryFeeder.java:234)
[java] at org.quartz.core.JobRunShell.run(JobRunShell.java:178)
[java] at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:487)
===========================================================================================================
[java] java.lang.ArrayIndexOutOfBoundsException: -17406
[java] at org.biojava.bio.symbol.PackedSymbolList.symbolAt(PackedSymbolList.java:275)
[java] at org.biojava.bio.seq.io.ChunkedSymbolListFactory$ChunkedSymbolList.symbolAt(ChunkedSymbolListFactory.java:178)
[java] at org.biojava.bio.symbol.AbstractSymbolList$SymbolIterator.next(AbstractSymbolList.java:191)
[java] at org.biojava.bio.seq.io.CharacterTokenization.tokenizeSymbolList(CharacterTokenization.java:202)
[java] at org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper.tokenizeSymbolList(AlphabetManager.java:1378)
[java] at org.biojava.bio.symbol.AbstractSymbolList.seqString(AbstractSymbolList.java:93)
[java] at org.biojava.bio.seq.impl.SimpleSequence.seqString(SimpleSequence.java:89)
=============================================================================================================
[java] at pipeline.jobs.EntryFeeder.feedEntry(EntryFeeder.java:194)
[java] at pipeline.jobs.EntryFeeder.execute(EntryFeeder.java:234)
[java] at org.quartz.core.JobRunShell.run(JobRunShell.java:178)
[java] at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:487)
[java] java.lang.Exception: Unable to extract sequence from entry AE004092
[java] at pipeline.jobs.EntryFeeder.feedEntry(EntryFeeder.java:199)
[java] at pipeline.jobs.EntryFeeder.execute(EntryFeeder.java:234)
[java] at org.quartz.core.JobRunShell.run(JobRunShell.java:178)
[java] at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:487)
[java] 2 avr. 2004 08:12:54 org.quartz.core.JobRunShell run
On Thu, 1 Apr 2004 at 21:31, Matthew Pocock wrote:
> Hi,
>
> The biojava policy on synchronization is that we try to make things safe
> if possible, but expect the user to synchronize sanely. Unfortunately,
> this is usually not documented anywhere. I could not guarantee that
> GenbankFormat is threadsafe - it would be sensible for it to be, but the
> particular implementation may not be. To help us track this, could you
> include some example stack traces of erratic behavior?
>
> Matthew
--
--------------------------Mark.Hoebeke at jouy.inra.fr----------------------
Unité Statistique & Génome Unité MIG
+33 (0)1 60 87 38 03 Tél. +33 (0)1 34 65 28 85
+33 (0)1 60 87 38 09 Fax. +33 (0)1 34 65 29 01
Tour Evry 2, 523 pl. des Terrasses INRA - Domaine de Vilvert
F - 91000 Evry F - 78352 Jouy-en-Josas CEDEX