[Biojava-dev] Why BJ3 should be multithreaded

Mark Schreiber markjschreiber at gmail.com
Wed Apr 9 13:12:52 UTC 2008


> > Personally though I'd still err on the side of caution WRT multi-threading and not to have it as part of the default tools but as an Object I can instantiate to do my multi-threading work (so it's a choice at the user's level rather than the framework level). Then using the Java5 executor framework we let users submit work to pools of threads to do their work. Couple this with forcing us to pass around immutable messages between threads/callables (since values shared by threads are probably the number one cause of **** ups) you'll have one heck of a kick-ass scalable framework ;-)
> >
> > Andy


One area where you could get an interesting mixture of stateless and
synchronized access to a mutable object would be threaded parsing of
large sequence files.  In my experience the BioJava parsers are not
normally I/O bound, due to all the object building they do.  Given
this, a file reader could, for example, read a feature block and hand
it off to a threaded, stateless feature handler which produces a
Feature object and then adds it (synchronized) to the BioJava Sequence
that is being built.  As long as I/O is not the limiting factor, you
would get improved parsing performance.  It would also be a case where
the threading should happen internally, as it could be pretty hard to
coordinate the process from the outside.
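
A minimal sketch of that hand-off, using the Java 5 executor framework
mentioned in the quoted text.  SketchFeature, SketchSequence and
parseBlock are hypothetical stand-ins invented for illustration; they
are not the BioJava Feature or Sequence types, and the "type location"
block format is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a feature; NOT the BioJava Feature interface.
final class SketchFeature {
    final String type;
    final String location;

    SketchFeature(String type, String location) {
        this.type = type;
        this.location = location;
    }
}

// Hypothetical stand-in for the sequence being built.
class SketchSequence {
    private final List<SketchFeature> features = new ArrayList<>();

    // All access to the shared list goes through synchronized methods.
    synchronized void addFeature(SketchFeature f) { features.add(f); }

    synchronized int featureCount() { return features.size(); }
}

class ThreadedFeatureParsing {
    // Stateless handler: the result depends only on the argument.
    // Assumed block format: "<type> <location>".
    static SketchFeature parseBlock(String block) {
        String[] parts = block.trim().split("\\s+", 2);
        return new SketchFeature(parts[0], parts.length > 1 ? parts[1] : "");
    }

    // The calling thread plays the file reader; the pool builds the objects.
    static SketchSequence parse(List<String> featureBlocks, int threads) {
        SketchSequence seq = new SketchSequence();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String block : featureBlocks) {
            pool.execute(() -> seq.addFeature(parseBlock(block)));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("parsing did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seq;
    }

    public static void main(String[] args) {
        List<String> blocks = List.of("CDS 1..99", "gene 1..120", "exon 5..50");
        System.out.println(parse(blocks, 4).featureCount() + " features parsed");
    }
}
```

Note that in this sketch features are added in completion order, not
file order, so a real parser would need to sort them or carry an index
if the original order matters.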

This also highlights the difference between encapsulation and
immutability.  Even if access to variables is controlled by
package-private and protected setters, the class is still mutable
(just not by the user).  Immutability can only be achieved by not
providing any setter methods, which has obvious and severe
limitations.  Currently BioJava Sequence objects have restricted
mutability (through the use of Edit objects) but are certainly not
immutable.
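
To make the distinction concrete, here is a small sketch.  Both classes
are hypothetical and exist only for illustration; neither is a BioJava
type.  The withName method is one common way around the "no setters"
limitation, not something the BioJava API is claimed to offer.

```java
// Encapsulated but still mutable: users outside the package cannot call
// the setter, yet code inside the package can change the state later.
class EncapsulatedName {
    private String name;

    EncapsulatedName(String name) { this.name = name; }

    public String getName() { return name; }

    // Package-private setter: hidden from users, but the object can still change.
    void setName(String name) { this.name = name; }
}

// Immutable: final class, final field, no setters at all.
final class ImmutableName {
    private final String name;

    ImmutableName(String name) { this.name = name; }

    public String getName() { return name; }

    // A "change" produces a new object and leaves this one untouched.
    ImmutableName withName(String newName) { return new ImmutableName(newName); }
}

class EncapsulationVsImmutability {
    public static void main(String[] args) {
        EncapsulatedName e = new EncapsulatedName("seqA");
        e.setName("seqB"); // legal here because we are in the same package
        System.out.println("encapsulated: " + e.getName());

        ImmutableName a = new ImmutableName("seqA");
        ImmutableName b = a.withName("seqB");
        System.out.println("immutable: " + a.getName() + " and " + b.getName());
    }
}
```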

Again, messages need not be immutable as long as they have appropriate
locks and/or synchronized getters and setters.  Many Java frameworks
work best when messages or DTOs are beans (with parameterless
constructors and public getters and setters), so being able to use
these is often very desirable.  These beans can still be thread-safe
if you code them right.
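
As a rough sketch of such a bean: FeatureBean and its fields are
hypothetical, made up for this example, and the shift method is an
assumed compound operation added to show why a single lock matters.

```java
// Hypothetical DTO for illustration; not a BioJava class.
class FeatureBean {
    private String type;
    private int start;
    private int end;

    // Parameterless constructor, as bean-based frameworks expect.
    public FeatureBean() { }

    // Every accessor locks on the same monitor (this).
    public synchronized String getType() { return type; }
    public synchronized void setType(String type) { this.type = type; }
    public synchronized int getStart() { return start; }
    public synchronized void setStart(int start) { this.start = start; }
    public synchronized int getEnd() { return end; }
    public synchronized void setEnd(int end) { this.end = end; }

    // Compound update under the same lock, so both fields move together.
    public synchronized void shift(int offset) {
        start += offset;
        end += offset;
    }
}

class ThreadSafeBeanDemo {
    // Several threads shift the same bean; no update is lost.
    static FeatureBean shiftConcurrently(int threads, int shiftsPerThread) {
        FeatureBean bean = new FeatureBean();
        bean.setType("CDS");
        bean.setStart(1);
        bean.setEnd(100);

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < shiftsPerThread; j++) {
                    bean.shift(1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return bean;
    }

    public static void main(String[] args) {
        FeatureBean bean = shiftConcurrently(4, 1000);
        // prints "CDS 4001..4100"
        System.out.println(bean.getType() + " " + bean.getStart() + ".." + bean.getEnd());
    }
}
```

One caveat with this style: each getter is atomic on its own, but a
caller that reads getStart() and then getEnd() can still see values
from two different updates, so anything that must be consistent across
fields needs its own synchronized method, as shift does here.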

- Mark


