[Biojava-l] Lazy instantiation and the Sequence interface

Thomas Down td2@sanger.ac.uk
Sun, 24 Jun 2001 12:48:02 +0100


On Sat, Jun 23, 2001 at 08:57:30PM +0100, David Huen wrote:
> 
> Since with lazy instantiation, I do not actually complete instantiation of
> the delegate Sequence objects, it is possible that exceptions may arise
> when methods in the Sequence interface are eventually invoked (e.g. the
> sequence could not be created).

I know the problems :(.  I've been here when doing biojava-ensembl
(very lazy about a lot of things, and also has possible transient-
failure modes if there's a brief glitch in the connection to
the database -- these can usually be recovered from if you retry
later).  Also in the DAS client.

> However, many of the methods do not appear to support Exceptions, presumably
> because there was no prospect of that arising with the current
> implementations.

Scratch the `current implementations' -- it's just the really
naive in-memory implementations we had two years ago which were
okay...

You live and learn...

> I can see 3 possibilities:-
> 1) I fully validate all input data by creating all data objects at
> instantiation of a Ragbag tree.  This could consume huge amounts of
> resources for a large assembly and would be pointless since the cache
> behaviour will result in many of these objects being destroyed almost
> immediately.  Instantiation of a large tree could also take a very long
> time.

Also, neglects transient failures of the datastore.  We all know
that these happen...

> 2) I trap all exceptions in my methods and provide a query method to find
> out if an exception did occur.  Failure to check could mean apps using
> Ragbag as a Sequence could crash and burn. Use of Ragbag will then not be
> transparent.

I've seen this pattern used in a few places.  My personal feeling
is that it's rather ugly compared to simple use of exceptions.
But it does work okay, so long as it's implemented carefully.
Anyone else in list-land have any feelings on this?
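
For anyone who hasn't met this pattern, option 2 might look roughly
like the sketch below.  SimpleSequence and TrappingLazySequence are
made-up names for illustration only (they are not BioJava classes),
and the real Sequence interface has many more methods than the single
length() shown here.

```java
import java.util.concurrent.Callable;

// Hypothetical, much-reduced stand-in for a Sequence-like interface.
interface SimpleSequence {
    int length();
}

// Lazy wrapper that traps failures instead of throwing them.
final class TrappingLazySequence implements SimpleSequence {
    private final Callable<SimpleSequence> loader; // may fail, e.g. datastore glitch
    private SimpleSequence delegate;               // stays null until a load succeeds
    private Exception lastFailure;                 // stays null until something fails

    TrappingLazySequence(Callable<SimpleSequence> loader) {
        this.loader = loader;
    }

    @Override
    public int length() {
        try {
            if (delegate == null) {
                delegate = loader.call();
            }
            lastFailure = null;
            return delegate.length();
        } catch (Exception e) {
            lastFailure = e;
            return 0; // placeholder only; callers must check hasFailed()
        }
    }

    // The "query method": callers have to remember to ask.
    boolean hasFailed() {
        return lastFailure != null;
    }

    Exception getLastFailure() {
        return lastFailure;
    }
}
```

The weak spot is visible in length(): a caller that forgets to call
hasFailed() silently gets 0 back, which is the "crash and burn" risk
David describes.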

> 3) change the Sequence interface to add "throws BioException" to all
> methods. Sequence is widely used thru' out Biojava. Existing software
> depends on this interface and will not take kindly to such a change.  It
> will break compilation and doesn't seem something wise to do prior to a
> stable release.  I could hold off the new Ragbag stuff till after the
> stable release and fix this then.

If we were designing Sequence today, we wouldn't fall for this
one again...  (But that's true of all software design *sigh*).

What's currently happening in biojava-ensembl is that runtime
errors are causing BioErrors to be thrown, since that was what
was available.  I think this is wrong, since it means you actually
get code which catches Errors (which is missing the point of
Errors).  One other possibility would be:

4) Define BioRuntimeException.  Document that all the problematic
methods on Sequence (and friends) can throw this (explicitly list
it in the Javadoc).  The advantage of this is that it won't actually
break any existing code.  The downside is that it will make well-written
future code even more complex in exception handling (you'll easily
get blocks which can throw both BioException and BioRuntimeException.
This is when you wish for "public class BioRuntimeException extends
BioException, RuntimeException"....)
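
A rough sketch of how option 4 could look.  Only the name
BioRuntimeException comes from the proposal above; its constructor,
the cut-down MiniSequence interface and the LazySequence wrapper are
all assumptions made up for illustration, not existing BioJava code.

```java
import java.util.concurrent.Callable;

// Hypothetical unchecked exception.  Because it extends RuntimeException,
// existing callers compile unchanged.
class BioRuntimeException extends RuntimeException {
    BioRuntimeException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical, much-reduced stand-in for a Sequence-like interface.
interface MiniSequence {
    /** @throws BioRuntimeException if the underlying data cannot be loaded */
    int length();
}

// Lazy wrapper: the delegate is only loaded when a method is first called.
final class LazySequence implements MiniSequence {
    private final Callable<MiniSequence> loader; // may fail, e.g. database glitch
    private MiniSequence delegate;               // stays null until a load succeeds

    LazySequence(Callable<MiniSequence> loader) {
        this.loader = loader;
    }

    @Override
    public int length() {
        return resolve().length();
    }

    private MiniSequence resolve() {
        if (delegate == null) {
            try {
                delegate = loader.call();
            } catch (Exception e) {
                // delegate is left null, so a later call retries the load
                throw new BioRuntimeException("Could not load sequence", e);
            }
        }
        return delegate;
    }
}
```

Since a failed load leaves the delegate unset, a caller which catches
BioRuntimeException can simply try again later, which suits the
transient-failure case.  A caller which doesn't catch it just sees
the exception propagate.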



Anyway, since you're asking for opinions, my preferences are 3
or 4 (in that order).  I'd really like to fix the Sequence interface
(and a few others in that order), but it's going to make a significant
amount of work for people.  So some serious discussion is in order
here -- this is an issue which is just going to keep coming back
until somebody fixes it...

Thinking back to the roadmap I proposed a few days ago:

  BioJava 1.2 (coming soon):

    Add BioRuntimeException.  Encourage new code to catch this.
    Applications which don't may just bomb out.  However, programs
    which /just/ use in-memory objects (and are therefore okay at
    the moment) in practice won't be hit by this, so it's not actually
    an incompatible change.  Proper support for the runtime exceptions
    can be phased in for new code.

  BioJava 2.0 (end of this year, early next)

    We're already targeting things like universal identifiers
    and universal queryability to this release, so interfaces WILL
    change if these get landed.  This means it's potentially a lower-
    impact point to roll some of the core interfaces (like Sequence)
    over to throwing checked exceptions.  We'll have had a chance to
    see exactly how well the BioRuntimeException model is working
    out, and make an informed decision on this.

I think it would be a shame to see Ragbag postponed because of
this, especially since this issue affects other code (including
code which is already in the project).


Anyone else have opinions on this?

   Thomas.