[Biojava-l] LightWeight vs HeavyWeight sequence objects

Thomas Down td2@sanger.ac.uk
Wed, 26 Jan 2000 09:31:09 +0000


On Wed, Jan 26, 2000 at 01:23:02AM -0600, Mike Marsh wrote:
> 
> A String is just a fancy wrapper around an array of Unicode characters
> (2 bytes per char).  The OO approach is just a wrapper around an array of
> pointers to static SequenceChar objects (8-16 bytes per pointer, depending
> on your system's architecture).  The OO approach requires one additional
> step of dereferencing the pointer.  Dereferencing is cheap:  it's simply
> pushing around integers and fetching from memory.

Indeed.  In fact, a pointer is probably only 4-8 bytes (32- or 64-bit
addresses), unless people have been building 128-bit address space
machines while I wasn't looking.  Also, the pointers won't even need
to be dereferenced all that often.  Since each kind of residue is a
single shared static object, comparing two Residue
(ProteinChar/whatever) objects can be done just by comparing the
pointers.
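
A minimal sketch of the pointer-comparison idea, in modern Java.  The
names Residue, ResidueSequence, forSymbol and count are hypothetical
and chosen only for illustration; they are not the BioJava API.  It
assumes one shared static instance per residue type, so a sequence is
an array of references and equality is a plain == check.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical flyweight residue type: exactly one instance per symbol.
final class Residue {
    // Declared before the instances so it exists when their constructors run.
    private static final Map<Character, Residue> BY_SYMBOL = new HashMap<>();

    static final Residue A = new Residue('A');
    static final Residue C = new Residue('C');
    static final Residue G = new Residue('G');
    static final Residue T = new Residue('T');

    private final char symbol;

    private Residue(char symbol) {
        this.symbol = symbol;
        BY_SYMBOL.put(symbol, this);
    }

    // Look up the shared instance for a character.
    static Residue forSymbol(char c) {
        Residue r = BY_SYMBOL.get(c);
        if (r == null) {
            throw new IllegalArgumentException("Unknown residue: " + c);
        }
        return r;
    }

    char symbol() {
        return symbol;
    }
}

// Hypothetical sequence: an array of references to the shared instances.
final class ResidueSequence {
    private final Residue[] residues;

    ResidueSequence(String text) {
        residues = new Residue[text.length()];
        for (int i = 0; i < text.length(); i++) {
            residues[i] = Residue.forSymbol(text.charAt(i));
        }
    }

    int length() {
        return residues.length;
    }

    Residue residueAt(int index) {
        return residues[index];
    }

    // Counts matches by reference comparison; no residue is dereferenced.
    int count(Residue target) {
        int n = 0;
        for (Residue r : residues) {
            if (r == target) {
                n++;
            }
        }
        return n;
    }
}
```

The private constructor is what makes == safe here: no code outside
Residue can create a second object for the same symbol.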

> PS I would like to reinforce that all of the functionality of Strings can
> easily be added to an object approach.  With appropriate wrappers,
> Sequence can "implement" a "String interface", by offering 
>   public char charAt(int index)
>   public String concat(String str)
>   public String substring(int beginIndex, int endIndex)
>   ...

Sure -- although I'd prefer to see the `first class object' and
stringy interfaces kept separate -- probably the way to go is
to use our own first class interfaces, and adopt the stringy 
interfaces from the OMG BioObjects (which should buy us easy
interoperability with other languages).

It should then be quite easy to write two `bridge' implementations,
one of which implements the OMG interface backed by a wrapped BioJava
object, and another which wraps an OMG object in the BioJava style.
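
The two bridges could be sketched roughly as follows, in modern Java.
Every name here is a hypothetical stand-in: StringySequence is not the
real OMG BioObjects interface and ObjectSequence is not the real
BioJava one.  The sketch only shows the shape of the two adapters,
assuming a simple four-letter DNA alphabet.

```java
// Hypothetical first class symbol type.
enum Base {
    A, C, G, T;

    char toChar() {
        return name().charAt(0);
    }

    static Base fromChar(char c) {
        // valueOf throws IllegalArgumentException for unknown letters.
        return Base.valueOf(String.valueOf(c));
    }
}

// Stand-in for the first class (object style) interface.
interface ObjectSequence {
    int length();
    Base baseAt(int index);
}

// Stand-in for the stringy interface.
interface StringySequence {
    int length();
    char charAt(int index);
    String substring(int beginIndex, int endIndex);
}

// Bridge 1: stringy interface backed by a wrapped object style sequence.
final class StringyView implements StringySequence {
    private final ObjectSequence delegate;

    StringyView(ObjectSequence delegate) {
        this.delegate = delegate;
    }

    public int length() {
        return delegate.length();
    }

    public char charAt(int index) {
        return delegate.baseAt(index).toChar();
    }

    public String substring(int beginIndex, int endIndex) {
        StringBuilder sb = new StringBuilder();
        for (int i = beginIndex; i < endIndex; i++) {
            sb.append(charAt(i));
        }
        return sb.toString();
    }
}

// Bridge 2: object style interface backed by a wrapped stringy sequence.
final class ObjectView implements ObjectSequence {
    private final StringySequence delegate;

    ObjectView(StringySequence delegate) {
        this.delegate = delegate;
    }

    public int length() {
        return delegate.length();
    }

    public Base baseAt(int index) {
        return Base.fromChar(delegate.charAt(index));
    }
}

// Trivial object style implementation, used only to exercise the bridges.
final class ArraySequence implements ObjectSequence {
    private final Base[] bases;

    ArraySequence(Base... bases) {
        this.bases = bases.clone();
    }

    public int length() {
        return bases.length;
    }

    public Base baseAt(int index) {
        return bases[index];
    }
}
```

Because each bridge holds only a reference to the object it wraps,
neither side has to copy the sequence data to be visible through the
other interface.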

Thomas.
-- 
``Science is magic that works''  -- Kurt Vonnegut.