[Biojava-l] Serialization and 1.2

Thomas Down td2@sanger.ac.uk
Wed, 23 Jan 2002 16:31:18 +0000


On Wed, Jan 23, 2002 at 04:47:14PM +1300, Schreiber, Mark wrote:
> 
> > Um, I should think so.  If it turns out that it's going to take
> > a while, we can always put it into 1.2.1 instead (aside: I'm
> > hoping that the 1.2 branch will be rather more active than 1.1
> > was -- I don't think we're going to see the same degree of core
> > interface changes which made porting bug fixes between branch-1_1
> > and the HEAD difficult).
> > 
> > As far as I can remember, the problem is not so much
> > rebuilding the AlphabetIndexes as ensuring that the serialized
> > form isn't dependent on the indexes.  Easiest way would
> > seem to be adding explicit writeObject/readObject methods,
> > then serializing as a list of (Symbol, double) tuples.
> > 
> Not quite sure I understand this, are you saying that when an object is
> serialized it should record its AlphabetIndexer as a list of Symbol,
> integer? tuples that will remake an internal copy of an AlphabetIndexer
> post serialization?

Not really -- I tend to feel that the AlphabetIndexer probably
shouldn't be recorded /at all/.  It's just a convenience
for in-memory representation of data keyed by Symbols, in much
the same way that a hash table is only a convenience for
implementing a generic Map.  If you read the serialized
form documentation for HashMap, it explicitly states that
the map is serialized as key/value pairs -- the hash structure
is not present in the serialized form.

> If this is the case then won't any class that is serializable but contains
> a reference to an AlphabetIndexer have to be tweaked individually
> to specify writeObject/readObject methods?

Yes, this is the downside of doing things this way.  That
said, I don't think the code for this would be excessive.
And it could probably be placed in AbstractDistribution.
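
To give a rough idea of how little code is involved, here is a
minimal sketch of the writeObject/readObject approach.  It is not
BioJava code: SketchDistribution, its fields, and roundTrip are
hypothetical names, and plain Strings stand in for Symbols (real
Symbols would need their own handling on deserialization).  The
index and the weights array are transient, so only (symbol, weight)
pairs reach the stream, and the index is rebuilt on reading.

```java
import java.io.*;
import java.util.*;

// Hypothetical stand-in for a Distribution; Strings stand in for Symbols.
class SketchDistribution implements Serializable {
    private static final long serialVersionUID = 1L;

    // In-memory convenience structures: never written to the stream.
    private transient Map<String, Integer> index;
    private transient double[] weights;

    SketchDistribution(List<String> symbols) {
        index = new HashMap<>();
        for (String s : symbols) {
            index.put(s, index.size());
        }
        weights = new double[index.size()];
    }

    void setWeight(String sym, double w) {
        weights[lookup(sym)] = w;
    }

    double getWeight(String sym) {
        return weights[lookup(sym)];
    }

    private int lookup(String sym) {
        Integer i = index.get(sym);
        if (i == null) {
            throw new IllegalArgumentException("Unknown symbol: " + sym);
        }
        return i;
    }

    // Serialized form: a count followed by (symbol, weight) pairs.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(index.size());
        for (Map.Entry<String, Integer> e : index.entrySet()) {
            out.writeObject(e.getKey());
            out.writeDouble(weights[e.getValue()]);
        }
    }

    // Rebuild the index from the pairs; positions may differ from the original.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int n = in.readInt();
        index = new HashMap<>();
        weights = new double[n];
        for (int i = 0; i < n; i++) {
            String sym = (String) in.readObject();
            index.put(sym, i);
            weights[i] = in.readDouble();
        }
    }

    // Helper for trying it out: serialize to a byte array and read back.
    static SketchDistribution roundTrip(SketchDistribution d) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(d);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SketchDistribution) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the reader assigns index positions itself, nothing in the
stream depends on how the writing VM happened to number the symbols.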

The alternative is making the AlphabetIndex implementations
reliably serializable.  This has two consequences:

  - Serialized distributions get tied to a specific
    AlphabetIndex implementation as well as a specific
    Distribution implementation, making them less likely
    to deserialize with future versions of the code.

  - It would no longer be possible to (within the scope of a
    VM) ensure a one-to-one mapping between Alphabets and
    AlphabetIndexes.

Neither of these would be disastrous, if you wanted to
do things this way.  It doesn't strike me as being quite
such an elegant solution though.
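
The one-to-one mapping point is easy to demonstrate with a small
sketch.  SketchIndex and IdentityDemo are hypothetical names, not
BioJava classes; the only thing shown is that default Java
serialization hands back a new object, so a VM can end up with two
index instances for what is logically the same alphabet.

```java
import java.io.*;

class IdentityDemo {
    // Hypothetical stand-in for a serializable AlphabetIndex implementation.
    static class SketchIndex implements Serializable {
        private static final long serialVersionUID = 1L;
        final String alphabetName;

        SketchIndex(String alphabetName) {
            this.alphabetName = alphabetName;
        }
    }

    // Serialize to a byte array and read the object back.
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // True only if deserialization returned the very same instance.
    static boolean sameInstanceAfterRoundTrip() {
        SketchIndex original = new SketchIndex("DNA");
        SketchIndex copy = (SketchIndex) roundTrip(original);
        return copy == original;
    }
}
```

With default serialization the copy carries the same alphabet name
but is a separate instance, so any code relying on identity to find
"the" index for an alphabet would see two of them.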

   Thomas.