[Biojava-l] persistence - and the problems with it

Gerald Loeffler Gerald.Loeffler@vienna.at
Wed, 03 May 2000 18:39:39 +0200


Thomas Down wrote:
> 
> On Wed, May 03, 2000 at 03:45:37PM +0100, Simon Brocklehurst wrote:
> > >
> > > 1) Java Serialisation is a very bad way of making objects persistent.
> >
> > Agreed!  There are just sooooooo many bad things about Java serialization...
> 
> Agreed up to a point.  It's certainly no panacea, but, to be fair,
> it works pretty well for a lot of cases where you want short-term
> persistence for data from simple programs.  (Hey, it's got me out
> of trouble plenty of times...).  I'd like to see everything in
> Java that /can/ reasonably be serialized marked as Serializable
> (for a start, that allows distributed biojava apps using RMI).

Absolutely - Java Serialisation is great for what it was intended for,
namely the painless, short-term reading/writing of (few) objects from/to
a stream - as you need to do in RMI. It's absolutely no substitute for
a database, though.
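
To illustrate the short-term use case, here is a minimal sketch of
writing one object to a stream and reading it back. SimpleSequence is a
hypothetical stand-in class, not part of BioJava, and the code uses
current Java syntax (try-with-resources) rather than what was available
in 2000.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationSketch {

    // Hypothetical stand-in for a small sequence object; not a real BioJava class.
    public static class SimpleSequence implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String name;
        public final String residues;

        public SimpleSequence(String name, String residues) {
            this.name = name;
            this.residues = residues;
        }
    }

    // Serialise one object into an in-memory byte stream.
    public static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Read the object back; the caller must know (and have on the classpath) its class.
    public static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        SimpleSequence original = new SimpleSequence("seq1", "ACGT");
        SimpleSequence copy = (SimpleSequence) fromBytes(toBytes(original));
        System.out.println(copy.name + " " + copy.residues);
    }
}
```

The round trip takes a handful of lines, but the bytes are only readable
by a JVM that has a compatible SimpleSequence class, and nothing here
offers queries or transactions.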

> 
> Of course, this isn't a reason for not developing more sophisticated
> persistence mechanisms for the cases where they're more appropriate.
> 
> > You didn't discuss using XML representations of  biojava objects.  That
> > might offer a reasonable way to allow a wide variety of types of user to
> > exploit biojava. Once you have the XML you can do what you like with it...
> 
> XML is probably my preferred method for a lot of long-term/cross-application
> persistence functions.  BioJava is already using XML a little bit:
> take a look at the XmlMarkovModel class.  I expect this will grow,
> but when there isn't an existing XML grammar which fulfils the
> requirements of a particular BioJava object, a bit of care is needed
> to create a new grammar which will be widely accepted.

Regardless of all the niceties of XML, DTDs and handling DOMs from Java,
we have to face the fact that XML is in essence just another way of
defining flat file formats - fancy, easy-to-use file formats, granted,
but flat files nevertheless. As such, an XML representation of an object
graph suffers from many of the same drawbacks that other flat file
representations suffer from (especially in contrast to a database
representation of the same object graph): no datatypes (everything is a
string); no transaction safety (isolation of access); no query
capabilities against the data; ...
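
The "everything is a string" point can be seen in a minimal sketch using
the standard DOM parser. The element and attribute names below are
hypothetical and do not correspond to any existing BioJava or GenBank
grammar.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class XmlTypesSketch {

    // Hypothetical document; "sequence", "name" and "length" are made-up names.
    public static final String XML =
            "<sequence name=\"seq1\" length=\"4\">ACGT</sequence>";

    // Parse an XML string and return its root element.
    public static Element parseRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement();
    }

    // DOM hands back the attribute as a String; treating it as an int is
    // the application's own interpretation and can fail at run time.
    public static int lengthOf(Element sequence) {
        String raw = sequence.getAttribute("length");
        return Integer.parseInt(raw);
    }

    public static void main(String[] args) throws Exception {
        Element root = parseRoot(XML);
        System.out.println(root.getAttribute("name") + " has length " + lengthOf(root));
    }
}
```

Every typed value needs a conversion like the one in lengthOf, written
and maintained by hand, whereas a database column would carry its type
with it.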

Additionally, XML representations tend to be verbose - so you need to
compress on the fly.
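
Compressing on the fly might look like the following sketch, which wraps
the output in a GZIP stream from java.util.zip. The class and method
names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class XmlGzipSketch {

    // Compress an XML string while writing it out.
    public static byte[] compress(String xml) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    // Decompress while reading, returning the original XML text.
    public static String decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical, deliberately repetitive markup to mimic verbose XML.
        StringBuilder xml = new StringBuilder("<sequence>");
        for (int i = 0; i < 200; i++) {
            xml.append("<residue symbol=\"A\"/>");
        }
        xml.append("</sequence>");

        byte[] packed = compress(xml.toString());
        System.out.println(xml.length() + " characters -> " + packed.length + " bytes");
    }
}
```

Repetitive markup like this compresses well, but the compressed file can
no longer be searched or edited without unpacking it first.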

All this makes XML IMHO a very nice vehicle for the transient, portable,
platform-neutral representation of data (e.g. for database
import/export) but makes the idea of building a datastore of objects in
XML not really much more attractive than it would be in any other flat
file format.

Oh yes: and just as it is tedious to convert Java objects to/from a
relational database representation, it is tedious to convert Java
objects to/from an XML representation...

On the other hand - the world would be a better place if we just had
GenBank in an XML representation based on a good DTD (-:

	all the best,
	gerald

> 
> Thomas.
> --
> There are whose study is of smells
> And to attentive schools rehearse
> How something mixed with something else
> Makes something worse.

-- 
   Gerald.Loeffler@vienna.at _________________ Software Architect
   http://www.imp.univie.ac.at ____ http://www.daemonstration.com
   OOA&D, Java, J2EE, JSP, Servlets, JavaBeans, ODBMS, RDBMS, XML