[Biojava-l] object persistence

Chris Mungall cjm@fruitfly.bdgp.berkeley.edu
Wed, 3 May 2000 10:48:54 -0700 (PDT)


Gerald Loeffler wrote:

> 
> Thomas Down wrote:
> > 
> > On Wed, May 03, 2000 at 03:45:37PM +0100, Simon Brocklehurst wrote:
> > > >
> > > > 1) Java Serialisation is a very bad way of making objects persistent.
> > >
> > > Agreed!  There are just sooooooo many bad things about Java serialization...
> > 
> > Agreed up to a point.  It's certainly no panacea, but, to be fair,
> > it works pretty well for a lot of cases where you want short-term
> > persistence for data from simple programs.  (Hey, it's got me out
> > of trouble plenty of times...).  I'd like to see everything in
> > Java that /can/ reasonably be serialized marked as Serializable
> > (for a start, that allows distributed biojava apps using RMI).
> 
> absolutely - Java Serialisation is great for what it was intended,
> namely the painless, short-term reading/writing of (few) objects from/to
> a stream - like you need to do in RMI. It's absolutely no substitute for
> a database, though.

It could also provide a useful stopgap during migration from, e.g.,
flat-file formats to a relational database.
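
A minimal sketch of that kind of stopgap, using only the standard
java.io serialization classes. SeqRecord and its fields are hypothetical
stand-ins for whatever a flat-file parser produces; they are not BioJava
classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializationStopgap {
    // Hypothetical record type standing in for one parsed flat-file entry.
    public static class SeqRecord implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String id;
        public final String residues;

        public SeqRecord(String id, String residues) {
            this.id = id;
            this.residues = residues;
        }
    }

    // Write the object to a byte array (a FileOutputStream would work the same way).
    public static byte[] save(SeqRecord rec) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(rec);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Read the object back; this fails if the class has changed incompatibly.
    public static SeqRecord load(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (SeqRecord) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class missing on read side", e);
        }
    }

    public static void main(String[] args) {
        SeqRecord copy = load(save(new SeqRecord("seq1", "ACGT")));
        System.out.println(copy.id + " " + copy.residues); // prints "seq1 ACGT"
    }
}
```

The bytes are tied to the class definition, which is why this only suits
the short-term case discussed above.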

> > 
> > Of course, this isn't a reason for not developing more sophisticated
> > persistence mechanisms for the cases where they're more appropriate.
> > 
> > > You didn't discuss using XML representations of  biojava objects.  That
> > > might offer a reasonable way to allow a wide variety of types of user to
> > > exploit biojava. Once you have the XML you can do what you like with it...
> > 
> > XML is probably my preferred method for a lot of long-term/cross-application
> > persistence functions.  BioJava is already using XML a little bit:
> > take a look at the XmlMarkovModel class.  I expect this will grow,
> > but when there isn't an existing XML grammar which fulfils the
> > requirements of a particular BioJava object, a bit of care is needed
> > to create a new grammar which will be widely accepted.
> 
> Regardless of all the niceties of XML, DTDs and handling DOMs from Java,
> we have to face the fact that XML is in essence just another way of
> defining flat file formats - fancy, easy-to-use file formats, granted,
> but flat files nevertheless. As such, an XML representation of an object
> graph suffers from many of the same drawbacks that other flat file
> representations suffer from (especially in contrast to a database
> representation of the same object graph): no datatypes (everything is a
> string); no transaction safety (isolation of access); no query
> capabilities against the data; ...

Actually, your last point isn't quite true: what about XQL? Of course, it
is far better to have your data in a genuine relational database and query
it using SQL.
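
To illustrate the idea of querying XML directly, here is a small sketch.
It uses XPath from the standard javax.xml.xpath package rather than XQL
itself, so treat it as a stand-in for the general approach. The element
and attribute names (annotations, gene, id, name) are made up for the
example and are not taken from any real DTD.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XmlQuerySketch {
    // Evaluate a path expression and return the text of the first matching node.
    public static String firstMatch(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad expression or malformed XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical document; the structure is an assumption for illustration.
        String xml = "<annotations>"
                + "<gene id=\"g1\"><name>white</name></gene>"
                + "<gene id=\"g2\"><name>yellow</name></gene>"
                + "</annotations>";
        // Select the name of the gene whose id attribute is g2.
        System.out.println(firstMatch(xml, "/annotations/gene[@id='g2']/name")); // prints "yellow"
    }
}
```

Note that the whole document is still parsed to answer the query, so
this gives you a query language but none of the indexing a database has.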

> Additionally, XML representations tend to be verbose - so you need to
> compress on the fly.

This can be a serious limitation, especially if you are only interested in
one piece of information buried in a huge XML document.
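
One common way to soften this is a streaming parser that stops as soon
as the wanted item is found, instead of building the whole document in
memory. A sketch using the standard javax.xml.stream (StAX) API follows;
the gene/name/id vocabulary is again hypothetical, and a real input
would come from a file or decompressing stream rather than a String.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StreamingLookup {
    // Return the <name> text of the <gene> with the given id, or null if absent.
    public static String findName(String xml, String wantedId) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                boolean inWantedGene = false;
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT) {
                        String tag = reader.getLocalName();
                        if (tag.equals("gene")) {
                            inWantedGene = wantedId.equals(reader.getAttributeValue(null, "id"));
                        } else if (inWantedGene && tag.equals("name")) {
                            return reader.getElementText(); // stop reading here
                        }
                    } else if (event == XMLStreamConstants.END_ELEMENT
                            && reader.getLocalName().equals("gene")) {
                        inWantedGene = false;
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<annotations>"
                + "<gene id=\"g1\"><name>white</name></gene>"
                + "<gene id=\"g2\"><name>yellow</name></gene>"
                + "</annotations>";
        System.out.println(findName(xml, "g2")); // prints "yellow"
    }
}
```

This keeps memory use flat, but everything before the wanted element
still has to be read and tokenised, so the underlying limitation remains.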

> All this makes XML IMHO a very nice vehicle for the transient, portable,
> platform-neutral representation of data (e.g. for database
> import/export) but makes the idea of building a datastore of objects in

This is definitely a real strength of a good XML representation.
Within FlyBase, we use GAME XML for exchanging data between Celera,
GenBank, and different databases within FlyBase.

> XML not really much more attractive than it would be in any other flat
> file format.
> 
> Oh yes: and just as it is tedious to convert Java objects to/from a
> relational database representation, it is tedious to convert Java
> objects to/from an XML representation...

Hmmm, I find this kind of thing quite fun, if done within the correct
architecture. The XML part is definitely easier than the o/r part.

Having hand-crafted my own object-relational mappings for ~50 Perl objects
onto 2 different databases, I would say there are definite advantages in
rolling your own over choosing an off-the-shelf solution. I think you need
full control over the mapping to make it as efficient and flexible as
possible. The last time I looked at the commercial o/r products out there
for Java I found them all seriously lacking, but that was over a year ago.

I'm now thinking of implementing a similar set of o/r mappings for our
Java objects (not BioJava-compliant) [the current middleware is purely
XML]. If there were an open source starting point to save us reinventing
the wheel, that would be great. A previous poster mentioned a 'broker'
architecture. This sounded very similar to the architecture I used in our
object-relational code.
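
For readers unfamiliar with the term, here is one possible reading of a
hand-written 'broker', sketched in Java. Everything in it is an
assumption made for illustration: the Broker interface, the Gene class
and the column names are invented, and rows are plain Maps so that the
example runs without a database. It is not the design from the earlier
post, nor our Perl code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BrokerSketch {
    // Hypothetical domain object; it knows nothing about storage.
    public static class Gene {
        public final String symbol;
        public final int start;
        public final int end;

        public Gene(String symbol, int start, int end) {
            this.symbol = symbol;
            this.start = start;
            this.end = end;
        }
    }

    // One broker per class owns the mapping between objects and rows.
    public interface Broker<T> {
        Map<String, Object> toRow(T object);
        T fromRow(Map<String, Object> row);
    }

    // Hand-written mapping: column names and types are chosen explicitly.
    public static class GeneBroker implements Broker<Gene> {
        public Map<String, Object> toRow(Gene gene) {
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("symbol", gene.symbol);
            row.put("seq_start", gene.start);
            row.put("seq_end", gene.end);
            return row;
        }

        public Gene fromRow(Map<String, Object> row) {
            return new Gene((String) row.get("symbol"),
                    (Integer) row.get("seq_start"),
                    (Integer) row.get("seq_end"));
        }
    }

    public static void main(String[] args) {
        GeneBroker broker = new GeneBroker();
        Map<String, Object> row = broker.toRow(new Gene("w", 100, 250));
        System.out.println(row); // prints "{symbol=w, seq_start=100, seq_end=250}"
        System.out.println(broker.fromRow(row).symbol); // prints "w"
    }
}
```

In a real system the row would be bound to a JDBC PreparedStatement or
read from a ResultSet; keeping that code inside the broker is what gives
the full control over the mapping mentioned above.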

> On the other hand - the world would be a better place if we just had
> GenBank in an XML representation based on a good DTD (-:

Well, the Drosophila genome annotations were/are being submitted to
GenBank in GAME XML format. They are using an out-of-date version of the
DTD, but still, there is a possibility that they will decide to make
their data available in this format as well.

> 	all the best,
> 	gerald
> 
> > 
> > Thomas.
> > --
> > There are whose study is of smells
> > And to attentive schools rehearse
> > How something mixed with something else
> > Makes something worse.
> 
> -- 
>    Gerald.Loeffler@vienna.at _________________ Software Architect
>    http://www.imp.univie.ac.at ____ http://www.daemonstration.com
>    OOA&D, Java, J2EE, JSP, Servlets, JavaBeans, ODBMS, RDBMS, XML