[Biojava-l] object repository

Bradley Marshall bradmars@yahoo.com
Fri, 5 May 2000 13:48:26 -0700 (PDT)


> > Of course, this isn't a reason for not developing more sophisticated
> > persistence mechanisms for the cases where they're more appropriate.
> > 
> > > You didn't discuss using XML representations of biojava objects.  That
> > > might offer a reasonable way to allow a wide variety of types of user to
> > > exploit biojava. Once you have the XML you can do what you like with it...
> > 
> > XML is probably my preferred method for a lot of long-term/cross-application
> > persistence functions.  BioJava is already using XML a little bit:
> > take a look at the XmlMarkovModel class.  I expect this will grow,
> > but when there isn't an existing XML grammar which fulfils the
> > requirements of a particular BioJava object, a bit of care is needed
> > to create a new grammar which will be widely accepted.
> 
> Regardless of all the niceties of XML, DTDs and handling DOMs from Java,
> we have to face the fact that XML is in essence just another way of
> defining flat file formats - fancy, easy-to-use file formats, granted,
> but flat files nevertheless. As such, an XML representation of an object
> graph suffers from many of the same drawbacks that other flat file
> representations suffer from (especially in contrast to a database
> representation of the same object graph): no datatypes (everything is a
> string); no transaction safety (isolation of access); no query
> capabilities against the data; ...

Actually, I'm working on some code which should, in time, be able
to store an XML document in a relational database and provide
transaction safety and queryability.  It remains to be seen what
the performance will be like, but I think it'll be OK.  Initially
I'm writing it in Python, but it works by sending XML documents to
a URL, so really it's language independent.  It could easily be
ported to Java, however, and be given an API.
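
To make the idea concrete, here is a minimal sketch of one way to
"shred" an XML document into relational tables so that it can be
queried with SQL and written inside a transaction.  It is not the
code described above: the table layout and the names open_store,
store_document and find_text are assumptions made up for this
illustration, and SQLite stands in for whatever database is used.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical layout: one row per element, one row per attribute.
SCHEMA = """
CREATE TABLE node (
    id     INTEGER PRIMARY KEY,
    doc    TEXT NOT NULL,
    parent INTEGER REFERENCES node(id),
    tag    TEXT NOT NULL,
    text   TEXT
);
CREATE TABLE attr (
    node  INTEGER NOT NULL REFERENCES node(id),
    name  TEXT NOT NULL,
    value TEXT
);
"""


def open_store():
    """Create an in-memory database holding the two tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def _store_element(conn, doc_name, parent_id, elem):
    # Element text is stored stripped; tail text (mixed content)
    # is ignored in this sketch.
    cur = conn.execute(
        "INSERT INTO node (doc, parent, tag, text) VALUES (?, ?, ?, ?)",
        (doc_name, parent_id, elem.tag, (elem.text or "").strip()),
    )
    node_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO attr (node, name, value) VALUES (?, ?, ?)",
        [(node_id, k, v) for k, v in elem.attrib.items()],
    )
    for child in elem:
        _store_element(conn, doc_name, node_id, child)


def store_document(conn, doc_name, xml_text):
    """Parse xml_text and store every element in one transaction."""
    root = ET.fromstring(xml_text)  # malformed XML fails before any write
    with conn:  # commits on success, rolls back if an insert raises
        _store_element(conn, doc_name, None, root)


def find_text(conn, tag, attr_name, attr_value):
    """Return the text of every <tag> element with the given attribute."""
    rows = conn.execute(
        "SELECT n.text FROM node n JOIN attr a ON a.node = n.id "
        "WHERE n.tag = ? AND a.name = ? AND a.value = ? ORDER BY n.id",
        (tag, attr_name, attr_value),
    )
    return [r[0] for r in rows]
```

With this layout, a call such as find_text(conn, "seq", "id", "s2")
is an ordinary indexed SQL lookup, so the query never has to parse
the original document again.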

> Additionally, XML representations tend to be verbose - so you need to
> compress on the fly.

> This can be a serious limitation, especially if you are only interested in
> one piece of information buried in a huge xml document.

I think it'll be easy to grab document fragments (in the database
tool I described above, that is).  There's no need to pull up an
entire XML document to get a chunk :)
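
As a rough illustration of fragment retrieval, the sketch below
stores each element as a row with a pointer to its parent, then
rebuilds only the subtree under one chosen row.  The table layout
and the function names load, find_ids and fetch_fragment are
hypothetical, attributes are left out to keep it short, and SQLite
is used only because it ships with Python.

```python
import sqlite3
import xml.etree.ElementTree as ET


def load(conn, xml_text):
    """Store every element of xml_text as one row (assumed layout)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node ("
        "id INTEGER PRIMARY KEY, parent INTEGER, tag TEXT, text TEXT)"
    )

    def walk(elem, parent_id):
        cur = conn.execute(
            "INSERT INTO node (parent, tag, text) VALUES (?, ?, ?)",
            (parent_id, elem.tag, elem.text),
        )
        node_id = cur.lastrowid
        for child in elem:
            walk(child, node_id)

    with conn:  # one transaction for the whole document
        walk(ET.fromstring(xml_text), None)


def find_ids(conn, tag):
    """Return the row ids of all elements with the given tag."""
    rows = conn.execute(
        "SELECT id FROM node WHERE tag = ? ORDER BY id", (tag,)
    )
    return [r[0] for r in rows]


def fetch_fragment(conn, node_id):
    """Rebuild only the subtree rooted at node_id."""
    tag, text = conn.execute(
        "SELECT tag, text FROM node WHERE id = ?", (node_id,)
    ).fetchone()
    elem = ET.Element(tag)
    elem.text = text
    # Ids are assigned in document order, so ORDER BY id keeps
    # siblings in their original sequence.
    children = conn.execute(
        "SELECT id FROM node WHERE parent = ? ORDER BY id", (node_id,)
    ).fetchall()
    for (child_id,) in children:
        elem.append(fetch_fragment(conn, child_id))
    return elem
```

Only the rows below the requested node are read, so the cost of
fetching one entry depends on the size of that entry rather than
on the size of the whole document.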

> All this makes XML IMHO a very nice vehicle for the transient, portable,
> platform-neutral representation of data (e.g. for database
> import/export) but makes the idea of building a datastore of objects in
> XML not really much more attractive than it would be in any other flat
> file format.

I don't agree with that.  While an XML database approach isn't as
nice as a direct database translation, it is certainly superior to
other flat formats in terms of queryability and granularity.

One last point in XML's favor is that XSL is a very nice way to
convert between XML formats, without even programming, so that
changes in schemas/data structures can be handled easily.

Brad

PS  I don't know if this has been mentioned, but there are at
least two open source pure Java ODBMSs.  One is at www.jdbms.org
and the other is www.ozone-db.org.  Ozone also has some built-in
XML support, although the last I knew it was still pretty slow
(months ago).  These might be useful for small-to-medium sized
data repositories.  They're very easy to use.
