[Biojava-l] BioSQL observations

Simon Brocklehurst simon.brocklehurst@CambridgeAntibody.com
Thu, 14 Mar 2002 14:37:46 +0000


Matthew Pocock wrote:

> If editing is needed, then you really need to maintain version
> information to allow rollbacks, resolution of clashes between different
> edits, keeping views current and the rest. My first stab would be on the
> unique key for features being an (ID,version) tuple where versions are
> monotonically increasing integers, or a timestamp. (Do we need branching
> a la CVS?) Things should link through tables by ID, but the fetch query
> should usually select id,max(version) - we would need nested queries to
> do this efficiently. Alternatively, you could have each new feature
> (including all features made by editing old features) have unique IDs,
> and store the edit history in a separate edits table (things like,
> feature A was modified by user X to become feature B). Would these
> approaches kill performance? Is this another case where the data model
> needs to be specified by something more loosely bound than object
> models, adaptor code or table definitions? My brain is melting.

This is really rather tricky to do properly.  I think where you always end
up with this kind of thing is needing versioning not only for objects, but
also for "open containment" groups of related objects, e.g. collections.

If this is the kind of thing you want to do, it may be worth looking at
WebDAV, and the development work going on in the IETF Delta-V Working
Group.  See http://www.webdav.org/deltav/

It might give you some ideas, at the very least.

With regard to performance, I'd say the motto "first make it work, then make
it fast" might apply.  Versioning over objects and advanced collections is
******** hard!

S.
--
Simon M. Brocklehurst, Ph.D.
Head of Bioinformatics & Advanced IS
Cambridge Antibody Technology
The Science Park, Melbourn, Cambridgeshire, UK
http://www.CambridgeAntibody.com/
mailto:simon.brocklehurst@CambridgeAntibody.com