[DAS2] Notes from the weekly DAS/2 teleconference, 27 Mar 2006

Mon Mar 27 19:05:28 UTC 2006

$Id: das2-teleconf-2006-03-27.txt,v 1.1 2006/03/27 19:03:30 sac Exp $

Note taker: Steve Chervitz

Attendees: 
  Affy: Steve Chervitz, Gregg Helt
  CSHL: Lincoln Stein
  Dalke Scientific: Andrew Dalke
  UC Berkeley: Nomi Harris
  UCLA: Allen Day 

Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at
das/das2/notes/2006. Instructions on how to access this
repository are at http://biodas.org

DISCLAIMER: 
The note taker aims for completeness and accuracy, but these goals are
not always achievable, given the desire to get the notes out with a
rapid turnaround. So don't consider these notes as complete minutes
from the meeting, but rather abbreviated, summarized versions of what
was discussed. There may be errors of commission and omission.
Participants are welcome to post comments and/or corrections to these
as they see fit. 

Proposed agenda:
 * Code sprint summary
 * DAS/2 grant status
 * Writeback spec & implementation

[Notetaker: missed the first 40min - apologies]

Topic: Code sprint summary
--------------------------

gh: pleased with our progress during the last code sprint (13-17 Mar)

[Notetaker: detailed summaries of what folks did during this code sprint
are described here:
http://lists.open-bio.org/pipermail/das2/2006-March/000668.html ]

Topic: Writeback 
----------------

[Discussion in progress]

ls: in my model, every feature has a unique id. when you update it,
it's going to make the change to the object and not create a new one.
the object is associated with a url in some way. when you update the
position of this exon, it's going to change some attributes of it.

gh: thomas proposed the alternative: every time you change a feature
you create a new one with a pointer back to the old one.

ad: can't speak for what db implementers will do for versioning of
features. only talking about merging from different complex
features. So only when you merge from complex ones.

ls: this is the history tracking business. writeback will explicitly
support merges and splits.
ad: how detailed does the spec need to be?
ls: driven by requirements.
ad: what are the reqts? I can't go further without more details. roy
said every modification gets a new version, so you could do time
travel, if your db supported that.
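
A minimal Python sketch of the two update models contrasted above:
in-place update (ls) versus a new feature with a pointer back to the
old one (thomas's alternative). The Feature record, the "#v2" URI
suffix, and all function names are assumptions made for illustration;
none of them come from the DAS/2 writeback spec.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Feature:
    """Hypothetical minimal feature record, not the DAS/2 schema."""
    uri: str
    start: int
    end: int
    version: int = 1
    previous: Optional[str] = None  # URI of the feature this one supersedes


def update_in_place(store: Dict[str, Feature], uri: str,
                    start: int, end: int) -> Feature:
    """Model 1: the feature keeps its URI; only its attributes change."""
    updated = replace(store[uri], start=start, end=end)
    store[uri] = updated
    return updated


def update_as_new_version(store: Dict[str, Feature], uri: str,
                          start: int, end: int) -> Feature:
    """Model 2: each change creates a new feature pointing at the old one."""
    old = store[uri]
    base = uri.split("#", 1)[0]  # strip any earlier version suffix
    new_uri = f"{base}#v{old.version + 1}"
    new = Feature(new_uri, start, end,
                  version=old.version + 1, previous=uri)
    store[new_uri] = new  # the old record is kept, so history is walkable
    return new


def history(store: Dict[str, Feature], uri: str) -> List[str]:
    """Follow back-pointers from the given feature to its oldest ancestor."""
    chain: List[str] = []
    current: Optional[str] = uri
    while current is not None:
        chain.append(current)
        current = store[current].previous
    return chain
```

Under model 2 the store only grows, which is what would make the
"time travel" queries mentioned above possible, at the cost of
clients having to follow the chain to find the current version.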

ls: does igb or apollo explicitly support merges and splits among
transcripts?
gh: yes. curation in igb is experimental (now turned off). but it does
support these. as does apollo. so these are essential.
ls: writeback should have instructions for how a feature will adopt
children of a subfeature. one feature adopts children of the other and
the previous feature is now deprecated. there's a specific set
of operations for creating new features, renaming, splitting, and merging.
perhaps Nomi should write down which operations apollo supports.
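
To make the adopt-and-deprecate idea concrete, here is a rough Python
sketch of merge and split over a parent/child feature model. The
Feature class, the merged_into field, and the function names are
hypothetical; the actual operation set was still under discussion.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Feature:
    """Hypothetical parent feature (e.g. a transcript) with child URIs."""
    uri: str
    children: List[str] = field(default_factory=list)
    deprecated: bool = False
    merged_into: str = ""  # assumed bookkeeping field, not from the spec


def merge(store: Dict[str, Feature], keep_uri: str, absorb_uri: str) -> Feature:
    """The kept feature adopts the other's children; the other is deprecated."""
    keep, absorbed = store[keep_uri], store[absorb_uri]
    for child in absorbed.children:
        if child not in keep.children:
            keep.children.append(child)
    absorbed.children = []
    absorbed.deprecated = True
    absorbed.merged_into = keep_uri
    return keep


def split(store: Dict[str, Feature], source_uri: str, new_uri: str,
          moved_children: List[str]) -> Feature:
    """Move the named children from the source into a newly created feature."""
    source = store[source_uri]
    missing = [c for c in moved_children if c not in source.children]
    if missing:
        raise ValueError(f"not children of {source_uri}: {missing}")
    source.children = [c for c in source.children if c not in moved_children]
    new = Feature(new_uri, children=list(moved_children))
    store[new_uri] = new
    return new
```

A combined split-then-merge would simply call split and then merge on
the resulting features.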

nh: yes, all those are supported as well as things like adjusting
endpoints of start of translation.
apollo can merge transcripts within a gene and between genes
(which offers to merge the associated genes). curators can do
'splurge' - a split, merge combo.
ls: that sounds like suzi's nomenclature.

gh: the db that apollo writes back to, do changes create new versions
of feature or change the feature itself?
nh: not sure. mark did the work with chado. I know they were doing
something to rewrite the entire feature if anything changed.

[A] nomi will ask Mark to join in discussion next week (3 April).

aday: what fraction of the operations are doing simple vs complex
things? e.g., revising the gene model.
nh: revision happens a lot. mostly adjusting endpoints. splits and
merges are infrequent. adding annotation. But it doesn't matter how
infrequent the operations are, we either support them or we don't.

ad: when there are changes in the model, how does the client get
notified that the change occurred?
nh: that's tricky.
gh: this is outside the scope of the das/2 spec itself. as long as we
have locks to prevent simultaneous modification, that is
sufficient.
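
As an illustration of locking to prevent simultaneous modification,
the following Python sketch keeps one owner per feature URI and
rejects writes from anyone else. LockTable, LockError, and
write_feature are made-up names; how DAS/2 locks would actually be
requested and released is not covered in these notes.

```python
import threading
from typing import Dict, Optional


class LockError(Exception):
    """Raised when a feature is locked by a different client."""


class LockTable:
    """Hypothetical per-feature lock table held by the server."""

    def __init__(self) -> None:
        self._owners: Dict[str, str] = {}
        self._guard = threading.Lock()  # makes check-and-set atomic

    def acquire(self, uri: str, client: str) -> None:
        with self._guard:
            owner = self._owners.setdefault(uri, client)
            if owner != client:
                raise LockError(f"{uri} is locked by {owner}")

    def release(self, uri: str, client: str) -> None:
        with self._guard:
            # Only the current owner can release; others are ignored.
            if self._owners.get(uri) == client:
                del self._owners[uri]

    def holder(self, uri: str) -> Optional[str]:
        with self._guard:
            return self._owners.get(uri)


def write_feature(locks: LockTable, store: Dict[str, dict],
                  uri: str, client: str, attrs: dict) -> None:
    """Accept a write only from the client that holds the lock."""
    if locks.holder(uri) != client:
        raise LockError(f"{client} does not hold the lock on {uri}")
    store[uri] = attrs
```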

ad: there's no mechanism for polling server.
gh: yes, just requery server.
gh: but your client doesn't do it.
gh: I'm thinking of adding polling to get the last modified stuff.
For now, you can simply restart your session to see what has changed.
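
One way such last-modified polling could work is sketched below in
Python. It assumes the server keeps a revision counter and the client
remembers the last revision it saw; ChangeLog and its methods are
invented for this example and are not part of DAS/2.

```python
from typing import Dict, List, Tuple


class ChangeLog:
    """Hypothetical server-side record of when each feature last changed."""

    def __init__(self) -> None:
        self._revision = 0
        self._last_modified: Dict[str, int] = {}  # feature URI -> revision

    def record(self, uri: str) -> int:
        """Note that a feature changed; returns the new server revision."""
        self._revision += 1
        self._last_modified[uri] = self._revision
        return self._revision

    def changed_since(self, revision: int) -> Tuple[int, List[str]]:
        """Return the current revision and the URIs modified after `revision`."""
        changed = sorted(uri for uri, rev in self._last_modified.items()
                         if rev > revision)
        return self._revision, changed
```

A polling client would call changed_since with the revision from its
previous poll and re-fetch only the URIs returned.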

aday: is the portion of writeback spec for modifying endpoints, simple
add/delete of annotations stable?
ad: the general idea is unchanged.

gh: priority here is before next meeting: brian and allen read over
writeback spec and identify any issues as implementers.
aday: looking for an 80% solution. not dealing with inheritance, which
is difficult.
nh: splits and merges can be done with combos of simpler ops.

aday: performance operations will be affected. graph flattening and
partial indexes. splits and merges will affect this table, so will
have to trigger update of that table any time there's a
split/merge. this will have big impact on query performance: could be
1-2 sec for yeast, 30-60 min for human.

gh: what about if you do that update 1x/day? Then users would be
working off a snapshot that was current as of the end of previous
day. 
aday: caching on server responses will also be affected, unless we
turn caching off. maybe I can tell apache to remove a subset of cached
pages and leave others intact.

aday: for tiling requests - server could find affected blocks and
purge those, instead of purging the entire cache.
gh: you can't rely on any client to use your tiling strategy. but
could be helpful for those clients that use it.
aday: basically we'll have to turn caching off when we start doing
writeback.
gh: is there a way for server to detect what has changed?
gh: if database detects change it can flush cache for that sequence.
aday: maybe. possibly the easiest way to do this is via tiling.
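
The tile-based purge could look roughly like the Python sketch below:
map an edited range to the tiles it overlaps and drop only those
cached responses. Fixed-size tiles, 0-based half-open coordinates, and
the TileCache class are all assumptions for the example; as gh notes,
it only helps clients that actually request on tile boundaries.

```python
from typing import Dict, Optional, Tuple


def affected_tiles(start: int, end: int, tile_size: int) -> range:
    """Tile indices overlapped by the half-open interval [start, end)."""
    if end <= start:
        raise ValueError("end must be greater than start")
    return range(start // tile_size, (end - 1) // tile_size + 1)


class TileCache:
    """Hypothetical cache of server responses keyed by (sequence, tile)."""

    def __init__(self, tile_size: int) -> None:
        self.tile_size = tile_size
        self._pages: Dict[Tuple[str, int], str] = {}

    def put(self, seq: str, tile: int, body: str) -> None:
        self._pages[(seq, tile)] = body

    def get(self, seq: str, tile: int) -> Optional[str]:
        return self._pages.get((seq, tile))

    def purge_edit(self, seq: str, start: int, end: int) -> int:
        """Drop only the tiles touched by an edit; returns how many were purged."""
        purged = 0
        for tile in affected_tiles(start, end, self.tile_size):
            if self._pages.pop((seq, tile), None) is not None:
                purged += 1
        return purged
```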

gh: say you have two servers:
   1) everything that can be edited
   2) everything that has been edited (slower)
aday: main server has all features and second server handles
writeback, just writes to gff file, then cron runs once a night to
merge the gff into the db.
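
A sketch of that arrangement in Python: the writeback server appends
edits as GFF3 lines to a queue file, and a nightly job folds them into
the main store. The dict standing in for the db, the function names,
and the rule that a later edit to the same ID wins are assumptions for
illustration only.

```python
import io
from typing import Dict, TextIO


def gff3_line(seqid: str, source: str, ftype: str, start: int, end: int,
              strand: str, feature_id: str) -> str:
    """Format one tab-separated, nine-column GFF3 line (score, phase unset)."""
    return "\t".join([seqid, source, ftype, str(start), str(end),
                      ".", strand, ".", f"ID={feature_id}"])


def append_edit(edit_file: TextIO, line: str) -> None:
    """Writeback server: queue an edit without touching the main db."""
    edit_file.write(line + "\n")


def nightly_merge(db: Dict[str, dict], edit_file: TextIO) -> int:
    """Cron job: apply queued edits in order, then empty the queue."""
    edit_file.seek(0)
    applied = 0
    for raw in edit_file:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if kv)
        db[attrs["ID"]] = {"seqid": cols[0], "type": cols[2],
                           "start": int(cols[3]), "end": int(cols[4]),
                           "strand": cols[6]}
        applied += 1
    edit_file.seek(0)
    edit_file.truncate()  # queue is empty until the next edits arrive
    return applied
```

For a quick trial, an io.StringIO object can stand in for the edit
file. Users of the main server would see edits only after the next
merge, matching the once-a-day snapshot idea raised above.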

gh: separate dbs: 1) curation  2) everything that has been edited.
aday: yes. persistent flat file adapter can be used for one of them.
gh: this is the sort of detail I'm looking for w/r/t development of
the writeback spec.

[A] allen and brian look over writeback spec to discuss on 3 April.