[DAS2] Notes from the weekly DAS/2 teleconference, 21 Aug 2006

Steve Chervitz Steve_Chervitz at affymetrix.com
Mon Aug 21 21:42:30 UTC 2006


$Id: das2-teleconf-2006-08-21.txt,v 1.1 2006/08/21 21:00:01 sac Exp $

Note taker: Steve Chervitz

Attendees: 
  Affy: Steve Chervitz, Gregg Helt
  CSHL: Lincoln Stein
  Dalke Scientific: Andrew Dalke
  UCLA: Allen Day, Brian O'Connor
        
Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at
das/das2/notes/2006. Instructions on how to access this
repository are at http://biodas.org

DISCLAIMER: 
The note taker aims for completeness and accuracy, but these goals are
not always achievable, given the desire to get the notes out with a
rapid turnaround. So don't consider these notes as complete minutes
from the meeting, but rather abbreviated, summarized versions of what
was discussed. There may be errors of commission and omission.
Participants are welcome to post comments and/or corrections to these
as they see fit. 

Agenda:
--------
Summarize progress during last week's code sprint and discuss the few
remaining spec issues.


Topic: Spec Discussion
----------------------

[The note taker apologizes for attending late (~30min)]

gh: could a server restrict the types listed in its types doc? e.g.,
just say 'transcripts'?
ls: yes. if the server won't allow searching for a feature directly,
only via its parent, then the types doc should only include the parent.

gh: types doc specifies which types you can query on.
ls: ontology gives you access to all types that might come back
ad: and how to depict them.
gh: yes, but it can be restrictive of the types.
ad: what does client do to display it?
gh: implies we separate out style into stylesheet info again.
no one is serving or using, so we can change w/o major impl changes.
ad: type doc ties a feature to ontology, how to display it, and
includes this extra source field.
gh: types doc has all types server contains but tags as to what the
server allows searching on.

ad: feels weird. can't see why i'd want to do in my server.
bo: better than limiting the types doc, just have a searchable field.
ad: easy
gh: if you don't say no, then it's searchable. this is backwards
compatible. 
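
A minimal sketch of the "searchable unless flagged otherwise" default
discussed above. The element name TYPE, the attribute name searchable,
and its "yes"/"no" values are assumptions made for illustration; the
notes do not fix the actual syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical types document; element and attribute names are
# illustrative only, not taken from the DAS/2 spec.
TYPES_DOC = """
<TYPES>
  <TYPE id="transcript"/>
  <TYPE id="exon" searchable="no"/>
  <TYPE id="intron" searchable="yes"/>
</TYPES>
"""


def searchable_types(xml_text):
    """Return the ids of the types a client may query on.

    A type counts as searchable unless it is explicitly flagged "no",
    so a document written without the flag behaves as before.
    """
    root = ET.fromstring(xml_text)
    return [t.get("id") for t in root.iter("TYPE")
            if t.get("searchable", "yes") != "no"]


if __name__ == "__main__":
    print(searchable_types(TYPES_DOC))
```

With this default, a types document that never mentions the flag
lists every type as searchable, which is the backwards-compatible
behavior gh describes.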

gh: other thing: for my optimization on client to work, need a hint
about whether a particular type on a server can have children outside
the bounds of the parent. or need the opposite: that all children are
guaranteed to be within bounds.
ad: can't see why this is needed.
gh: can you trust me on it?
ad: no. 
going back to the case where you ask for introns in this range and you
want to return back everything.
gh: the reason i need it: if children are outside bounds of parents,
and i do query on parent, i never know if I'll get children outside
the bounds i specified. messes up my optimization.
ad: it will give you the children.
gh: i want to optimize so that i don't have to get that back.
i want assurance that there won't be something hanging off the region
in the query.
that there won't be anything outside the range that I queried.
ls: that's always the case. a query can return some things that are
outside the region you requested. you can filter things out.
gh: i don't want server to send them.
ls: semantically correct to always send the complete object.
gh: there are optimizations on client that depend on it.
ls: this will give you back more than you want.
gh: i don't want to have to filter it out (defeats optimization).

ad: range search for id=abc. you'll get all features in feat group id=abc.
ad: modify servers
gh: not so easy for servers I don't control.
ls: you won't be able to convince worm or microbial communities which have
features with different locations, some that are in trans on different
chromosomes.
gh: blat, blast, genscan, etc. majority of algorithmic seq will meet
that condition.
ls: if you feel comfortable going thru SO and flagging all features
that meet that requirement, we can add to SO and you can use it in
your optimization.
gh: not necessary to modify SO. no blat, blast in SO
ls: yes there are: computational matches
gh: not all comp matches
ls: we do have blast, you can add blat.
ad: type ontology has extension area, you can add that.
gh: no one will live with that.

gh: will try on my server, see how it works.

[A] Gregg will try flagging types on server, see if works with client
optimizations

ls: i have to go.

gh: this will change all impls, could be trouble.
ad: why does it change server impl at all?
gh: where filter range applies only to nodes that meet the type
filter.
ls: that's the way it's in the spec now.
ad: for any filter

aday: if you match a range it's root feature that matches range, can
reduce overhead by factor of 10-20.
ls: 
aday: won't trigger range query because type doesn't match
ls: searching over range
you'd pick up exon because it's contained in the range.

ad: if your server or allen's decides to model all structures by your
logic, it won't work. there are occasions where you will have non
overlapping impls in the server.
gh: you're right.
to allen: does this affect your server impl?

gh: proposal is to clarify spec to say
that range queries apply only to the nodes of a feat group that pass
the types filter.

ad: range and non-range filters must both be true for a given feature

gh: ok, as long as we can say in types doc that some types are not
filtered.

aday:
gh: if searching for types=exon in range that's in the intron,
gh: exon 1 in feature group, if it's outside range.

aday: this is the way I've impl'd:
find things in range, see if they match, then check whether the
other filters match.
all filters operate on feature granularity, except range, which
operates on feature group granularity.
all parents are located and encompass min/max bounds of encompassed
features.

gh: you get more things passing range query, but they get stopped by
name, or type, or id query.
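
A rough sketch of the filtering logic aday describes: the range
filter is tested against the bounds of the whole feature group, and
the type filter against each feature. The Feature class, the
closed-interval overlap test, and the choice to return only the ids
of matching features are assumptions made for illustration; what the
server actually sends back is not settled in these notes.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    id: str
    type: str
    start: int
    end: int


def group_bounds(group):
    """Min/max bounds over every feature in the group."""
    return min(f.start for f in group), max(f.end for f in group)


def overlaps(a_start, a_end, b_start, b_end):
    """Closed-interval overlap test (an assumption of this sketch)."""
    return a_start <= b_end and b_start <= a_end


def query(groups, q_start, q_end, type_=None):
    """Range at feature-group granularity, type at feature granularity."""
    hits = []
    for group in groups:
        g_start, g_end = group_bounds(group)
        if not overlaps(g_start, g_end, q_start, q_end):
            continue  # the whole group lies outside the query range
        hits.extend(f.id for f in group
                    if type_ is None or f.type == type_)
    return hits


if __name__ == "__main__":
    gene = [
        Feature("tx1", "transcript", 100, 600),
        Feature("exon1", "exon", 100, 200),
        Feature("exon2", "exon", 500, 600),
    ]
    # The query range falls inside the intron between the two exons.
    print(query([gene], 300, 400, "exon"))
```

In this example both exons pass even though neither overlaps the
query range itself, because the group's bounds do. That is the case
gh raises above: more things pass the range filter, and the type
filter then decides which features remain.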

ad: i'm happy with it.
gh: i'm not but will go along.

bo: have to leave now.

[A] andrew will clarify range and type filtering logic in the spec

[A] andrew will introduce concept of feature group
(currently in spec as 'complex feat with children')

[A] andrew will add searchable flag to type document

[A] andrew will add optional circularity flag to segments document

gh: Something we need to do: come back to stylesheet issue.
ad: we should have impl in place before making spec work.

[A] Discuss stylesheets when we have an impl in place


Topic: Summarize code sprint work
----------------------------------

Focus on what people did last Friday (last day of sprint).

gh: more complete write back on client. sync data model with how
writeback is working: delete feat group, add back with change. then
hit wall where that triggers issues in how to deal with undo/redo in
client.  I then did a massive chart on wall for how to deal with, now
have a clear path forward.

ad: Here's another issue: xml:base in writeback doc and how it
interacts with extensions. server may not know extension is to be
supported in writeback doc. e.g., link to image url. if xml:base in
writeback doc, then you have to make sure the context of the extension
that may have relative urls still preserves xml:base. seems ugly.
do we say servers are free to ignore xml:base?
gh: they should preserve it.
ad: so if my writeback doc says features is
http://biodas.org. features has a different one, individual features
have different ones, and extensions has a different one. my impl would
ignore xml:base in the data. (too complex to explain...)
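
To illustrate why nested xml:base values matter for relative urls in
a writeback doc, here is a small sketch that resolves them the way
XML Base prescribes (each xml:base is resolved against the one
inherited from its ancestors). The element names, the uri and href
attributes, and the example.org urls are all made up for this
example; only the xml:base attribute itself is standard.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the predeclared XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Hypothetical writeback document with a different xml:base per level.
DOC = """
<WRITEBACK xml:base="http://example.org/das2/">
  <FEATURES xml:base="genome/">
    <FEATURE uri="feature/abc" xml:base="chr1/">
      <EXT href="img/abc.png"/>
    </FEATURE>
  </FEATURES>
</WRITEBACK>
"""


def resolve_all(elem, base="", attr_names=("uri", "href"), out=None):
    """Collect absolute urls, honoring xml:base inherited from ancestors."""
    if out is None:
        out = []
    # An element's own xml:base is resolved against the inherited base.
    base = urljoin(base, elem.get(XML_BASE, ""))
    for name in attr_names:
        if name in elem.attrib:
            out.append(urljoin(base, elem.attrib[name]))
    for child in elem:
        resolve_all(child, base, attr_names, out)
    return out


if __name__ == "__main__":
    for url in resolve_all(ET.fromstring(DOC)):
        print(url)
```

A server that drops xml:base from any level would resolve the
extension's href against the wrong base, which is the problem ad
raises for extensions the server does not itself understand.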

[A] Andrew will describe his xml:base issue with writeback and send email

ad: worked on getting search algorithm to work. came up with counter
examples re: parent element containing/not containing children.

sc: mostly worked on notes and catching up with mailing list. Some
todo items:

[A] Steve Verify with Ann about new dm2-based affy das server data.
[A] Steve Finish info page for data hosted by affy das servers.
[A] Steve Update affy das/2 server to test new binary exon data (bp2)
[A] Steve Add id to wiki page for new drosophila assembly (R5)

aday: working on getting block translation server up and
running. close. code to automatically set up caching and staling out
the blocks. getting binaries set up for on-the-fly analysis
servers. primer3, ncbi ePCR, blat, blast binaries on server. now need
to install blat/blast dbs, can start serving up analyses.

ee: [not present, but heard from Ed after meeting] - continuing work
on gff3 parser for IGB client.

[A] Next teleconf in two weeks (4 Sep 2006)

gh: we had a successful sprint, hashed out critical decisions in the
spec, got a lot of work done.

[A] Next code sprint in Healdsburg at Helt Retreat Center.
Possible date? Not until end of year or beginning of next year (lots
of construction in town).



