[DAS2] Notes from the weekly DAS/2 teleconference, 4 Dec 2006

Steve Chervitz Steve_Chervitz at affymetrix.com
Mon Dec 4 18:50:16 UTC 2006

Notes from the weekly DAS/2 teleconference, 4 Dec 2006

$Id: das2-teleconf-2006-12-04.txt,v 1.1 2006/12/04 18:47:10 sac Exp $

Teleconference Info:
   * Schedule:         Biweekly on Monday
   * Time of Day:      9:30 AM PST, 17:30 GMT
   * Dialin (US):      800-531-3250
   * Dialin (Intl):    303-928-2693
   * Toll-free UK:     08 00 40 49 467
   * Toll-free France: 08 00 907 839
   * Conference ID:    2879055
   * Passcode:         1365

Note taker: Steve Chervitz

  Affy: Steve Chervitz, Gregg Helt
  CSHL: Lincoln Stein
  Dalke Scientific: Andrew Dalke

Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at
das/das2/notes/2006. Instructions on how to access this
repository are at http://biodas.org

The note taker aims for completeness and accuracy, but these goals are
not always achievable, given the desire to get the notes out with a
rapid turnaround. So don't consider these notes as complete minutes
from the meeting, but rather abbreviated, summarized versions of what
was discussed. There may be errors of commission and omission.
Participants are welcome to post comments and/or corrections to these
as they see fit. 

 * HTML retrieval spec discussion
 * Status reports

Topic: HTML spec finalization

gh: has everyone had a chance to check out the revised html version of
the retrieval spec since steve's changes?

ad: looks clean
sc: still some XXX comments here and there.

[A] Gregg will add more alignment examples to html get spec, cigar string

[A] All will take a final look at html get spec, paying attention to XXX

gh: need to spend another day editing. It's a good sign that no one has felt
the need to change anything.

gh: now we have other docs to do: writeback, stylesheets, etc.

gh: finished the funding note for cshl and sent it to lincoln.

ls: allen will be able to start again in a few weeks.  cannot make any
obligations to people now unless I can show there is money for it. had
to ask him to stop working immediately.

Topic: status reports

ls: (re brian gilman, hapmap). Brian submitted a das2 plugin for
caCORE and a patch to the NCICB to allow caCORE to use his plugin. has
been problematic b/c they wanted it in time for their release, and
brian could not get their system to build for about 1 month. got code
in by our deadline, but not in time for their release. uncertain when
NCI will do a point release to bring this code in. people need to d/l
the NCI source code, apply the diff, and re-compile it, which is not
trivial as their build system is quite complex. so the code is there
in principle, but is it useable in practice? now he's working on das2
servers for hapmap and vert promoter db. has the data, using allen's
biopackages server, data should come up soon.

I suspect they will reject what he did. he sent them uml docs and
names of external libs, then started working on code, then chief s/w
guy at NCI said they wrote the plugin layer based on brian's
docs. once brian got the thing to compile, he realized it didn't
work. so it's been tough working with NCI s/w devs, they are annoyed
at us given our delay. doesn't impede das2 sources, but impedes the
ability of this highly visible toolkit to use das2.

gh: anything we can do to encourage?

ls: we'll see in a few days the reaction to brian's work. complication
- caBIG coordinator has left, new guy in place. possibly a note from
Tom Gingeras would help.
gh: definitely. The primary way to look at affy tiling data is via
IGB. It's important to be able to view hapmap data within IGB.

ls: The other way around is important as well: for the core caBIG to have
access to tiling data, they need the das2 client layer.
gh: can get something from tom on that too. It's on the agenda for the
affy server to serve up tiling data eventually.

[A] Gregg get letter from Tom Gingeras to support das2 in NCI caCORE

ls: update on perl das2 client - still where I left it after last code
sprint. needs 3-4 days of work. will go higher in priority when hapmap
and vert prom db are up, for access to that data.

gh: new IGB release went out on 11/27, over thanksgiving break (Ed E. and I).
Includes das/2 fixes and some new things: using das/2 to pull in
data for affy chips.
Some background: the tool to generate results for affy expr and exon
chips is 'expression console', which generates results in CHP format,
which IGB can read. problem is that it has no genomic locations, just
probe set ids and p-values from experiments. So now, when IGB loads a
chip file it finds the matching coord data via the netaffx das server,
merges based on ids, to show results as heat maps or
graphs. integrates nicely with das/2 client code in IGB. runs
through this optimizer, doesn't reload the data for that session. can
cache data for whole chromosome on your machine. uses alt file formats
to retrieve in an optimized binary format. lazy loading, only for the
chrm you are looking at. pretty happy as a good use of das/2
completely behind the scenes.
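
The id-based merge described above could look roughly like the
following sketch. It is not IGB's code: the class, the field names, and
the in-memory maps standing in for a parsed CHP file and for the
coordinates returned by the das server are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch only: joins chip results to genomic locations by probe set id. */
final class ChipMergeSketch {

    /** Hypothetical genomic location of one probe set. */
    static final class Location {
        final String chromosome;
        final int start;
        final int end;

        Location(String chromosome, int start, int end) {
            this.chromosome = chromosome;
            this.start = start;
            this.end = end;
        }
    }

    /** Hypothetical merged row: a location plus the p-value from the experiment. */
    static final class PlacedResult {
        final String probeSetId;
        final Location location;
        final double pValue;

        PlacedResult(String probeSetId, Location location, double pValue) {
            this.probeSetId = probeSetId;
            this.location = location;
            this.pValue = pValue;
        }
    }

    /**
     * Keeps every result whose probe set id has a known location.
     * Ids with no location are skipped, since they cannot be drawn on the genome.
     */
    static List<PlacedResult> merge(Map<String, Double> pValuesById,
                                    Map<String, Location> locationsById) {
        List<PlacedResult> merged = new ArrayList<>();
        for (Map.Entry<String, Double> entry : pValuesById.entrySet()) {
            Location location = locationsById.get(entry.getKey());
            if (location != null) {
                merged.add(new PlacedResult(entry.getKey(), location, entry.getValue()));
            }
        }
        return merged;
    }
}
```

With lazy loading, locationsById would hold only the chromosome
currently in view, so the same merge can be re-run as the user moves
to another chromosome.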

gh: update broke the caching system that IGB is using, where data
retrieved via urls is stored on local hard drive. now file names are
too long using full type/segment uri's in the das/2 queries. so my
url -> filename conversion got too long. using shortened versions.
works for netaffx server but not with biopackages server. working on a
fix, coming soon.

sc: are there java libraries to create a md5 checksum on the full long
gh: maybe, or I may have a way to map filename to integers. need to
investigate possible strategies.
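
For reference, the checksum idea can be sketched with the standard
java.security.MessageDigest class. This is only an illustration of one
of the possible strategies, not the fix that went into IGB: the class
name, method name, and ".cache" suffix are hypothetical. MD5 is used
here purely to get a short fixed-length name, not for security.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Sketch only: maps an arbitrarily long query uri to a short cache file name. */
final class CacheFileNames {

    /** Returns 32 hex characters (the MD5 of the uri) plus a hypothetical ".cache" suffix. */
    static String forQueryUri(String queryUri) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(queryUri.getBytes(StandardCharsets.UTF_8));
            // Treat the 16 digest bytes as an unsigned number and zero-pad to 32 hex digits.
            return String.format("%032x", new BigInteger(1, digest)) + ".cache";
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5, so this should not happen.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

The file name length no longer depends on the length of the type and
segment uri's. The trade-off is that the names are no longer human
readable, so a separate index from uri to file name may be wanted for
debugging.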

gh: also did some fixes on the das2 server.

sc: updated the affy das servers to include the latest rat genome
assembly release (ucsc rn4, Nov 2004). added to our das/1, das/2,
and quickload servers. Added probe/probeset data for all exon arrays
to das/2 server. Fixed a bug in the exon array names to permit gregg's
genome location lookup tool to work.

gh: we need to map the chip type name (which we have no control over)
to the 'type' name in das2.

sc: affy das/2 server has just a subset of the genomes and
annotations available via quickload. das/1 server has support for
3' IVT array design data and exon arrays, not all arrays or genome
versions supported due to memory limitation on the machine.

gh: I'm hoping to get signoff for the new hardware order on wed.
A quad opteron 32g expandable to 64g, should be nice.

sc: also replied to brian osborne on discussion list re: some das1 vs
2 issues. Seems like a good candidate for a faq item. We should set up
a faq, ideally on the wikified version of biodas.org. No progress on
the wikification project. Need to poke open-bio.org admins again.

[A] steve will set up faq on biodas.org
[A] steve will look into wikification of biodas.org

ad: working on proxy for translating das2 type queries into das1-style
queries. on servers that are on andreas' registry. asked andreas about
issues from various servers that appear to be not working.

gh: can andreas detect non-operating servers via automatic server
ad: the segments doc on das1 text id is gene id 'located on chromosome
5' so a long string for segment id. valid xml but requires human to

ad: also taking people's das1 modifications, using my handwritten code
to apply their extensions, e.g., for ontologies. figuring out how to make
their adaptation work nicely. mostly just saying, "there was extra data,
you figure out how to use it."

ad: code for proxy is in dasypus sourceforge CVS. In two parts: manual
part goes to registry, updates local db. other part does proxying
of das1 system. haven't documented how that works.

[A] Andrew document das/1 proxy system in the faq (when faq is ready)

[A] Next meeting in two weeks (18 Dec 2006)
