[DAS2] Notes from the biweekly DAS/2 teleconference, 14 May 2007

Fri Jun 8 17:42:07 UTC 2007

$Id: das2-teleconf-2007-05-14.txt,v 1.1 2007/06/08 17:41:21 sac Exp $

Teleconference Info:
   * Schedule:         Biweekly on Monday
   * Time of Day:      9:30 AM PST, 17:30 GMT
   * Dialin (US):      800-531-3250
   * Dialin (Intl):    303-928-2693
   * Toll-free UK:     08 00 40 49 467
   * Toll-free France: 08 00 907 839
   * Conference ID:    2879055
   * Passcode:         1365

Attendees: 
    Affy: Steve Chervitz, Gregg Helt

Note taker: Steve Chervitz

Action items are flagged with '[A]'.

The teleconference schedule and links to past minutes are now
available from the Community Portal section of the biodas.org site:
http://www.biodas.org/wiki/BioDAS:Community_Portal

Meeting notes are checked into the biodas.org CVS repository at
das/das2/notes/. Instructions on how to access the DAS/2 CVS
repository are at http://www.biodas.org/wiki/DAS/2#CVS_Access

DISCLAIMER: 
The note taker aims for completeness and accuracy, but these goals are
not always achievable, given the desire to get the notes out with a
rapid turnaround. So don't consider these notes as complete minutes
from the meeting, but rather abbreviated, summarized versions of what
was discussed. There may be errors of commission and omission.
Participants are welcome to post comments and/or corrections to these
as they see fit. 

Agenda
-------
 * General discussion
 * Status updates

Topic: Status updates
---------------------

sc: Finished configuring the new machine that will host the Affy
public DAS servers. Migrated data from current box, set up local
mirrors for affy data and public data on internal machines to enable
easy syncing with the box when it moves to the colo.

Need to investigate potential issues with the DAS server and
affymetrix.com traffic. There has been a spate of dropped downloads
lately (e.g., Netaffx annotation files). If DAS traffic has been high,
this could be a possible source of trouble.
[Note added post facto: DAS traffic was not the issue. It was a
client-side problem.]

Now that we have memory, we can add support for other arrays and
organisms that have UCSC genome assemblies (cow, dog, chicken, rhesus,
zebrafish, anopheles).

[A] Steve add support on Affy das server for other affy arrays and orgs with
UCSC genomes

Have also been considering how to support organisms with non-UCSC
hosted assembled genomes for which we have arrays (e.g., arabidopsis,
rice, poplar) and for other orgs like drosophila where we use an
assembly not supported by UCSC (drosophila flybase R5.x, for example).

Possible plan: We can generate psl files for the annotations we
have and then convert them into a format igb can read (bps).
The transcriptome group has interest in this for the mod-encode work,
targeting Drosophila release 5.

[A] Steve add support on Affy das server for arrays and orgs with non-UCSC
genomes

gh: das2 code modifications. Have a das2 server on an internal affy
server serving up graph slices for transcriptome tiling array data, as
practice for the public version. Has been up for 1.5 weeks, so far so
good. Clever indexing on the server retrieves graph slices out of graph
files, and mods on the igb side stitch slices together to make them look
like one graph in igb.  Dynamic type instantiation: a request for a
graph slice may need to request something that the server doesn't yet
know about. Names are initially randomly generated -- based on file
name. Not meaningful for display. The transcriptome db stores where the
file resides; igb has a plugin tied to the transcriptome db that looks
up the file and tells the server where to go to get it.
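
A minimal sketch of the stitching idea on the client side, in Python
rather than IGB's own code. It assumes each slice arrives as a list of
(position, value) pairs; the function name and the rule that a later
slice wins on overlap are illustrative assumptions, not IGB's actual
behavior.

```python
def stitch_slices(slices):
    """Merge graph slices into one position-sorted graph.

    Each slice is a list of (position, value) pairs. Where slices
    overlap, the value from the slice seen last wins (an assumption
    made for this sketch).
    """
    merged = {}
    for points in slices:
        for position, value in points:
            merged[position] = value
    # Sort by position so the result reads as a single graph.
    return sorted(merged.items())
```

Two adjacent slices that share a boundary point come back as one
graph with that point listed once.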

Public server will have structure for where file resides. das2 server
will determine typename for each graph based on directory names, will
support types query.
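
One way the directory-based naming could work, sketched in Python. The
layout (a data root with one directory per type, holding that type's
graph files) and the function name are assumptions for illustration;
the notes do not spell out the server's real layout.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def types_from_paths(file_paths, data_root):
    """Group graph files into types named after their directory.

    The type name is the file's directory relative to data_root, so
    all files in one directory are reported as a single type.
    """
    types = defaultdict(list)
    for path in file_paths:
        relative = PurePosixPath(path).relative_to(data_root)
        type_name = "/".join(relative.parts[:-1])
        types[type_name].append(relative.name)
    return {name: sorted(files) for name, files in types.items()}
```

The keys of the returned dict are what a types query could list, and
each value is the set of files the server treats as one type.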

same idea for dir structure as now for genome annotations.
some tricks getting server to view a set of files as one type.

don't load into memory, just indexes files. should support many graphs
before it impacts memory.
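
A rough sketch of indexing without loading the data, in Python. It
assumes a graph file of tab-separated "position value" lines sorted by
position, opened in binary mode; the real server's file format and
index are not described in these notes, so everything here is
illustrative.

```python
import bisect


def build_index(handle, every=1000):
    """Record (position, byte offset) for every Nth line of the file."""
    positions, offsets = [], []
    line_number = 0
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break
        if line_number % every == 0:
            positions.append(int(line.decode().split("\t")[0]))
            offsets.append(offset)
        line_number += 1
    return positions, offsets


def read_slice(handle, index, start, end):
    """Return (position, value) pairs with start <= position <= end."""
    positions, offsets = index
    if not offsets:
        return []
    # Last indexed line at or before start; fall back to the first one.
    i = max(bisect.bisect_right(positions, start) - 1, 0)
    handle.seek(offsets[i])
    points = []
    for line in handle:
        position_text, value_text = line.decode().split("\t")
        position = int(position_text)
        if position > end:
            break
        if position >= start:
            points.append((position, float(value_text)))
    return points
```

Only the sparse index stays in memory; each request seeks to the
nearest indexed offset and reads forward until it passes the end of
the requested range.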

Also have worked on overhaul of GUI in igb for selecting/loading type
information. Planning phase now. Previously: a panel to navigate genome
choice, which then switches the genome view to that source, after which
you can select the available types based on current view or whole
sequence. Messy because 1) you always see all possible genomes you can
select, which is confusing, and 2) the types available were only per
version, which is awkward.

genome selection is a separate UI, pop up tree you can navigate. tree
does not show all servers, sources, versions that igb knows about, it
will filter to show only leaves that are appropriate to the version you are
looking at. So it's a filtered view, any source user could pick is
appropriate to the selection. types displayed below, can select
checkboxes for what you're interested in.
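
The filtering can be pictured with a small Python sketch. It assumes
igb's knowledge of servers is available as (server, source, version)
tuples; that representation and the function name are made up for
illustration and are not taken from the igb code.

```python
def filter_tree(versioned_sources, current_version):
    """Build a server -> [source] tree limited to one genome version.

    versioned_sources is an iterable of (server, source, version)
    tuples. Servers with nothing for current_version are left out.
    """
    tree = {}
    for server, source, version in versioned_sources:
        if version == current_version:
            tree.setdefault(server, []).append(source)
    return tree
```

Every leaf left in the tree is then a valid choice for the genome
version currently being viewed.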

Table view panel will look across all versioned sources for the
particular version you are looking at.

E.g. versioned source from affy, hapmap, biopackages for may 2004
human, you will see types for all versions.  Current way doesn't lend
itself to overlay annotations in the same view.

Other work: less about UI, working on getting IGB to retain more prefs
across sessions. IGB should remember all the types you selected in the
current session and load them again. Now it requires you to select
versioned sources individually.  There will be one big refresh data
button that triggers loading of all types from every genome based on
current view: a universal way to reload data based on current view.

Topic: Next meeting
-------------------

28 May 2007  (this is a US holiday, attendance optional)