From Steve_Chervitz at affymetrix.com  Fri Feb  2 14:17:55 2007
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Fri, 02 Feb 2007 11:17:55 -0800
Subject: [DAS2] Notes from the biweekly DAS/2 teleconference, 22 Jan 2007
Message-ID: <C1E8CE63.24B84%Steve_Chervitz@affymetrix.com>

[These are from the teleconf from *last* week. Apologies for tardiness.
Next meeting is this coming Monday, 5 Feb.
DAS grant folks: Don't forget to write up your goals and timeline thru May!
-Steve]

Notes from the biweekly DAS/2 teleconference, 22 Jan 2007

$Id: das2-teleconf-2007-01-22.txt,v 1.1 2007/02/02 19:14:21 sac Exp $

Teleconference Info:
   * Schedule:         Biweekly on Monday
   * Time of Day:      9:30 AM PST, 17:30 GMT
   * Dialin (US):      800-531-3250
   * Dialin (Intl):    303-928-2693
   * Toll-free UK:     08 00 40 49 467
   * Toll-free France: 08 00 907 839
   * Conference ID:    2879055
   * Passcode:         1365

Attendees:
  Affy: Steve Chervitz, Ed Erwin, Gregg Helt
  UCLA: Allen Day

Note taker: Steve Chervitz

Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at
das/das2/notes/. Instructions on how to access this
repository are at http://biodas.org

DISCLAIMER: 
The note taker aims for completeness and accuracy, but these goals are
not always achievable, given the desire to get the notes out with a
rapid turnaround. So don't consider these notes as complete minutes
from the meeting, but rather abbreviated, summarized versions of what
was discussed. There may be errors of commission and omission.
Participants are welcome to post comments and/or corrections to these
as they see fit. 


Agenda
-------
 * das/biosapiens meeting in hinxton
 * status reports

das/biosapiens mtg:
ee and gh: interested in attending. hang out with Andreas and other
das folks at Sanger about it.

gh: timelines and milestones for remainder of das/2 grant extension
(ending in May). what parts we are going to complete by then and
timeline to achieve. Publication: a paper should be submitted by end
of grant period. open access journals: PLoS or BioMed Central. on the
affy side, publication on igb and underlying data models. highest
priority is a paper that focuses on the spec, uses examples

[A] everyone outline goals/timeline till end of May, due before next
teleconf

aday: yes, doing that today. I will be working
for the das2 grant for next two weeks full time for gregg. doing an extreme
re-factor, putting into a new framework to do writeback, block-level
caching, going through other docs, UML diagrams, emails, etc.

ee: will go over goals at Affy this week.

gh: general issue: forward from Suzanna re: ontology URI's, NCBO
starting to work on it to have their ontologies be addressable as
URIs. good idea to be in contact with suzi or chris over there, join
in a meeting. 

aday: subscribes to that mailing list, no traffic in the last week.

[A] Allen will ping NCBO folks again about URIs

gh: client UI, want to organize annotation types into something more than a
flat list, using an ontology, present a tree or DAG. now it seems the
ontologies are not sufficient to get the full graph I want. diff cell
lines, rna expression expts for some, chip-chip for some, methylation
for some. some experiments look at nuc rna vs cytoplasmic rna vs
polyA+, etc. not addressed by SO. looking at combining annotation
field with ontology.

aday: other ontologies may be suitable: biochemical molecule and
treatment.
gh: crossproduct ontologies, combining terms from orthogonal
ontologies, on the fly so you don't get combinatorial explosion.
aday: they have ideas about how to do that.
gh: right now das type ontology attrib has to be from SO.
aday: annotate affy at a sample level. many features in the
genome features space.  better to annotate the sample with the sample
treatment then point the features at it. put it as a label on the
genome track.

gh: now have this in DAS: type uri, type ontology, type title, type
method. features are typed.
aday: you want to put all features into same data source, but worried
about typing them differently.
gh: yes. could make them diff versioned sources, but can get
confusing.
aday: i'd do that. each cell line as a diff source.
gh: but trouble: crossproduct of genome assembly x cell line. need to
map to multiple assemblies.
gh: can use the title and/or method to break down things further
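
[Illustration, not from the call: a minimal Python sketch of the
trade-off discussed above. All assembly and cell line names are
hypothetical placeholders. It only counts how many versioned sources
result if each cell line is its own source per assembly, versus one
source per assembly with the cell line carried in the type title.]

```python
from itertools import product

# Hypothetical names, for illustration only.
assemblies = ["assembly_A", "assembly_B", "assembly_C"]
cell_lines = ["cell_line_1", "cell_line_2", "cell_line_3", "cell_line_4"]


def sources_per_cell_line(assemblies, cell_lines):
    """One versioned source per (assembly, cell line) pair."""
    return [f"{a}/{c}" for a, c in product(assemblies, cell_lines)]


def sources_per_assembly(assemblies, cell_lines):
    """One versioned source per assembly; cell line goes in the type title."""
    return {a: [f"expression ({c})" for c in cell_lines] for a in assemblies}


split = sources_per_cell_line(assemblies, cell_lines)
merged = sources_per_assembly(assemblies, cell_lines)
print(len(split), "sources when split by cell line")
print(len(merged), "sources when the title carries the cell line")
```

The first layout grows as assemblies x cell lines; the second grows
only with the number of assemblies, at the cost of longer type lists
inside each source.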

sc: allen, are you considering the mged ontology?
aday: doesn't mesh well with what obo has. it's verb oriented. describing
treatment, but participants are free text fields, uris.
sc: put uri's in the free text field?
aday: yes. can put, perturbation on cell line, drug name, conc, units,
time duration, units of time are all free text. for units you want to
draw from ontology (netCDF), for drug, accession from RxNorm, or other
ontology or reference to index it. they don't constrain that,
basically punting. not good integration.

gh: my issue is: how do you present this to the user when there is a lot
of different data on the server? search space interface. default=flat list of
types, but you can search on them. what I do for interface to
transcriptome db, tighter integration with their rdbms, allows
searching on all fields being displayed. for this case, cell lines
could be tagged with a property of cell location=nuc, cytosol,
both. then search based on those when you're trying to narrow down the
types you're looking at from the server.
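
[Illustration, not from the call: a minimal Python sketch of narrowing
a flat list of types by tagged properties. The AnnotationType class,
the property names (cell_location, assay) and the sample titles are
assumptions made up for this example; they are not part of the das/2
spec or of igb.]

```python
from dataclasses import dataclass, field


@dataclass
class AnnotationType:
    """Hypothetical stand-in for a type entry plus free-form properties."""
    title: str
    props: dict = field(default_factory=dict)


def filter_types(types, **wanted):
    """Keep types whose properties match every requested key/value."""
    return [
        t for t in types
        if all(t.props.get(key) == value for key, value in wanted.items())
    ]


types = [
    AnnotationType("expt 1", {"cell_location": "nuc", "assay": "rna"}),
    AnnotationType("expt 2", {"cell_location": "cytosol", "assay": "rna"}),
    AnnotationType("expt 3", {"cell_location": "both", "assay": "chip-chip"}),
]

nuclear_rna = filter_types(types, cell_location="nuc", assay="rna")
print([t.title for t in nuclear_rna])
```

With no keyword arguments the function returns the full flat list,
which matches the default view described above.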

gh: soon when i add graph data, i'll end up having hundreds of data
types. from das/1 server experience, eg at ucsc - tremendous amount of
data available as data types, several hundred, hard to find what you
want. but in the browser they are organized by categories,
alignments, etc. more navigable.

Other Topics:
---------------

sc: when can we announce the completion of genome retrieval spec?
gh: I need to add cigar string example. I'd vote for mid Feb, in
advance of das/biosapiens meeting.

sc: other thing: move global seq ids into biodas.org wiki page and set
that wiki page up as main page for biodas.org.

[A] complete retrieval spec and wikification of biodas.org by mid-Feb

Status:
-----------

ee: igb manual. feedback welcome
gh: looking for gff directives for igb you can put into a gff file.
ee: not in manual. hidden from user for now. better to use our gff
parser.

gh: working on two things, and focus on das for next two
weeks. handling chip format data from affy expression console. das
related because those file formats lack genome location info. using
das to look up the info to merge into that data. using a simple
hierarchy of probesets and probe, but we need something more
sophisticated. looking at a 4 or 5-level hierarchy, now represented in
gff2, embedding hierarchy stuff in the tags.

ee: gff3 can represent the hierarchy. we didn't have a parser for it
back then.
gh: could insert that into the pipeline at some point.
my intent: a more efficient binary format.
would be a good test in das of multi-level hierarchies. das server
should then output it in a multi-level das2xml document. there hasn't
been an example of multi-level > 2 hierarchy yet (affy or
biopackages).
issues now is: how to render something that's >2 levels of hierarchy?
for now, just rendering the last two levels.
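
[Illustration, not from the call: a minimal Python sketch of "render
only the last two levels" for a feature hierarchy deeper than two
levels. The Feature class and the upper level names (cluster, group)
are hypothetical; only probesets and probes are mentioned in the notes.]

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """Hypothetical nested feature: a name plus child features."""
    name: str
    parts: list = field(default_factory=list)


def height(feature):
    """Number of levels below this feature (0 for a leaf)."""
    if not feature.parts:
        return 0
    return 1 + max(height(p) for p in feature.parts)


def last_two_levels(feature):
    """Return the features that sit directly above the leaves."""
    if height(feature) <= 1:
        return [feature]
    found = []
    for part in feature.parts:
        found.extend(last_two_levels(part))
    return found


tree = Feature("cluster1", [
    Feature("group1", [
        Feature("probeset1", [Feature("probe1"), Feature("probe2")]),
        Feature("probeset2", [Feature("probe3")]),
    ]),
])

for parent in last_two_levels(tree):
    print(parent.name, [child.name for child in parent.parts])
```

The upper levels are dropped from the display but stay in the data, so
a later renderer could still use them.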

gh: second thing: support in affy server and igb to access graph data
via das/2. am amassing transcriptome data, no serious processing of it
yet. for each expt, a set of 90+ chips, 300mill data points, 16 expts
+ replicates. lots of data. getting affy das server to do smart
indexing of that data. server right now expects there to be 1 file for
a whole annotation type, but these graphs are too big, must be broken
by chromosome as well. Need to address this issue. Solution is in
sight.
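
[Illustration, not from the call: a minimal Python sketch of splitting
graph data by chromosome and answering range queries with a binary
search. This is an assumed approach for the example only, not a
description of how the affy das server indexes data.]

```python
from bisect import bisect_left
from collections import defaultdict


def build_index(points):
    """points: iterable of (chrom, position, score) tuples.

    Returns {chrom: (sorted positions, scores in the same order)}.
    """
    by_chrom = defaultdict(list)
    for chrom, pos, score in points:
        by_chrom[chrom].append((pos, score))
    index = {}
    for chrom, rows in by_chrom.items():
        rows.sort()
        index[chrom] = ([p for p, _ in rows], [s for _, s in rows])
    return index


def query(index, chrom, start, end):
    """Return (position, score) pairs with start <= position < end."""
    if chrom not in index:
        return []
    positions, scores = index[chrom]
    lo = bisect_left(positions, start)
    hi = bisect_left(positions, end)
    return list(zip(positions[lo:hi], scores[lo:hi]))


points = [("chr1", 100, 0.5), ("chr2", 50, 1.0),
          ("chr1", 10, 0.1), ("chr1", 200, 0.9)]
index = build_index(points)
print(query(index, "chr1", 10, 200))
```

Only the arrays for the requested chromosome are touched per query,
which is the point of breaking one large file into per-chromosome
pieces.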

sc: helping configure the affy das/2 server, working with gregg to
support transcript-level annotations for exon arrays (required for CHP
file support). biodas.org wikification, looking into moving the global
seq id page into biodas.org space. No new progress on page since Dec.

aday: gmod meeting in sd last wed, heard ucsc folk talk about genome
browser stuff, new things being added. have < 20Tb of data, but have
lots of hardware. we have more than that at our lab. we keep images,
all cel files, more recent dat files.
gh: anything else of gmod interest at gmod meeting?
aday: brian osborne was hired doing full time user support, starting by
doing documentation, 21 little projects as part of gmod, 9 selected as
having doc clean up, addition. das was selected as one to be documented.
he'll send doc packages to the project site.
sc: yes he's been in contact with me and has offered to help.

Wrapup:
-------
gh: next meeting in two weeks 5 Feb 2007.

Everyone: send in goals/milestones before next meeting

aday: gregg, can we meet earlier than that?


From Steve_Chervitz at affymetrix.com  Mon Feb  5 14:09:13 2007
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Mon, 05 Feb 2007 11:09:13 -0800
Subject: [DAS2] Notes from the biweekly DAS/2 teleconference, 5 Feb 2007
Message-ID: <C1ECC0D9.24BFE%Steve_Chervitz@affymetrix.com>

Notes from the biweekly DAS/2 teleconference, 5 Feb 2007

$Id: das2-teleconf-2007-02-05.txt,v 1.1 2007/02/05 19:08:10 sac Exp $

Teleconference Info:
   * Schedule:         Biweekly on Monday
   * Time of Day:      9:30 AM PST, 17:30 GMT
   * Dialin (US):      800-531-3250
   * Dialin (Intl):    303-928-2693
   * Toll-free UK:     08 00 40 49 467
   * Toll-free France: 08 00 907 839
   * Conference ID:    2879055
   * Passcode:         1365

Attendees:
    Affy: Steve Chervitz, Ed Erwin, Gregg Helt
    CSHL: Lincoln Stein
  Sanger: Andreas Prlic
     UAB: Ann Loraine
    UCLA: Allen Day, Brian O'Connor

Note taker: Steve Chervitz

Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at
das/das2/notes/. Instructions on how to access this
repository are at http://biodas.org

DISCLAIMER: 
The note taker aims for completeness and accuracy, but these goals are
not always achievable, given the desire to get the notes out with a
rapid turnaround. So don't consider these notes as complete minutes
from the meeting, but rather abbreviated, summarized versions of what
was discussed. There may be errors of commission and omission.
Participants are welcome to post comments and/or corrections to these
as they see fit. 


Agenda
-------
 * Open discussion
 * Status reports
 * Overall plans, timelines from now till End of May (end of grant term)
 * Global genome assembly and sequence IDs, how they fit with registry

Open discussion
----------------
ap: biosapiens meeting this month, announcement sent to das/2 list
gh: Ed and I will be attending.

[A] Gregg/Ed meet w/ Andreas prior to biosapiens mtg, Sat during day.

al: anyone interested in setting up das for plants, starting with
arabidopsis, then rice and poplar?
ls: we have das/1 server for rice in Gramene.
al: didn't see it on public website, sent feedback, no response.  I
want to set something up to viz data in IGB. better to use das/2.
rice, annotations from Robin Buell's group and an Arizona group, a new
genome from moss in a few months. Two tiling array data sets for
Arabidopsis, Joe Ecker, and another based on Nimblegen. Everyone has
Arab v6, but some cDNA-to-genome alignments scattered in different
places. Good to capture in das. Hard to distribute as a giant file (as
I've done).

ls: Happy to send all data from Gramene if you want to set it up in
das/2. Rice annotations, cross-species alignment data. Poplar being
added. Can give it in Ensembl format (schema database files).
al: Interested.
bo: what distribution of linux? fc2 or 5 or gentoo, you can
install via rpms (prepared via biopackages.net). I can help.

gh: for tiling array data, working on support for that in affy server,
might be worth looking into.
al: timeline? something working in the next two months. Announce on
arab mailing list etc. access data programmatically via das and viz
via IGB, gBrowse, etc.
Want to publish it as well, so authorship is available.
Can compensate anyone who wants to help out.

Registry
-----------
al: would be great if more people would use it. Eg. can Gramene
register?
ls: fair enough.

[A] Lincoln will register Gramene das servers in the Sanger das registry

gh: regarding das meeting that Andreas is setting up.
ap: biosapiens, a european funded project. goal to annotate the human
genome. dedication to use das, mainly das/1, mainly protein
sequences. focus on people doing das client-side dev, want to invite
technical folks, network, share ideas, find synergies between groups. So far 24
have registered.
ee: how are reservations, enough space, accommodations?
ap: yes, rooms in hinxton conf center, working on writing
confirmations.
after client meeting, there will be a meeting on annotation type
ontology work, a project to standardize the annotations being
provided. I am synchronizing with this person (Henning).
Everyone who wants to can give a short presentation, 15', what
features are needed, etc.
gh: would love a session about das/2, addressing those needs, how well
does it address needs of people doing protein das.
ap: summarize new features, etc.

[A] Biosapiens mtg talks: Ed will talk about IGB, Gregg talk on das/2 spec

gh: what's useful is seeing whether das/2 things will meet needs of
protein das world, not too much focus on proteins. Esp mult seq alignment.
ap: we'll have people from pfam, jalview.
al: jalview is a very nice program, impressive work.
gh: some integration with Apollo (can use jalview to view msa), I
believe.
gh: useful for us to be at a biosapiens-specific part of the meeting?
ap: i'll send you program
gh/ee: interested in what the biosapiens project is up to in general.

[A] Andreas will send gregg info about biosapiens meeting, program, contacts

Topic: Global seq ids
---------------------
gh: in das wiki pages we have a global seq ids page. summarizing what
std uri's for coord systems and sequences. wondering how to sync this
with registry coordinates.
ap: if wiki should be point of reference, and registry sucks this in
automatically. if someone breaks wiki with bad text, then breaks
registry.
gh: yes, but we want it to be editable so that users can add new
organisms. not sure the mechanism.
sc: allen day's wiki module. converts to dom. maybe?
aday: not a dom, but a latex format. there are some other modules out
there.
ap: uris are not linked to actual project doing the sequencing.
gh: we talked about this a couple of weeks ago. now it's just text. "march
2006", then "uri". proposal: add "Here's the coordinates fragment you
should have in your das/2 xml request". Then the registry could parse
this document and just look for coordinates elements to see what's
defined here.
ap: yes, for das/2 registry, initially wrote code that is compatible
with das/1 and 2. Now it's hard to write code that can work with
both. need to re-write das/2-specific code. Thought there could be a
generic interface, but there are too many small differences.
gh: yes. lots of things that are more tightly defined in das/2. can see
why it would be difficult.
goal now: someone who's setting up a das/2 server can see what their
coord uri should look like. later goal: screen scraping, syncing with
registry. 
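
[Illustration, not from the call: a minimal Python sketch of the
screen-scraping idea, pulling coordinates elements out of page text
and skipping any snippet that a wiki edit has broken. The attribute
names (uri, taxid, source, authority, version), the example.org URI
and the sample page are assumptions for this example; check the das/2
spec for the real element definition. It also assumes the page text
contains the raw, unescaped XML.]

```python
import re
import xml.etree.ElementTree as ET

# Matches self-closing COORDINATES elements embedded in other text.
COORD_RE = re.compile(r"<COORDINATES\b[^>]*/>")


def extract_coordinates(page_text):
    """Return one attribute dict per well-formed COORDINATES element."""
    found = []
    for match in COORD_RE.finditer(page_text):
        try:
            elem = ET.fromstring(match.group(0))
        except ET.ParseError:
            continue  # malformed snippet: skip it instead of failing
        found.append(dict(elem.attrib))
    return found


page = """
Some organism, some assembly date
<COORDINATES uri="http://example.org/genome/demo/1/coordsys"
             taxid="0000" source="Chromosome" authority="Example"
             version="1" />
A snippet broken by a bad edit:
<COORDINATES uri="http://example.org/genome/demo/2/coordsys" taxid=0000 />
"""

for coords in extract_coordinates(page):
    print(coords["uri"], coords["authority"], coords["version"])
```

Skipping malformed snippets speaks to the concern that a bad wiki edit
could otherwise break the registry.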

ee: is registry for das/2 auto generated?
ap: yes. coming from my code that works with both versions. but it
won't scale with all new features.

gh: lincoln said he'd put those snippets on the biodas seq ids
page. what's status?
ls: still willing to do this when ready.
sc: still planning to migrate the global seq ids from the open-bio.org
wiki into the new biodas.org wiki. simple matter of cutting and
pasting html and setting up a pointer from old page. Will focus on
finishing this week.

[A] Steve migrate global seq ids page from open-bio wiki to biodas wiki.

Goals and timelines
--------------------
[A] All: send goals and timelines thru end of May to the grant list:
das2grant at lists.open-bio.org

This will help us figure out where we can get to through end of grant
period.

Status:
--------
ee: at end of March, I'm moving to a different project, won't have
lots of time post march for das/igb. my timeframe is therefore
shortened, unfortunate given my long list. will focus on making it
easy for others to add plugins to extend igb, interacting with igb via
http protocol, ensure igb uses std file formats for easier sharing
with other apps, includes stylesheets. Some other little things. Need
a bug fix release for igb soon. Would also like to make interaction
with das/2 registry better. Problem now: doesn't realize when two
genome versions are the same. Have been talking to Andreas about
this. 
will be available sporadically, perhaps.
unfortunate, since I've noticed that use of igb is going up rapidly,
in the past couple of months.
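
[Illustration, not from the call: a minimal Python sketch of one way a
client could recognize that two genome version names refer to the same
assembly, using a synonym table. The names and the table are
hypothetical; this is not a description of how igb or the registry
resolves versions.]

```python
def build_lookup(synonym_groups):
    """Map every name (case-insensitive) to the first name in its group."""
    lookup = {}
    for group in synonym_groups:
        canonical = group[0]
        for name in group:
            lookup[name.lower()] = canonical
    return lookup


def same_version(lookup, a, b):
    """True if both names resolve to the same canonical version name."""
    canonical_a = lookup.get(a.lower(), a)
    canonical_b = lookup.get(b.lower(), b)
    return canonical_a == canonical_b


# Hypothetical synonym groups, one list per assembly.
groups = [
    ["Org_2006_03", "org-v5", "Org build 5"],
    ["Org_2004_05", "org-v4"],
]
lookup = build_lookup(groups)

print(same_version(lookup, "org-v5", "ORG BUILD 5"))
print(same_version(lookup, "org-v5", "org-v4"))
```

Names missing from the table fall back to an exact string comparison,
so unknown versions are never merged by accident.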

sc: if it keeps going up they'll have to put you back on igb!
ee: purely a budgetary issue, not because igb is considered
unimportant, but just a need to shift resources to new project
without hiring new devs.

gh: my das percentage will go up in March, focus on getting good paper
out on das/2 spec, submit to open access journals, PLoS, BioMed
Central, etc. Have been sick for last week, so not much progress in last
week. sent out schedule for my goals on the das2grant list. Review:
 * additions to retrieval spec doc (html), diff kinds of features
 * affy chp file viz in igb, leveraging calls to das/2 to get genome
   locations for experimental results. slow but steady.
 * merging quickload functionality into igb, completely via das/2 but
   hides UI
 * Bug fix release for igb
 * genometry and das/2 server - efficient retrieval of slices of
   data. that server is incomplete re: feature filters, bringing up to
   spec re: arbitrary combinations of feat filters.
 * biosapiens meeting
 * writeback impl focus in March
 * major igb release toward end of March
 * das/2 paper in April, submit in May.

sc: will paper focus on retrieval only?
gh: want to get writeback going in March, to include in paper. it's a
major part of the spec. stable, but untested.

sc: configuring affy das/2 public server, adding support for more
arrays and genome versions. Want to focus on streamlining pipeline
that keeps annotations up-to-date on affy das/2 server. Also, will
help as needed on supporting exon array features (and related
gene-level support). Working on wikifying biodas.org. Plan for this to
be completed this month. Andreas has helped here.

aday: was working on a manuscript, not as much das work done, but did
get env set up to start writing UML, stubbing out files. Am optimistic
now, lots of off the shelf components that make it easy for das/2
server package, eg. from biopackages, gbrowse, blat, blast. Shipping
with yeastgenome.org data file, bio::db::gff memory adapter to query
and serve, need to work on writing out das2xml.
gh: this will be a second backend
aday: a first backend for the current project

aday: adding binaries for seq searches, want to try dynamic features,
e.g., primer design, or submit a query, get back hits. all part of
same code base, couple this with existing chado backend, and writeback
code, unifying into a common code base. working on flowchart
diagram. still working on that, doable by end of grant period.
gh: can you break it down month by month?
aday: want to put two weeks solid on it, barring derailment by other
priorities. want also to participate on the publication. need to
consult with advisor on that, maybe part of my dissertation.
shooting for graduating this summer.
gh: great. planning to submit by then. there will be enough to say
about spec w/out getting into impl. could point at ref impls.
aday: in lieu of a das/2 publication, am referencing the biodas.org
site for a manuscript we're submitting soon. server with 60,000 array
result files, 20K are hg-u133.

gh: planning for all to contribute to das/2 ms. We can work out
detailed contribs later.

bo: working on graduating now (march 21st defense). full time work on
the grant after that, whatever is left on the refactor, packaging,
documentation from biopackages perspective. working with Mark Carlson
on das/2 igb client in another project. for time being, will be full
time focus on graduation. can do maybe up to a month of full-time
work. will still attend conf calls to keep tabs.

ap: several things: editing on biodas.org wiki page. needs more
work. protein structure das applications, casp prot structure
prediction, scop, collection of prot struc alignments, making web
pages via das, wrote paper on this. for Ensembl, it's making large amt
of data available, set up ~17 das sources, working on registration
server for that, allowing people to upload data as well.

gh: anyone actively working on das/2 dev at Sanger/EBI besides registry?
ap: don't know, I'm on a different grant.
gh: yes, the ball is in our court re: das/2, but haven't heard much
from the uk folks yet. andrew had idea on das1-das2 (proxy server)
ap: yes, will be very useful.
gh: then the registry will be able to list das2 as well as das1
servers.
ap: his status on that?
gh: he hasn't called in in a while, but that was his focus for
remainder of his contribution to grant.
sc: he was re-engineering to deal with some speed issues. haven't
heard latest status.

[A] gregg ask andrew about das1-das2 proxy status, mention at biosapiens mtg

Wrapup:
---------

[A] Everyone get their goals milestones to gregg ASAP (via the das2-grant
list)

[A] Next teleconf: 19 Feb


From aloraine at gmail.com  Tue Feb  6 09:40:01 2007
From: aloraine at gmail.com (Ann Loraine)
Date: Tue, 6 Feb 2007 08:40:01 -0600
Subject: [DAS2] biodas rpm
Message-ID: <83722dde0702060640t34284aebgc979414d4fe42afb@mail.gmail.com>

Hello,

I'm following up on Brian's pointer to the rpm he mentioned for
setting up a DAS service.

Please send me a pointer or link -- I'd like to try it out!

-Ann


-- 
Ann Loraine, Assistant Professor
Departments of Genetics, Biostatistics,
Computer and Information Sciences
Associate Scientist, Comprehensive Cancer Center
University of Alabama at Birmingham
http://www.transvar.org
205-996-4155


From Steve_Chervitz at affymetrix.com  Wed Feb 21 19:50:11 2007
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Wed, 21 Feb 2007 16:50:11 -0800
Subject: [DAS2] Global seq IDs page migrated
Message-ID: <C20228C3.25247%Steve_Chervitz@affymetrix.com>

The wiki page that tracks the official global sequence identifiers for use
within DAS has been migrated to its new home within the emerging biodas.org
wiki site: 

http://biodas.org/wiki/GlobalSeqIDs

The former location ( http://www.open-bio.org/wiki/DAS:GlobalSeqIDs ) has
been updated to point to the new location.

Next step: Add das2xml snippets containing the full coordinate element for
each assembly identifier, which can be cut-n-pasted or screen scraped into a
das/2 document as desired. Lincoln agreed to do this when the page was
ready, as it now is.

Cheers,
Steve


From Steve_Chervitz at affymetrix.com  Tue Feb 27 15:03:32 2007
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Tue, 27 Feb 2007 12:03:32 -0800
Subject: [DAS2] Biodas.org wikification is complete
Message-ID: <C209CE94.253A9%Steve_Chervitz@affymetrix.com>

I'm happy to report that http://biodas.org has joined the ranks of other
open-bio.org projects and has been converted to a wiki format:
http://biodas.org will now resolve to the main page of the new wiki.

This new format should help keep it a more current and informative site.
Contributions are welcome from all members of the DAS community who are
motivated to help out.

Andreas Prlic and I have done our best to migrate content from the old site
to the wiki. Send a message to the list if there is anything you notice
missing and we can look into recovering it.

Steve


From Steve_Chervitz at affymetrix.com  Fri Feb  2 19:17:55 2007
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Fri, 02 Feb 2007 11:17:55 -0800
Subject: [DAS2] Notes from the biweekly DAS/2 teleconference, 22 Jan 2007
Message-ID: <C1E8CE63.24B84%Steve_Chervitz@affymetrix.com>

[These are from the teleconf from *last* week. Apologies for tardiness.
Next meeting is this coming Monday, 5 Feb.
DAS grant folks: Don't forget to write up your goals and timeline thru May!
-Steve]

Notes from the biweekly DAS/2 teleconference, 22 Jan 2007

$Id: das2-teleconf-2007-01-22.txt,v 1.1 2007/02/02 19:14:21 sac Exp $

Teleconference Info:
   * Schedule:         Biweekly on Monday
   * Time of Day:      9:30 AM PST, 17:30 GMT
   * Dialin (US):      800-531-3250
   * Dialin (Intl):    303-928-2693
   * Toll-free UK:     08 00 40 49 467
   * Toll-free France: 08 00 907 839
   * Conference ID:    2879055
   * Passcode:         1365

Attendees:
  Affy: Steve Chervitz, Ed Erwin, Gregg Helt
  UCLA: Allen Day

Note taker: Steve Chervitz

Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at
das/das2/notes/. Instructions on how to access this
repository are at http://biodas.org

DISCLAIMER: 
The note taker aims for completeness and accuracy, but these goals are
not always achievable, given the desire to get the notes out with a
rapid turnaround. So don't consider these notes as complete minutes
from the meeting, but rather abbreviated, summarized versions of what
was discussed. There may be errors of commission and omission.
Participants are welcome to post comments and/or corrections to these
as they see fit. 


Agenda
-------
 * das/biosapiens meeting in hinxton
 * status reports

das/biosapiens mtg:
ee and gh: interested in attending. hang out with Andreas and other
das folks at Sanger about it.

gh: timelines and milestones for remainder of das/2 grant extension
(ending in May). what parts we are going to complete by then and
timeline to achieve. Publication: a paper should be submitted by end
of grant period. open access journals: PLoS or Biomed central. on the
affy side, publication on igb and underlying data models. highest
priority is a paper that focusses on the spec, uses examples

[A] everyone outline goals/timeline till end of May, due before next
teleconf

aday: yes, doing that today. I will be working
for the das2 grant for next two weeks full time for gregg. doing an extreme
re-factor, putting into a new framework to do writeback, block-level
caching, going through other docs, UML diagrams, emails, etc.

ee: will go over goals at Affy this week.

gh: general issue: forward from Suzanna re: ontology URI's, NCBO
starting to work on it to have their ontologies by addressable as
URIs. goog idea to be in contact with suzi or chris over there, join
in a meeting. 

aday: subscribes to that mailing list, no traffic on last week.

[A] Allen will ping NCBO folks again about URIs

gh: client UI, want to organize annotation types into something more than a
flat list, using an ontology, present a tree or DAG. now it seems the
ontologies are not sufficient to get the full graph I want. diff cell
lines, rna expression expts for some, chip-chip for some, methylation
for some. some experiments look at nuc rna vs cytoplasmic rna vs
polyA+, etc. not addressed by SO. looking at combinging annotation
field with ontology.

aday: other ontologies may be suitable biochemical molecule and
treatment.
gh: crossproduct ontologies, combinging terms from orthologous
ontologies, on the fly so you don't get combinatorial explosion.
aday: they have ideas about how to do that.
gh: right now das type ontology attrib has to be from SO.
aday: annotate affy at a sample level. many features in the
genome features space.  better to annotate the sample with the sample
treatment then point the features at it. put it as a label on the
genome track.

gh: now have this in DAS: type uri, type ontology, type title, type
method. features are typed.
aday: you want to put all features into same data source, but worried
about typing them differently.
gh: yes. could make them diff versioned sources, but can get
confusing.
aday: i'd do that. each cell line as a diff source.
gh: but trouble: crossproduct of genome assembly x cell line. need to
map to multiple assemblies.
gh: can use the title and/or method to break down things further

sc: allen, are you considering the mged ontology?
aday: doesn't mesh well with what obo has. it's verb oriented. describing
treatment, but participants are free text fields, uris.
sc: put uri's in the free text field?
aday: yes. can put, perterbation on cell line, drug name, conc, units,
time dration, units of time are all free text. for units you want to
draw from ontlogy (netCDF), for drug, accession from RxNorm, or other
ontology or reference to index it. they don't constrain that,
basically punting. not good integration.

gh: my issue is: how do you present this when there is alot of different
data on
server, present to user. search space interface. default=flat list of
types, but you can search on them. what I do for interface to
transcriptome db, tighter integration with their rdbms, allows
searching on all fields being displayed. for this case, cell lines
could be tagged with a property of cell location=nuc, cytosol,
both. then search based on those when you're trying to narrow down the
types you're looking at from the server.

gh: soon when i add graph data, i'll end up having hundreds of data
types. from das/1 server experience, eg at ucsc - tremendous amount of
data available as data types, several hundred, hard to find what you
want. but in the browser they are organizaed by categories,
alignments, etc. more navigable.

Other Topics:
---------------

sc: when can we announce the completion of genome retrieval spec?
gh: I need to add cigar string example. I' vote for mid Feb, in
advance of das/biosapiens meeting.

sc: other thing: move global seq ids into biodas.org wiki page and set
that wiki page up as main page for biodas.org.

[A] complete retrieval spec and wikification of biodas.org by mid-Feb

Status:
-----------

ee: igb manual. feedback welcome
gh: looking for gff directives for igb you can put into a gff file.
ee: not in manual. hidden from user for now. better to use our gff
parser.

gh: working on two things, and focus on das for next two
weeks. handling chip format data from affy expression console. das
related because those file formats lack genome location info. using
das to look up the info to merge into that data. using a simple
hierarchy of probesets and probe, but we need something more
sophisticated. looking at a 4 or 5-level hierarchy, now represented in
gff2, embedding hierarchy stuff in the tags.

ee: gff3 can represent the hierarchy. we didn't have a parser for it
back then.
gh: could insert that into the pipeline at some point.
my intent: a more efficient binary format.
would be a good test in das of multi-level hierarchies. das server
should then output it in a multi-level das2xml document. there hasn't
been an example of multi-level > 2 hierarchy yet (affy or
biopackages).
issues now is: how to render something that's >2 levels of hierarchy?
for now, just rendering the last two levels.

gh: second thing: support in affy server and igb to access graph data
via das/2. am amassing transcriptome data, no serious processing of it
yet. for each expt, a set of 90+ chips, 300mill data points, 16 expts
+ replicates. lots of data. getting affy das server to do smart
indexing of that data. server right now expects there to be 1 file for
a whole annotation type, but these graphs are too big, must be broken
by chromosome as well. Need to address this issue. Solution is in
sight.

sc: helping configure the affy das/2 server, working with gregg to
support transcript-level annotations for exon arrays (required for CHP
file support). biodas.org wikification, looking into moving the global
seq id page into biodas.org space. No new progress on page since Dec.

aday: gmod meeting in sd last wed, heard ucsc folk talk about genome
browser stuff, new things being added. have < 20Tb of data, but have
lots of hardware. we have more than that at our lab. we keep images,
all cel files, more recent dat files.
gh: anything else of gmod interest at gmod meeting?
aday: brian osborne was hired to do full time user support, starting by
doing documentation. 21 little projects are part of gmod, 9 selected for
doc clean up, addition. das was selected as one to be documented.
he'll send doc packages to the project site.
sc: yes he's been in contact with me and has offered to help.

Wrapup:
-------
gh: next meeting in two weeks 5 Feb 2007.

Everyone: send in goals/milestones before next meeting

aday: gregg, can we meet earlier than that?


From Steve_Chervitz at affymetrix.com  Mon Feb  5 19:09:13 2007
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Mon, 05 Feb 2007 11:09:13 -0800
Subject: [DAS2] Notes from the biweekly DAS/2 teleconference, 5 Feb 2007
Message-ID: <C1ECC0D9.24BFE%Steve_Chervitz@affymetrix.com>

Notes from the biweekly DAS/2 teleconference, 5 Feb 2007

$Id: das2-teleconf-2007-02-05.txt,v 1.1 2007/02/05 19:08:10 sac Exp $

Teleconference Info:
   * Schedule:         Biweekly on Monday
   * Time of Day:      9:30 AM PST, 17:30 GMT
   * Dialin (US):      800-531-3250
   * Dialin (Intl):    303-928-2693
   * Toll-free UK:     08 00 40 49 467
   * Toll-free France: 08 00 907 839
   * Conference ID:    2879055
   * Passcode:         1365

Attendees:
    Affy: Steve Chervitz, Ed Erwin, Gregg Helt
    CSHL: Lincoln Stein
  Sanger: Andreas Prlic
     UAB: Ann Loraine
    UCLA: Allen Day, Brian O'Connor

Note taker: Steve Chervitz

Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at
das/das2/notes/. Instructions on how to access this
repository are at http://biodas.org

DISCLAIMER: 
The note taker aims for completeness and accuracy, but these goals are
not always achievable, given the desire to get the notes out with a
rapid turnaround. So don't consider these notes as complete minutes
from the meeting, but rather abbreviated, summarized versions of what
was discussed. There may be errors of commission and omission.
Participants are welcome to post comments and/or corrections to these
as they see fit. 


Agenda
-------
 * Open discussion
 * Status reports
 * Overall plans, timelines from now till End of May (end of grant term)
 * Global genome assembly and sequence IDs, how they fit with registry

Open discussion
----------------
ap: biosapiens meeting this month, announcement sent to das/2 list
gh: Ed and I will be attending.

[A] Gregg/Ed meet w/ Andreas prior to biosapiens mtg, Sat during day.

al: anyone interested in setting up das for plants, starting with
arabidopsis, then rice and poplar?
ls: we have das/1 server for rice in Gramene.
al: didn't see it on public website, sent feedback, no response.  I
want to set something up to viz data in IGB. better to use das/2.
rice, annotations from Robin Buell's group and an Arizona group, a new
genome from moss in a few months. Two tiling array data sets for
Arabidopsis, Joe Ecker, and another based on Nimblegen. Everyone has
Arab v6, but some cDNA-to-genome alignments scattered in different
places. Good to capture in das. Hard to distribute as a giant file (as
I've done).

ls: Happy to send all data from Gramene if you want to set it up in
das/2. Rice annotations, cross-species alignment data. Poplar being
added. Can give it in Ensembl format (schema database files).
al: Interested.
bo: what distribution of linux? fc2 or 5 or gentoo, you can
install via rpms (prepared via biopackages.net). I can help.

gh: for tiling array data, working on support for that in affy server,
might be worth looking into.
al: timeline? something working in the next two months. Announce on
arab mailing list etc. access data programmatically via das and viz
via IGB, gBrowse, etc.
Want to publish it as well, so authorship is available.
Can compensate anyone who wants to help out.

Registry
-----------
al: would be great if more people would use it. Eg. can Gramene
register?
ls: fair enough.

[A] Lincoln will register Gramene das servers in the Sanger das registry

gh: regarding das meeting that Andreas is setting up.
ap: biosapiens, a european funded project. goal to annotate the human
genome. dedication to use das, mainly das/1, mainly protein
sequences. focus on people doing das client-side dev, want to invite
technical folks, network, share ideas, find synergies between groups.
So far 24 have registered.
ee: how are reservations, enough space accommodations?
ap: yes, rooms in hinxton conf center, working on writing
confirmations.
after client meeting, there will be a meeting on annotation type
ontology work, a project to standardize the annotations being
provided. I am synchronizing with this person (Henning).
Everyone who wants to can give a short presentation, 15', what
features are needed, etc.
gh: would love a session about das/2, addressing those needs, how well
does it address needs of people doing protein das.
ap: summarize new features, etc.

[A] Biosapiens mtg talks: Ed will talk about IGB, Gregg talk on das/2 spec

gh: what's useful is seeing whether das/2 things will meet needs of
protein das world, not too much focus on proteins. Esp mult seq alignment.
ap: we'll have people from pfam, jalview.
al: jalview is a very nice program, impressive work.
gh: some integration with Apollo (can use jalview to view msa), I
believe.
gh: useful for us to be at the biosapiens-specific part of the meeting?
ap: i'll send you program
gh/ee: interested in what the biosapiens project is up to in general.

[A] Andreas will send gregg info about biosapiens meeting, program, contacts

Topic: Global seq ids
---------------------
gh: in das wiki pages we have a global seq ids page, summarizing the
std uris for coord systems and sequences. wondering how to sync this
with registry coordinates.
ap: if wiki is the point of reference and registry sucks this in
automatically, then if someone breaks wiki with bad text, it breaks
registry.
gh: yes, but we want it to be editable so that users can add new
organisms. not sure the mechanism.
sc: allen day's wiki module. converts to dom. maybe?
aday: not a dom, but a latex format. there are some other modules out
there.
ap: uris are not linked to actual project doing the sequencing.
gh: we talked about this a couple of weeks ago. now it's just text. "march
2006", then "uri". proposal: add "Here's the coordinates fragment you
should have in your das/2 xml request". Then the registry could parse
this document and just look for coordinates elements to see what's
defined here.
ap: yes, for das/2 registry, initially wrote code that is compatible
with das/1 and 2. Now it's hard to write code that can work with
both. need to re-write das/2-specific code. Thought there could be a
generic interface, but there are too many small differences.
gh: yes. lots of things that are more tightly defined in das/2. can see
why it would be difficult.
goal now: someone who's setting up a das/2 server can see what their
coord uri should look like. later goal: screen scraping, syncing with
registry. 
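
A minimal sketch of the screen-scraping idea above, in which the
registry parses the seq ids page and just looks for coordinates
elements. It assumes the snippets appear as raw, self-closing
COORDINATES elements in the page text; the function name, the example
URIs, and the attribute values are placeholders, not entries from the
real page. Malformed fragments are skipped rather than raised, so a bad
wiki edit would not break the consumer.

```python
import re
import xml.etree.ElementTree as ET

# Matches a self-closing COORDINATES element embedded in free text.
COORD_RE = re.compile(r"<COORDINATES\b[^>]*/>")

def extract_coordinates(page_text):
    """Return a list of attribute dicts, one per well-formed COORDINATES element."""
    found = []
    for match in COORD_RE.finditer(page_text):
        try:
            element = ET.fromstring(match.group(0))
        except ET.ParseError:
            continue  # skip fragments that a wiki edit has left malformed
        found.append(dict(element.attrib))
    return found

# Hypothetical page content; the URIs and attribute values are placeholders.
PAGE = '''
Human, March 2006
  <COORDINATES uri="http://example.org/coordsys/human_36" taxid="9606"
               source="Chromosome" authority="NCBI" version="36"/>
Broken entry
  <COORDINATES uri="http://example.org/coordsys/bad" taxid=9606/>
'''
```

Calling extract_coordinates(PAGE) keeps the first entry and silently
drops the second, whose taxid value is unquoted.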

ee: is registry for das/2 auto generated?
ap: yes. coming from my code that works with both versions. but it
won't scale with all new features.

gh: lincoln said he'd put those snippets on the biodas seq ids
page. what's status?
ls: still willing to do this when ready.
sc: still planning to migrate the global seq ids from the open-bio.org
wiki into the new biodas.org wiki. simple matter of cutting and
pasting html and setting up a pointer from old page. Will focus on
finishing this week.

[A] Steve migrate global seq ids page from open-bio wiki to biodas wiki.

Goals and timelines
--------------------
[A] All: send goals and timelines thru end of May to the grant list:
das2grant at lists.open-bio.org

This will help us figure out where we can get to through end of grant
period.

Status:
--------
ee: at end of March, I'm moving to a different project, won't have
lots of time post march for das/igb. my timeframe is therefore
shortened, unfortunate given my long list. will focus on making it
easy for others to add plugins to extend igb, interacting with igb via
http protocol, ensure igb uses std file formats for easier sharing
with other apps, includes stylesheets. Some other little things. Need
a bug fix release for igb soon. Would also like to make interaction
with das/2 registry better. Problem now: doesn't realize when two
genome versions are the same. Have been talking to Andreas about
this. 
will be available sporadically, perhaps.
unfortunate, since I've noticed that use of igb is going up rapidly,
in the past couple of months.

sc: if it keeps going up they'll have to put you back on igb!
ee: purely a budgetary issue, not because igb is considered
unimportant, but just a need to shift resources to new project
without hiring new devs.

gh: my das percentage will go up in March, focus on getting good paper
out on das/2 spec, submit to open access journals, PLoS, Biomed
central, etc. Have been sick for last week, so not much progress in last
week. sent out schedule for my goals on the das2grant list. Review:
 * additions to retrieval spec doc (html), diff kinds of features
 * affy chp file viz in igb, leveraging calls to das/2 to get genome
   locations for experimental results. slow but steady.
 * merging quickload functionality into igb, completely via das/2 but
   hides UI
 * Bug fix release for igb
 * genometry and das/2 server - efficient retrieval of slices of
   data. that server is incomplete re: feature filters, bringing up to
   spec re: arbitrary combinations of feat filters.
 * biosapiens meeting
 * writeback impl focus in March
 * major igb release toward end of March
 * das/2 paper in April, submit in May.
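
A small sketch of what "arbitrary combinations of feat filters" could
mean for a server, under stated assumptions: values of the same filter
kind are ORed, different kinds are ANDed, and coordinates are 0-based
half-open. The retrieval spec is the authority on the real rules. The
Feature class and function names are hypothetical and are not from the
genometry server.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Feature:
    segment: str
    start: int   # assumed 0-based, half-open [start, end)
    end: int
    type: str

def overlaps(feature, region):
    """True if the feature shares at least one base with the region."""
    seg, start, end = region
    return feature.segment == seg and feature.start < end and start < feature.end

def inside(feature, region):
    """True if the feature lies entirely within the region."""
    seg, start, end = region
    return feature.segment == seg and start <= feature.start and feature.end <= end

def filter_features(features, overlaps_any=(), inside_any=(), type_any=()):
    """Keep features that pass every filter kind that was supplied."""
    result = []
    for f in features:
        if overlaps_any and not any(overlaps(f, r) for r in overlaps_any):
            continue
        if inside_any and not any(inside(f, r) for r in inside_any):
            continue
        if type_any and f.type not in type_any:
            continue
        result.append(f)
    return result
```

A filter kind that is left empty is treated as "no constraint", so the
same function covers a single filter, several filters of one kind, or
any mix of kinds.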

sc: will paper focus on retrieval only?
gh: want to get writeback going in March, to include in paper. it's a
major part of the spec. stable, but untested.

sc: configuring affy das/2 public server, adding support for more
arrays and genome versions. Want to focus on streamlining pipeline
that keeps annotations up-to-date on affy das/2 server. Also, will
help as needed on supporting exon array features (and related
gene-level support). Working on wikifying biodas.org. Plan for this to
be completed this month. Andreas has helped here.

aday: was working on a manuscript, not as much das work done, but did
get env set up to start writing UML, stubbing out files. Am optimistic
now, lots of off the shelf components that make it easy for das/2
server package, eg. from biopackages, gbrowse, blat, blast. Shipping
with yeastgenome.org data file, bio::db::gff memory adapter to query
and serve, need to work on writing out das2xml.
gh: this will be a second backend
aday: a first backend for the current project

aday: adding binaries for seq searches, want to try dynamic features,
e.g., primer design, or submit a query, get back hits. all part of
same code base, couple this with existing chado backend, and writeback
code, unifying into a common code base. working on flowchart
diagram. still working on that, doable by end of grant period.
gh: can you break it down month by month?
aday: want to put two weeks solid on it, barring derailment by other
priorities. want also to participate on the publication. need to
consult with advisor on that, maybe part of my dissertation.
shooting for graduating this summer.
gh: great. planning to submit by then. there will be enough to say
about spec w/out getting into impl. could point at ref impls.
aday: in lieu of a das/2 publication, am referencing the biodas.org
site for a manuscript we're submitting soon. server with 60,000 array
result files, 20K are hg-u133.

gh: planning for all to contribute to das/2 ms. We can work out
detailed contribs later.

bo: working on graduating now (march 21st defense). full time work on
the grant after that, whatever is left on the refactor, packaging,
documentation from biopackages perspective. working with Mark Carlson
on das/2 igb client in another project. for time being, will be full
time focus on graduation. can do maybe up to a month of full-time
work. will still attend conf calls to keep tabs.

ap: several things: editing on biodas.org wiki page. needs more
work. protein structure das applications, casp prot structure
prediction, scop, collection of prot struc alignments, making web
pages via das, wrote paper on this. for Ensembl, it's making large amt
of data available, set up ~17 das sources, working on registration
server for that, allowing people to upload data as well.

gh: anyone actively working on das/2 dev at Sanger/EBI besides registry?
ap: don't know, I'm on a different grant.
gh: yes, the ball is in our court re: das/2, but haven't heard much
from the uk folks yet. andrew had idea on das1-das2 (proxy server)
ap: yes, will be very useful.
gh: then the registry will be able to list das2 as well as das1
servers.
ap: his status on that?
gh: he hasn't called in in a while, but that was his focus for
remainder of his contribution to grant.
sc: he was re-engineering to deal with some speed issues. haven't
heard latest status.

[A] gregg ask andrew about das1-das2 proxy status, mention at biosapiens mtg

Wrapup:
---------

[A] Everyone get their goals/milestones to gregg ASAP (via the das2grant
list)

[A] Next teleconf: 19 Feb


From aloraine at gmail.com  Tue Feb  6 14:40:01 2007
From: aloraine at gmail.com (Ann Loraine)
Date: Tue, 6 Feb 2007 08:40:01 -0600
Subject: [DAS2] biodas rpm
Message-ID: <83722dde0702060640t34284aebgc979414d4fe42afb@mail.gmail.com>

Hello,

I'm following up on Brian's pointer to the rpm he mentioned for
setting up a DAS service.

Please send me a pointer or link -- I'd like to try it out!

-Ann


-- 
Ann Loraine, Assistant Professor
Departments of Genetics, Biostatistics,
Computer and Information Sciences
Associate Scientist, Comprehensive Cancer Center
University of Alabama at Birmingham
http://www.transvar.org
205-996-4155


From Steve_Chervitz at affymetrix.com  Thu Feb 22 00:50:11 2007
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Wed, 21 Feb 2007 16:50:11 -0800
Subject: [DAS2] Global seq IDs page migrated
Message-ID: <C20228C3.25247%Steve_Chervitz@affymetrix.com>

The wiki page that tracks the official global sequence identifiers for use
within DAS has been migrated to its new home within the emerging biodas.org
wiki site: 

http://biodas.org/wiki/GlobalSeqIDs

The former location ( http://www.open-bio.org/wiki/DAS:GlobalSeqIDs ) has
been updated to point to the new location.

Next step: Add das2xml snippets containing the full coordinate element for
each assembly identifier, which can be cut-n-pasted or screen scraped into a
das/2 document as desired. Lincoln agreed to do this when the page was
ready, as it now is.

Cheers,
Steve


From Steve_Chervitz at affymetrix.com  Tue Feb 27 20:03:32 2007
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Tue, 27 Feb 2007 12:03:32 -0800
Subject: [DAS2] Biodas.org wikification is complete
Message-ID: <C209CE94.253A9%Steve_Chervitz@affymetrix.com>

I'm happy to report that http://biodas.org has joined the ranks of other
open-bio.org projects and has been converted to a wiki format:
http://biodas.org will now resolve to the main page of the new wiki.

This new format should help keep it a more current and informative site.
Contributions are welcome from all members of the DAS community who are
motivated to help out.

Andreas Prlic and I have done our best to migrate content from the old site
to the wiki. Send a message to the list if there is anything you notice
missing and we can look into recovering it.

Steve