[DAS2] DAS/2 weekly meeting notes for 14 Nov 05

Steve Chervitz Steve_Chervitz at affymetrix.com
Mon Nov 14 19:29:22 UTC 2005


Notes from the weekly DAS/2 teleconference, 14 Nov 2005.

$Id: das2-teleconf-2005-11-14.txt,v 1.2 2005/11/14 19:20:37 sac Exp $

Attendees: 
  Affy: Steve Chervitz, Gregg Helt
  CSHL: Lincoln Stein
  UCBerkeley: Suzi Lewis
  Sweden: Andrew Dalke
        
Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at
das/das2/notes/2005. Instructions on how to access this
repository are at http://biodas.org

----------------------------------
AD talked with A. Prlic about the registry service. We want to
incorporate what he needs within DAS/2.

What they have:
- name (a few words) - for display of das track
- title, description (paragraph)  - synopsis
- url for more info

We have desc, id, doc_href, taxon.
Therefore, we need a name attribute.
Need:
- name (mandatory)   (done - LS: adding it to spec now)
- desc (optional)

Coord system reg server:
* in das/2 - it's not optional (0 interbase)
* they find this important

We have confusion between assembly and reference server
LS: Need URI that points to assembly, independent of the
reference server. 
GH: Would like to have annot servers that don't know anything about
the ref server.

LS: Could use the region URI to ID the assembly
das/genome/sourceid/region = assembly id/uri

GH: The trouble is that NCBI is a ref source for many assemblies, yet
they lack a DAS server. They have no URI.
LS: we can just make one up, or use most appropriate web page

LS: When you request versioned source from a server, it should say what
assembly coords it's working on and give a uri for that. In this case
there's no guarantee you can do a 'get' on that URI.
We want to say:
1- what is unique uri for assembly (everyone agrees to share this)
2- das URL for how to fetch it (some server's region url - trusted,
faithful copy of what is at ncbi). Diff servers could assert that
you can fetch it from various places.

GH: assembly could be an attribute since there'd be only one.
A list of ref servers that serve up that dna.

LS: in versioned source response. new section between capabilities and
namespaces called 'reference_sources'. Add 'assembly' attribute to
version element:
<version
   id=
   desc=
   assembly="" uri that describes assembly - mandatory
   
<reference_sources
   - tells you where to get dna and regions (could be self)
   - contains zero or more subelements --allowing for multiple sources
   where to go to get sequence, region
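
[Editor's sketch, not part of the discussion] A minimal illustration of
how a client might read the proposed 'assembly' attribute and
'reference_sources' section. Only 'version', 'id', 'desc', 'assembly'
and 'reference_sources' come from the notes above. The 'source'
subelement, its 'uri' attribute, and all example.org URLs are
assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical versioned-source response. The <source uri="..."/> child
# elements and the example.org URLs are invented for illustration.
RESPONSE = """\
<version id="human_35" desc="Human build 35"
         assembly="http://example.org/assembly/human/35">
  <reference_sources>
    <source uri="http://example.org/das/genome/human_35/region"/>
    <source uri="http://mirror.example.net/das/genome/human_35/region"/>
  </reference_sources>
</version>
"""


def parse_version(xml_text):
    """Return (assembly_uri, list of reference source URIs)."""
    version = ET.fromstring(xml_text)
    assembly = version.get("assembly")
    if not assembly:
        # The notes propose that 'assembly' be mandatory.
        raise ValueError("version element lacks the 'assembly' attribute")
    # Zero or more sources are allowed, so an empty list is valid.
    sources = [s.get("uri")
               for s in version.findall("reference_sources/source")]
    return assembly, sources


assembly_uri, ref_sources = parse_version(RESPONSE)
```

Note that the assembly URI is treated purely as an identifier here; the
client never tries to fetch it, only the reference source URIs.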
   

AD: consider ATOM 'link' tag, designed for links to other stuff
  includes 'rel' attribute about how it is linked (e.g., could say:
  use this url to fetch assembly)

GH: these two cases are special enough that they deserve their own
elements and attributes

purpose: if you need to retrieve residues, it tells you the base uri
to go to get the residues.

AD: Don't we already have the sequence request for that?
GH: only reference servers need implement it.

LS: All we need to do is name the assembly in the
versioned sources response

AD: ebi/sanger tracks three fields related to assembly (what they need
per server):
-authority  = equiv to our assembly uri
-organism   = we have as taxon
-type       = ?

Permits people to query things like: find out all servers that offer ncbi
build 35 for human.
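
[Editor's sketch, not part of the discussion] The kind of registry
query described above could look roughly like this, assuming each
registered server carries the three fields (authority, organism,
type). The record layout, server URLs, and assembly URIs are
hypothetical. 9606 and 9913 are the NCBI taxon ids for human and
Bos taurus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerEntry:
    url: str          # DAS server URL (hypothetical)
    authority: str    # assembly URI, as discussed above
    organism: str     # NCBI taxon id
    coord_type: str   # e.g. chromosome, contig, scaffold

HUMAN_35 = "http://example.org/assembly/human/35"   # made-up assembly URI
COW_3 = "http://example.org/assembly/cow/3"         # made-up assembly URI

REGISTRY = [
    ServerEntry("http://das-a.example.org/das", HUMAN_35, "9606", "chromosome"),
    ServerEntry("http://das-b.example.org/das", HUMAN_35, "9606", "contig"),
    ServerEntry("http://das-c.example.org/das", COW_3, "9913", "chromosome"),
]


def find_servers(registry, authority, organism):
    """Return URLs of servers that annotate the given assembly and organism."""
    return [e.url for e in registry
            if e.authority == authority and e.organism == organism]


human_35_servers = find_servers(REGISTRY, HUMAN_35, "9606")
```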

Question: What do they mean by 'coord system'? some confusion here
e.g., Do they mean things like: 'this assembly starts at 5000 relative
to this other assembly'?

For protein DAS, authority typically defines two diff coord systems:
'pdb resnum, interprot'

It does not permit automated translation between two coord systems.
[A] - Andrew will find out what they use it for

AD: Believes the purpose is intended for human consumption.

LS: an easy fix to a long and persistent problem about identifying which
coord system was used. Can also use for taxon indexing. e.g. at ucsc -
select organism, select assembly of that organism. Could expand it for
kingdom, phylum, family, etc. We should use the ncbi taxon id.

AD: Where does it go?

LS: eg. bos taurus,
    taxon_id=url to ncbi taxon id page
    name=bos taurus

GH: Coord system type is still unaccounted for. Is this describable by
seq ontology?  
LS: yes but why is it important?
AD: there are 2-3 diff coord systems for protein structure DAS

GH: they have contig, chrom, scaffold - so they're looking at which
level of the assembly is being annotated.

SC: Is there a use case for alternative coord systems in DAS/2 - do we
want to permit people to offer sequence in other than 0 interbase?
GH, AD: No.
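
[Editor's sketch, not part of the discussion] For readers unfamiliar
with the '0 interbase' convention: positions name the gaps between
residues, counting from 0, so a range's length is end minus start.
The helper names below are hypothetical; the conversion to and from
1-based inclusive coordinates is the standard one.

```python
def to_interbase(start, end):
    """Convert a 1-based inclusive range to 0-based interbase."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return start - 1, end


def from_interbase(start, end):
    """Convert a 0-based interbase range to 1-based inclusive.

    A zero-length interbase range (start == end) marks a point between
    two residues and has no 1-based inclusive equivalent.
    """
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    return start + 1, end


def interbase_length(start, end):
    """Number of residues spanned by an interbase range."""
    return end - start
```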

GH: is /genome the highest level we want to go?

AD: at top level (sources response), would like to add more info:
- administrative contact: email, url for admin of server
- may want to put other things:
  - pointer to license agreement for use of this data, copyright,
    liability statements. attribute="legalese" href
  - doesn't need to be machine readable
LS: each data source may have a different legalese (ebi has 100 diff
dbs). Each db may be under control of a diff group.
Should it go in sources or version tag?
AD: sources

GH: put off content-type, status code discussion until next time.
Looking at the http spec itself right now. A surprisingly good read,
well thought out (unlike some of the xml stuff).

LS: Would like to fix a mistake regarding the confounding of
namespaces and xml:base. Want to be consistent here.
- all attrib names use namespaces
- attrib values use relative uri's (xml:base)
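
[Editor's sketch, not part of the discussion] One way a client could
resolve relative attribute values against xml:base, assuming the spec
settles on the rule above. The document fragment and example.org URLs
are invented; 'id' and 'doc_href' are the attributes mentioned earlier
in these notes. Only a single xml:base on the root element is handled
here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Hypothetical fragment; element layout is an assumption.
DOC = """\
<sources xml:base="http://example.org/das/genome/">
  <version id="human_35" doc_href="human_35/about.html"/>
</sources>
"""


def resolve_attr(root, element, attr):
    """Resolve a relative attribute value against the root's xml:base."""
    base = root.get(XML_BASE, "")
    return urljoin(base, element.get(attr))


root = ET.fromstring(DOC)
version = root.find("version")
full_id = resolve_attr(root, version, "id")
full_doc = resolve_attr(root, version, "doc_href")
```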

SC: See my post here, which also addresses the handling of attribute
values that derive from a controlled vocabulary:
http://portal.open-bio.org/pipermail/das2/2005-November/000278.html
and Andrew's response today:
http://portal.open-bio.org/pipermail/das2/2005-November/000313.html

We need to address remaining spec issues in a separate call.
[A] Continue spec-focused teleconf in two weeks (28 Nov):
- namespaces/xml:base
- http header and status code
- anything else that comes up on the das/2 list.

[A] Next week (21 Nov): Discuss impl details about client, server,
validation suite

Future agenda: impl of writeback features (would like to hear from
ebi/sanger) 




