[DAS2] DAS/2 weekly meeting notes for 14 Nov 05
Steve Chervitz
Steve_Chervitz at affymetrix.com
Mon Nov 14 19:29:22 UTC 2005
Notes from the weekly DAS/2 teleconference, 14 Nov 2005.
$Id: das2-teleconf-2005-11-14.txt,v 1.2 2005/11/14 19:20:37 sac Exp $
Attendees:
Affy: Steve Chervitz, Gregg Helt
CSHL: Lincoln Stein
UCBerkeley: Suzi Lewis
Sweden: Andrew Dalke
Action items are flagged with '[A]'.
These notes are checked into the biodas.org CVS repository at
das/das2/notes/2005. Instructions on how to access this
repository are at http://biodas.org
----------------------------------
AD talked with A. Prlic about the registry service. We want to
incorporate what he needs within DAS/2.
What they have:
- name (a few words) - for display of das track
- title, description (paragraph) - synopsis
- url for more info
We have desc, id, doc_href, taxon.
Therefore, we need a name attribute.
Need:
- name (mandatory) (done - LS: adding it to spec now)
- desc (optional)
Coord system reg server:
* in das/2 - it's not optional (0 interbase)
* they find this important
We have confusion between assembly and reference server
LS: Need URI that points to assembly, independent of the
reference server.
GH: Would like to have annot servers that don't know anything about
the ref server.
LS: Could use the region URI to ID the assembly
das/genome/sourceid/region = assembly id/uri
GH: The trouble is that NCBI is a ref source for many assemblies, yet
they lack a DAS server. They have no URI.
LS: we can just make one up, or use most appropriate web page
LS: When you request versioned source from a server, it should say what
assembly coords it's working on and give a uri for that. In this case
there's no guarantee you can do a 'get' on that URI.
We want to say:
1- what is unique uri for assembly (everyone agrees to share this)
2- das URL for how to fetch it (some server's region url - a trusted,
faithful copy of what is at ncbi). Diff servers could assert that
you can fetch it from various places.
GH: assembly could be an attribute since there'd be only one.
A list of ref servers that serve up that dna.
LS: in versioned source response. new section between capabilities and
namespaces called 'reference_sources'. Add 'assembly' attribute to
version element:
<version
id=
desc=
assembly="" uri that describes assembly - mandatory
<reference_sources
- tells you where to get dna and regions (could be self)
- contains zero or more subelements --allowing for multiple sources
where to go to get sequence, region
AD: consider ATOM 'link' tag, designed for links to other stuff
includes 'rel' attribute about how it is linked (e.g., could say:
use this url to fetch assembly)
GH: these two cases are special enough that they deserve their own
elements and attributes
purpose: if you need to retrieve residues, it tells you the base uri
to go to get the residues.
AD: Don't we already have the sequence request for that?
GH: only reference servers need implement it.
LS: All we need to do is name the assembly in the
versioned sources response
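A minimal sketch of how a client might read the proposed 'assembly'
attribute and 'reference_sources' section. The notes only name
'version', 'assembly', and 'reference_sources'; the 'reference_source'
subelement, its 'uri' attribute, and all example.org URLs are
assumptions made for illustration, not part of the spec.

```python
import xml.etree.ElementTree as ET

# Hypothetical versioned-source response. Only 'version', 'assembly' and
# 'reference_sources' come from the notes; everything else is assumed.
SAMPLE = """\
<version id="human_build35" desc="Human annotations"
         assembly="http://example.org/assembly/human/build35">
  <reference_sources>
    <reference_source uri="http://example.org/das/genome/human_build35/region"/>
    <reference_source uri="http://mirror.example.org/das/genome/human_build35/region"/>
  </reference_sources>
</version>
"""


def describe_version(xml_text):
    """Return (assembly_uri, [reference source URIs]) for a version element."""
    version = ET.fromstring(xml_text)
    assembly = version.get("assembly")
    if not assembly:
        # The notes propose that 'assembly' be mandatory.
        raise ValueError("version element lacks the 'assembly' attribute")
    # Zero or more subelements are allowed, so an empty list is valid.
    sources = [
        el.get("uri")
        for el in version.findall("reference_sources/reference_source")
    ]
    return assembly, sources


assembly_uri, ref_uris = describe_version(SAMPLE)
print(assembly_uri)
print(ref_uris)
```

The assembly URI is treated purely as an identifier here; as noted
above, there is no guarantee that a 'get' on it will work.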
AD: ebi/sanger tracks three fields related to assembly (what they need
per server):
-authority = equiv to our assembly uri
-organism = we have as taxon
-type = ?
Permits people to query things like: find out all servers that offer ncbi
build 35 for human.
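A rough sketch of the kind of registry query described above, using the
three fields AD listed. The record layout, server URLs, and field values
are all hypothetical; the actual ebi/sanger registry format is not
described in these notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerRecord:
    url: str         # DAS server URL (hypothetical)
    authority: str   # per the notes, equivalent to the DAS/2 assembly URI
    organism: str    # per the notes, equivalent to the DAS/2 taxon
    coord_type: str  # meaning unresolved in the notes; stored unchanged


# Made-up registry contents for illustration only.
REGISTRY = [
    ServerRecord("http://das.example.org/a",
                 "http://example.org/assembly/ncbi35", "Homo sapiens", "chromosome"),
    ServerRecord("http://das.example.org/b",
                 "http://example.org/assembly/ncbi34", "Homo sapiens", "chromosome"),
    ServerRecord("http://das.example.org/c",
                 "http://example.org/assembly/ncbi35", "Homo sapiens", "contig"),
    ServerRecord("http://das.example.org/d",
                 "http://example.org/assembly/btau1", "Bos taurus", "scaffold"),
]


def find_servers(registry, authority, organism):
    """Return URLs of servers annotating the given assembly and organism."""
    return [
        rec.url
        for rec in registry
        if rec.authority == authority and rec.organism == organism
    ]


hits = find_servers(REGISTRY, "http://example.org/assembly/ncbi35", "Homo sapiens")
print(hits)
```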
Question: What do they mean by 'coord system'? some confusion here
e.g., Do they mean things like: 'this assembly starts at 5000 relative
to this other assembly'?
For protein DAS, authority typically defines two diff coord systems:
'pdb resnum, interprot'
It does not permit automated translation between two coord systems.
[A] - Andrew will find out what they use it for
AD: Believes the purpose is intended for human consumption.
LS: This is an easy fix to a long-standing problem of identifying which
coord system was used. Can also use it for taxon indexing, e.g. at ucsc:
select organism, then select assembly of that organism. Could expand it
for kingdom, phylum, family, etc. We should use the ncbi taxon id.
AD: Where does it go?
LS: e.g. bos taurus,
taxon_id=url to ncbi taxon id page
name=bos taurus
GH: Coord system type is still unaccounted for. Is this describable by
seq ontology?
LS: yes but why is it important?
AD: there are 2-3 diff coord systems for protein structure DAS
GH: they have contig, chrom, scaffold - so they're looking at which
level of the assembly is being annotated.
SC: Is there a use case for alternative coord systems in DAS/2 - do we
want to permit people to offer sequence in other than 0 interbase?
GH, AD: No.
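For readers unfamiliar with '0 interbase': a small sketch of the
conversion, assuming the usual meaning of the term (positions number the
gaps between residues, counting from 0, so a 1-based inclusive range
start..end becomes start-1..end). The function names are made up for
this example and do not come from the spec.

```python
def one_based_to_interbase(start, end):
    """Convert a 1-based inclusive range to 0-based interbase coordinates."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return start - 1, end


def interbase_to_one_based(start, end):
    """Convert a non-empty interbase range to a 1-based inclusive range."""
    if start < 0 or end <= start:
        # A zero-length interbase range (e.g. an insertion site) has no
        # 1-based inclusive equivalent.
        raise ValueError("expected 0 <= start < end")
    return start + 1, end


def interbase_length(start, end):
    """Number of residues covered by an interbase range."""
    return end - start


first_ten = one_based_to_interbase(1, 10)
print(first_ten, interbase_length(*first_ten))
```

One practical consequence is that length is simply end minus start, and
a zero-length range can denote a position between two residues.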
GH: is /genome the highest level we want to go?
AD: at top level (sources response), would like to add more info:
- administrative contact: email, url for admin of server
- may want to put other things:
- pointer to license agreement for use of this data, copyright,
liability statements. attribute="legalese" href
- doesn't need to be machine readable
LS: each data source may have a different legalese (ebi has 100 diff
dbs). Each db may be under control of a diff group.
Should it go in sources or version tag?
AD: sources
GH: put off content-type, status code discussion until next time.
Looking at the http spec itself right now. A surprisingly good read.
Well thought out (unlike some of the xml stuff).
LS: Would like to fix a mistake regarding the confounding of
namespaces and xml:base. Want to be consistent here.
- all attrib names use namespaces
- attrib values use relative uri's (xml:base)
SC: See my post here, which also addresses the handling of attribute
values that derive from a controlled vocabulary:
http://portal.open-bio.org/pipermail/das2/2005-November/000278.html
and Andrew's response today:
http://portal.open-bio.org/pipermail/das2/2005-November/000313.html
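To illustrate the xml:base half of LS's proposal, here is a sketch of
resolving relative attribute values against an inherited xml:base. It
assumes 'id' is the attribute carrying the relative URI and uses a
made-up document and base URL; which attributes are actually resolved
this way is one of the open spec issues.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the predefined XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Hypothetical sources document; element names and URLs are assumptions.
SAMPLE = """\
<sources xml:base="http://example.org/das/genome/">
  <source id="human">
    <version id="human/build35"/>
  </source>
</sources>
"""


def resolve_ids(element, base=""):
    """Return (tag, absolute URI) for every element that has an 'id'."""
    # An xml:base on this element is itself resolved against the inherited base.
    base = urljoin(base, element.get(XML_BASE, ""))
    resolved = []
    if element.get("id") is not None:
        resolved.append((element.tag, urljoin(base, element.get("id"))))
    for child in element:
        resolved.extend(resolve_ids(child, base))
    return resolved


resolved_ids = resolve_ids(ET.fromstring(SAMPLE))
for tag, uri in resolved_ids:
    print(tag, uri)
```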
We need to address remaining spec issues in a separate call.
[A] Continue spec-focused teleconf in two weeks (28 Nov):
- namespaces/xml:base
- http header and status code
- anything else that comes up on the das/2 list.
[A] Next week (21 Nov): Discuss impl details about client, server,
validation suite
Future agenda: impl of writeback features (would like to hear from
ebi/sanger)