[DAS2] das registry and das2

Andreas Prlic ap3 at sanger.ac.uk
Mon Nov 21 10:55:06 UTC 2005


Hi Andrew,

> As it's structured now the top-level interface to a das2/genome URL
> returns a list of sources.  Based on what you need for the registry,
> we're going to add support for data about the source itself.
>
> The resulting das-sources XML document is effectively identical to
> what you're looking for.

That sounds good. I agree that the description should look identical
for both the sources and the registry. If the sources are already
properly described, this also makes it easier to "publish" them.

I think it is fairly clear why most of the fields in the registry are
there. The issue that might need the most discussion is how to describe
a coordinate system. This information is important because a DAS client
usually understands one or more coordinate systems. E.g. Ensembl knows
about Chromosomes and Clones, but it can also display UniProt
annotations in some cases. Similarly, the SPICE DAS client can display
annotations served in PDB-residue numbering and UniProt coordinates,
but does not know how to deal with genomic coordinates. Therefore the
"coordinate system" or "namespace" is an important part of the
description of a DAS source.
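
To make the point concrete, here is a minimal sketch of how a client
could pick the sources it is able to display. The field names
(authority, version, category), the source names and the helper
usable_sources are assumptions for illustration only; they are not
taken from the registry or the spec draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoordinateSystem:
    # Hypothetical fields; the real description is still under discussion.
    authority: str  # who defines the coordinates, e.g. "NCBI", "UniProt"
    version: str    # e.g. "35"; empty string when not versioned
    category: str   # e.g. "Chromosome", "Protein Sequence"


def usable_sources(client_systems, sources):
    """Return the names of sources sharing a coordinate system with the client."""
    supported = set(client_systems)
    return [name for name, systems in sources.items()
            if supported & set(systems)]


# Illustrative coordinate systems and sources (made-up names).
UNIPROT = CoordinateSystem("UniProt", "", "Protein Sequence")
PDB = CoordinateSystem("PDB", "", "Protein Structure")
NCBI_35 = CoordinateSystem("NCBI", "35", "Chromosome")

SOURCES = {
    "protein_annotations": [UNIPROT],
    "gene_predictions": [NCBI_35],
}

# A SPICE-like client: protein coordinates only, no genomic coordinates.
spice_like = [UNIPROT, PDB]
print(usable_sources(spice_like, SOURCES))
```

A client that understands only UniProt and PDB coordinates would skip
the genomic source here, which is the filtering the registry needs the
coordinate system description for.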

The closest thing to this that I found in the current spec draft is
the different "domains", e.g.

http://server/das/genome/source/version/features

so I might want to say
http://server/das/genome/homosapiens/ncbi35/features
http://server/das/genome/musmusculus/ncbim34/features

or should it be
http://server/das/genome/ncbi/homosapiens35/features
http://server/das/genome/ncbi/musmusculus34/features
?

Hm. I am not sure, but it seems that one level is missing: either
organism or authority?

Finally, the description of the data should make it possible to use the
same DAS source in multiple DAS clients. Some validation of the
descriptions will be required, to warn people that "homo sapiens"
should not be written as "human" or "homo". Or, a more complicated
case: Ensembl does not do assemblies itself. The assembly it currently
uses is NCBI_35. Therefore "Ensembl" cannot be used as an authority for
a chromosomal coordinate system. Currently the registry provides a
restricted list of allowed coordinate systems, to keep this under
control.
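
The kind of validation meant here could look roughly like the sketch
below. The allowed list, the alias table and the function validate are
illustrative assumptions; the actual list kept by the registry is not
reproduced here.

```python
# Illustrative restricted list of (authority, version, organism);
# not the registry's actual list.
ALLOWED = {
    ("NCBI", "35", "Homo sapiens"),
    ("NCBIm", "34", "Mus musculus"),
}

# Informal organism names that should trigger a warning (assumed table).
ORGANISM_ALIASES = {
    "human": "Homo sapiens",
    "homo": "Homo sapiens",
}


def validate(authority, version, organism):
    """Return a list of warnings for a chromosomal coordinate system."""
    warnings = []
    canonical = ORGANISM_ALIASES.get(organism.strip().lower())
    if canonical is not None:
        warnings.append(
            f'organism "{organism}" should be written as "{canonical}"')
        organism = canonical
    if (authority, version, organism) not in ALLOWED:
        warnings.append(
            f"{authority}_{version} ({organism}) is not an allowed "
            "coordinate system")
    return warnings


print(validate("NCBI", "35", "human"))
print(validate("Ensembl", "35", "Homo sapiens"))
```

With this list, "human" gets a spelling warning but is otherwise
accepted, while "Ensembl" is rejected as an authority because it does
not appear in the allowed list.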


>> http://server/registry/list
>> http://server/registry/find?[keyword,organism,authority,type,capability,label]=searchterm
>
> My proposal doesn't affect this.
>
> Why do "find" and "list" take different URLs?  Another possibility
> is that the same URL returns everything if there are no filters
> in place.


Yes, it is better to use only one URL. With no filters it would return
all sources.


>
> Are multiple search terms allowed?

yes

> Boolean AND or OR?

We can add a parameter where this can be chosen.
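
A rough sketch of how a single query with optional filters and a
selectable AND/OR could behave. The function find_sources, the
parameter name mode, the case-insensitive substring matching and the
sample records are all assumptions; nothing about the parameter has
been decided yet.

```python
def find_sources(sources, filters=None, mode="and"):
    """Filter source descriptions by field=searchterm pairs.

    sources: list of dicts describing DAS sources.
    filters: dict mapping a field name to a search term; none means "all".
    mode:    "and" (every filter must match) or "or" (any filter matches).
    """
    if mode not in ("and", "or"):
        raise ValueError('mode must be "and" or "or"')
    if not filters:
        # No filters in place: behave like the old "list" URL.
        return list(sources)
    combine = all if mode == "and" else any

    def matches(source, field, term):
        return term.lower() in str(source.get(field, "")).lower()

    return [s for s in sources
            if combine(matches(s, f, t) for f, t in filters.items())]


# Made-up source descriptions for illustration.
EXAMPLE = [
    {"label": "genes", "organism": "Homo sapiens", "authority": "NCBI"},
    {"label": "genes", "organism": "Mus musculus", "authority": "NCBIm"},
    {"label": "domains", "organism": "", "authority": "UniProt"},
]

query = {"label": "genes", "organism": "Homo sapiens"}
print(len(find_sources(EXAMPLE)))
print(len(find_sources(EXAMPLE, query, mode="and")))
print(len(find_sources(EXAMPLE, query, mode="or")))
```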

Greetings,
Andreas

-----------------------------------------------------------------------

Andreas Prlic      Wellcome Trust Sanger Institute
                               Hinxton, Cambridge CB10 1SA, UK
			 +44 (0) 1223 49 6891



