[DAS2] das registry and das2

Andrew Dalke dalke at dalkescientific.com
Mon Nov 21 17:25:06 UTC 2005


Andreas Prlic wrote:
> Therefore the "coordinate system" or "namespace" is an important part  
> of the description of a DAS source.
>
> What I found in the current spec-draft that comes closest to this  
> issue is the different "domains"
> e.g
>
> http://server/das/genome/source/version/features
>
> so I might want to say
> http://server/das/genome/homosapiens/ncbi35/features
> http://server/das/genome/musmusculus/ncbim34/features
>
> or should it be
> http://server/das/genome/ncbi/homosapiens35/features
> http://server/das/genome/ncbi/musmusculus34/features
> ?
>
> Hm. I am not sure, but it seems that one level is missing? - either  
> organism or authority ?

The species information is available from the data source through its
'taxon' attribute, as in

   <SOURCE id="volvox" description="Volvox Example Database"
           taxon="http://www.ncbi.nlm.nih.gov/taxon-browser?id=29118"
           doc_href="http://www.wormbase.org/documentation/users_guide/volvox.html" >

It's not available from the URL naming.  The URL components are
arbitrary, in that the data provider can use any term.
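
As a rough sketch of what that means for a client: read the species from
the 'taxon' attribute instead of guessing it from the URL path.  The
helper name taxon_id is made up for illustration, the element is closed
here only so that it parses on its own, and pulling the number out of an
"id" query parameter is an assumption based on the example above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse, parse_qs

# The SOURCE element from the example, self-closed so it parses standalone.
SOURCE_XML = """<SOURCE id="volvox" description="Volvox Example Database"
        taxon="http://www.ncbi.nlm.nih.gov/taxon-browser?id=29118"
        doc_href="http://www.wormbase.org/documentation/users_guide/volvox.html" />"""


def taxon_id(source_xml):
    """Return the taxon id string from a SOURCE element, or None if absent."""
    elem = ET.fromstring(source_xml)
    taxon_url = elem.get("taxon")
    if taxon_url is None:
        return None
    # Assumption: the id is carried in an "id" query parameter, as in the example.
    ids = parse_qs(urlparse(taxon_url).query).get("id")
    return ids[0] if ids else None


print(taxon_id(SOURCE_XML))
```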

I think there's nothing to preclude a provider from putting the
actual source data one level deeper in the tree.  Personally I
find that that's over-classification.  Who would use it?

> Currently the registry provides a restricted list of allowed
> coordinate systems, to keep this under control.

Thomas:
> This is possibly an argument for avoiding the use of URLs for assembly  
> identifiers, if we can't be sure that the organisation that's the  
> authority for a given assembly will be running an authoritative DAS  
> server.  URNs would be fine, as would the kind of structured but  
> location-independent identifier that Andreas has been using.

I think there's no reason we can't use our own names for these.  Eg,
   http://www.biodas.org/coordinates/NCBI35
or a simple unique id like "NCBI35".

Right now those are treated as opaque identifiers.  There's no name
resolution going on, and the coordinates are (I assume) implicit, in
that client software doesn't resolve the name; it only checks that the
servers are returning data from the same coordinate system.
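
A minimal sketch of that check, assuming each source is described to the
client as a dictionary with a "coordinates" entry.  The dictionary
layout and the server URLs are hypothetical; the only point is that the
identifiers are compared as plain strings and never dereferenced.

```python
def same_coordinate_system(sources):
    """True if every source reports the same opaque coordinate identifier."""
    # Plain string comparison: the identifier is never resolved as a URL.
    ids = {src["coordinates"] for src in sources}
    return len(ids) == 1


# Hypothetical source descriptions, for illustration only.
sources = [
    {"server": "http://server-a/das",
     "coordinates": "http://www.biodas.org/coordinates/NCBI35"},
    {"server": "http://server-b/das",
     "coordinates": "http://www.biodas.org/coordinates/NCBI35"},
]

print(same_coordinate_system(sources))
```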

Perhaps in the future that URL might resolve to something, but there's
no current reason to do so.

In the renewal grant there is reason to compare different coordinates.
When that happens a client needs to pick one reference frame and get
the translation information to the other.  So the liftover service
needs to know about the two coordinate systems.  But it can be done
through hard-coded information (perhaps with some information that
coordinate system X is an alias for Y).  I still don't think there's
any need to resolve these URLs.
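
To make "hard-coded information" concrete, here is a sketch of what a
liftover service might keep internally.  Both tables, their contents
(including the NCBI34 identifier), and the function names are
assumptions for illustration, not anything from the spec.

```python
# Hypothetical alias table: short name -> canonical identifier.
ALIASES = {
    "NCBI35": "http://www.biodas.org/coordinates/NCBI35",
    "NCBI34": "http://www.biodas.org/coordinates/NCBI34",
}

# Hypothetical set of (from, to) pairs the service has mappings for.
LIFTOVER_PAIRS = {
    ("http://www.biodas.org/coordinates/NCBI34",
     "http://www.biodas.org/coordinates/NCBI35"),
}


def canonical(name):
    """Map an alias to its canonical identifier; unknown names pass through."""
    return ALIASES.get(name, name)


def can_translate(src, dst):
    """True if src and dst are the same system or a known liftover pair."""
    a, b = canonical(src), canonical(dst)
    return a == b or (a, b) in LIFTOVER_PAIRS


print(can_translate("NCBI34", "http://www.biodas.org/coordinates/NCBI35"))
```

Nothing here fetches the URL; the identifiers are only dictionary keys.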

Andreas:
>> Are multiple search terms allowed?
>
> yes

Then they should likely be along the same lines used for the DAS/2
searching.

>> Boolean AND or OR?
>
> We can add a parameter where this can be chosen.

The existing DAS/2 doesn't offer a choice.  It uses "OR" for
multiple fields of the same data type and "AND" across different
fields.
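
A small sketch of that rule, assuming the search terms arrive as
(field, value) pairs and each record is a flat dictionary.  The field
names and records below are invented for the example.

```python
from collections import defaultdict


def matches(record, terms):
    """OR among values given for the same field, AND across different fields."""
    wanted = defaultdict(set)
    for field, value in terms:
        wanted[field].add(value)
    # Every field must match (AND); any of its listed values will do (OR).
    return all(record.get(field) in values for field, values in wanted.items())


# Hypothetical registry records and query, for illustration only.
records = [
    {"organism": "human", "capability": "features"},
    {"organism": "mouse", "capability": "features"},
    {"organism": "human", "capability": "sequence"},
]
terms = [("organism", "human"), ("organism", "mouse"), ("capability", "features")]

hits = [r for r in records if matches(r, terms)]
print(hits)
```

With this query the first two records match and the third does not,
because its capability differs.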

					Andrew
					dalke at dalkescientific.com



