[DAS] maxbins in DAS1.6?

Jonathan Warren jw12 at sanger.ac.uk
Wed Sep 16 10:46:20 UTC 2009


Which takes us back to my very first point: it would need to be a
command URL in itself, and specified as such, otherwise how do you get
the info for a single source?
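
To make the problem concrete: with only a registry entry to hand, a client has to guess the source's root from the query_uri values. Below is a minimal Python sketch of that inference, assuming all commands share one root (the das1 style). The trimmed XML and the function name infer_source_root are illustrative only, not part of any DAS spec or library.

```python
import xml.etree.ElementTree as ET

# Trimmed-down registry entry modelled on the example quoted further down.
REGISTRY_XML = """
<SOURCES>
  <SOURCE uri="DS_109" title="uniprot aristotle">
    <VERSION uri="DS_109" created="2005-03-21T16:26:03+0000">
      <CAPABILITY type="das1:features" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/features" />
      <CAPABILITY type="das1:types" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/types" />
    </VERSION>
  </SOURCE>
</SOURCES>
"""


def infer_source_root(source):
    """Return the URL prefix shared by all query_uris, or None if they differ."""
    roots = set()
    for cap in source.iter("CAPABILITY"):
        query_uri = cap.get("query_uri")
        if query_uri:
            # Drop the trailing command name, e.g. ".../aristotle/features".
            roots.add(query_uri.rstrip("/").rsplit("/", 1)[0])
    return roots.pop() if len(roots) == 1 else None


source = ET.fromstring(REGISTRY_XML).find("SOURCE")
print(infer_source_root(source))
# -> http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle
```

If the query_uris ever point at different locations this returns None, which is exactly the case where an explicitly specified single-source URL would be needed.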

On 16 Sep 2009, at 11:28, Andy Jenkinson wrote:

> Leaving aside the issue surrounding the paradigm I mentioned and
> Thomas expanded on, why do you actually need to have a URL for the
> "server" itself? Given you already have all the metadata and command
> URLs, you can't learn anything more from it.
>
> On 16 Sep 2009, at 10:28, Jonathan Warren wrote:
>
>> I think Thomas is right in that we can't change the das1 base url  
>> principle at least for 1.6 anyway, as it is supposed to be a  
>> consolidation.
>>
>> As there have been no objections to using, for example,
>> http://www.ebi.ac.uk/das-srv/genomicdas/das/eqtl_rat_cis_fat
>> as a single source request, we can put that into 1.6. The only real
>> change would need to be in the registry. See explanation below. But  
>> we can get around that.
>>
>>> What I meant was that the root URI isn't actually used for  
>>> anything, at best it's just the location of the description you're  
>>> already reading.
>> Except for the registry sources command, where there is then no link
>> back to the server you are talking about (as you are not at the
>> server), apart from the query_uri's (example 1 below).
>>
>> das2 has "xml:base", but that then applies to all sources, so it
>> wouldn't work for the registry (see example 2 below). We could always
>> add another prop to the registry I guess ;)
>>
>>
>> example1 registry sources:
>> <SOURCES>
>>  <SOURCE uri="DS_109" title="uniprot aristotle"
>>    doc_href="http://www.ebi.ac.uk/uniprot-das/"
>>    description="This datasource (aristotle) is a legacy datasource
>> that comprises the new 'uniprot', 'ipi' and 'uniparc' datasources
>> that are available from the
>> http://www.ebi.ac.uk/das-srv/uniprot/das server. Despite being a
>> legacy dsn, there are no plans to remove this DAS datasource from
>> service.">
>>    <MAINTAINER email="rantunes at ebi.ac.uk" />
>>    <VERSION uri="DS_109" created="2005-03-21T16:26:03+0000">
>>      <COORDINATES uri="http://www.dasregistry.org/dasregistry/coordsys/CS_DS93"
>>        source="Protein Sequence" authority="UniParc"
>>        test_range="UPI00000017EA">UniParc,Protein Sequence</COORDINATES>
>>      <COORDINATES uri="http://www.dasregistry.org/dasregistry/coordsys/CS_DS35"
>>        source="Protein Sequence" authority="IPI"
>>        test_range="IPI00000021">IPI,Protein Sequence</COORDINATES>
>>      <COORDINATES uri="http://www.dasregistry.org/dasregistry/coordsys/CS_DS6"
>>        source="Protein Sequence" authority="UniProt"
>>        test_range="P00280">UniProt,Protein Sequence</COORDINATES>
>>      <CAPABILITY type="das1:stylesheet" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/stylesheet" />
>>      <CAPABILITY type="das1:features" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/features" />
>>      <CAPABILITY type="das1:types" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/types" />
>>      <CAPABILITY type="das1:sequence" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/sequence" />
>>      <CAPABILITY type="das1:entry_points" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/entry_points" />
>>      <CAPABILITY type="das1:unknown_segment" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/unknown_segment" />
>>      <CAPABILITY type="das1:error_segment" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/error_segment" />
>>      <PROP name="label" value="Predicted" />
>>      <PROP name="label" value="Manually curated" />
>>      <PROP name="label" value="ENSEMBL" />
>>      <PROP name="leaseTime" value="2009-09-15T11:00:15+0000" />
>>      <PROP name="projectHome" value="http://www.biosapiens.info" />
>>      <PROP name="projectIcon" value="http://www.dasregistry.org/ProjectIcon?id=74" />
>>      <PROP name="projectDesc" value="BioSapiens is a Network of  
>> Excellence, funded by the European Union's 6th Framework Programme,  
>> and made up of bioinformatics researchers from 25 institutions  
>> based in 14 countries throughout Europe.
>>
>> The objective of the BioSapiens is to provide a large" />
>>      <PROP name="projectName" value="BioSapiens" />
>>      <PROP name="valid" value="stylesheet" />
>>      <PROP name="valid" value="features" />
>>      <PROP name="valid" value="types" />
>>      <PROP name="valid" value="sequence" />
>>      <PROP name="valid" value="entry_points" />
>>      <PROP name="valid" value="error_segment" />
>>    </VERSION>
>>  </SOURCE>
>>
>>
>>
>>
>>
>> das2 has xml:base, but that is then for all sources so wouldn't  
>> work for the registry:
>>
>> xml:base="http://bioserver.hci.utah.edu:8080/DAS2/das2/" >
>>  <MAINTAINER email="david.nix at hci.utah.edu" />
>>  <SOURCE uri="H_sapiens" title="H_sapiens" >
>>      <VERSION uri="H_sapiens_Mar_2006" title="H_sapiens_Mar_2006"
>>        created="2008-01-03 14:39:44" >
>>           <COORDINATES uri="http://www.ncbi.nlm.nih.gov/genome/H_sapiens/B36.1/"
>>             authority="NCBI" taxid="9606" version="36" source="Chromosome" />
>>           <CAPABILITY type="segments" query_uri="H_sapiens_Mar_2006/segments" />
>>           <CAPABILITY type="types" query_uri="H_sapiens_Mar_2006/types" />
>>           <CAPABILITY type="features" query_uri="H_sapiens_Mar_2006/features" />
>>      </VERSION>
>>  </SOURCE>
>>
>> On 16 Sep 2009, at 09:25, Andy Jenkinson wrote:
>>
>>> What I meant was that the root URI isn't actually used for  
>>> anything, at best it's just the location of the description you're  
>>> already reading. That would mean that adding another field to  
>>> capture it wouldn't be of particular benefit.
>>>
>>> Whether we can easily remove the 'paradigm' of
>>> server/das/source/command without confusing people is something else!
>>>
>>> On 15 Sep 2009, at 18:11, Jonathan Warren wrote:
>>>
>>>> Andy, I wasn't suggesting we get rid of query_uri, quite the
>>>> opposite in fact. Just that the single source uri would have to
>>>> be specified with a uri, as conceptually all other commands may
>>>> not contain the root uri. This also, it seems to me, means we will
>>>> have to update das1 code to cope with multiple query uris.
>>>>
>>>> On 15 Sep 2009, at 17:56, Andy Jenkinson wrote:
>>>>
>>>>> On 15 Sep 2009, at 16:35, Jonathan Warren wrote:
>>>>>
>>>>>> I agree with Andy on both these (we talked about versioning  
>>>>>> before).
>>>>>> The version numbers really have no meaning at the moment (no
>>>>>> web pages anywhere actually explain what a different version
>>>>>> means) and don't seem to be used at all in data sources (I'm
>>>>>> guessing people end up just copying the version numbers from
>>>>>> the examples given).
>>>>>>
>>>>>> I've always had an issue with requests like this one,
>>>>>> http://www.ebi.ac.uk/das-srv/genomicdas/das/eqtl_rat_cis_fat
>>>>>> not being a valid das command, as it's the most natural request
>>>>>> for a person new to das to make. So giving it a specific
>>>>>> purpose and response is a good idea.
>>>>>>
>>>>>> My only concern is how to handle these if we start using the  
>>>>>> power of multiple query_uris per das source (inherited from
>>>>>> DAS2, which we have started to talk about, rather than the das1  
>>>>>> style where all urls have a root) as currently there is no  
>>>>>> "root" url specified in the DAS2 spec in the sources  
>>>>>> document...?? So this would have to be specified as another  
>>>>>> capability? or you could infer it from the features command,  
>>>>>> but obviously not the sources cmd!!!
>>>>>
>>>>> My take on this is that the root URI identifies the source. In a  
>>>>> conceptual sense the definition of a source is merely a  
>>>>> combination of commands acting on a common set of data. It is  
>>>>> not really important where that information comes from (a  
>>>>> registry, a server, a flat file...) because a server by itself  
>>>>> does not really mean anything. So the URI
>>>>> http://www.ebi.ac.uk/das-srv/genomicdas/das/eqtl_rat_cis_fat
>>>>> is not actually meaningful, even less so given it is not even a
>>>>> resolvable URL.
>>>>>
>>>>> The query URI system inherited from DAS/2 has the potential to  
>>>>> allow the commands to be served from different locations on the  
>>>>> web. It is not something we have needed up to now (all query  
>>>>> URIs start with the same path), and does add confusion but I can  
>>>>> see it being used for stylesheets. For example a "sequence  
>>>>> ontology stylesheet" served from a single location.
>>>>>
>>>>> But the biggest reason to have it is because of the registry.  
>>>>> The registry assigns its own root URIs for a DAS source (e.g.  
>>>>> DS_1234), which means it is necessary to provide another URI  
>>>>> used to actually query it. Since we already have a way of doing  
>>>>> it in the sources document, I don't really want to change it  
>>>>> now. It seems we might as well just embrace the extra  
>>>>> flexibility and merely describe it better.
>>>>>
>>>>>> On 15 Sep 2009, at 15:47, Andy Jenkinson wrote:
>>>>>>
>>>>>>> On 15 Sep 2009, at 15:19, Thomas Down wrote:
>>>>>>>> Capabilities are stated in the sources document:
>>>>>>>> <CAPABILITY type="das1:maxbins" />
>>>>>>>>
>>>>>>>> Ah, interesting.  I'd seen that, of course, but hadn't  
>>>>>>>> explicitly linked this with the idea of capabilities as  
>>>>>>>> listed in the X-DAS-Capabilities header (although of course  
>>>>>>>> it makes a lot more sense to have one set of capability  
>>>>>>>> metadata, rather than two!). There are a couple of issues here:
>>>>>>>>
>>>>>>>>       1. The SOURCES examples all say "das command" in the  
>>>>>>>> type attribute of the CAPABILITY element, whereas many of the  
>>>>>>>> capabilities don't actually map to commands.  I notice that  
>>>>>>>> the latest DAS1.6 draft does give an example to clarify this.
>>>>>>>>
>>>>>>>>       2. X-DAS-Capabilities entries are versioned whereas  
>>>>>>>> SOURCES capabilities aren't, which makes them look rather  
>>>>>>>> different. (and I note that the 1.6 spec is bumping up the  
>>>>>>>> version numbers on some of the existing capabilities...)
>>>>>>>>
>>>>>>>> How about versioning capabilities in SOURCES, e.g.:
>>>>>>>>
>>>>>>>>    <CAPABILITY type="features" version="1.1" query_uri="http://noranti.derkholm.net/das/mydata/features" />
>>>>>>>>    <CAPABILITY type="maxbins" version="1.0" />
>>>>>>>>
>>>>>>>> Assume any missing version attributes are "1.0" and  
>>>>>>>> everything should be backwards compatible.
>>>>>>>
>>>>>>> Indeed I did increment the version, just because it seemed the
>>>>>>> right thing to do. However, as far as I am aware these
>>>>>>> per-capability versions are totally superfluous when taken in
>>>>>>> context with the X-DAS-Version header, i.e. we do NOT want to
>>>>>>> make it possible to implement DAS 1.6 and features 1.0, for
>>>>>>> example. This could create a whole world of pain!
>>>>>>>
>>>>>>> IMO the per-capability version is unnecessary and confusing.  
>>>>>>> ProServer does use it internally, but that can be easily  
>>>>>>> changed. Getting rid of it would make the spec less confusing,  
>>>>>>> but will of course break things that depend on the current  
>>>>>>> format (if there are any).
>>>>>>>
>>>>>>> What do others think?
>>>>>>>
>>>>>>>> The only snag is that right now you have to parse all
>>>>>>>> sources. Technically both the registry and proserver allow
>>>>>>>> you to do:
>>>>>>>> http://www.ebi.ac.uk/das-srv/genomicdas/das/sources/eqtl_rat_cis_fat
>>>>>>>>
>>>>>>>> But IIRC I didn't include this in the spec to keep things  
>>>>>>>> simple.
>>>>>>>>
>>>>>>>> If this isn't specified yet, how about allowing:
>>>>>>>>
>>>>>>>>        http://www.ebi.ac.uk/das-srv/genomicdas/das/eqtl_rat_cis_fat/sources
>>>>>>>>
>>>>>>>> ?
>>>>>>>>
>>>>>>>> Then it's possible to stick with the model of passing a  
>>>>>>>> single URI around to refer to a "DAS datasource", and stick a  
>>>>>>>> command on the end of it to get the data you're after.
>>>>>>>
>>>>>>> Well, the reason we didn't use this format is simply that it  
>>>>>>> doesn't "read" well, if only because "sources" is plural. What  
>>>>>>> would perhaps make sense, and which would allow for quickly  
>>>>>>> 'pinging' a source for other similar uses, is to use this URL  
>>>>>>> format:
>>>>>>> http://www.ebi.ac.uk/das-srv/genomicdas/das/eqtl_rat_cis_fat
>>>>>>>
>>>>>>> Again, this is what seems most 'sensible' to me but I am happy  
>>>>>>> to go with the consensus.
>>>>>>
>>>>>> Jonathan Warren
>>>>>> Senior Developer and DAS coordinator
>>>>>> jw12 at sanger.ac.uk
>>>>>> Ext: 2314
>>>>>> Telephone: 01223 492314
>>>>>>
>>>>>> -- 
>>>>>> The Wellcome Trust Sanger Institute is operated by Genome
>>>>>> Research Limited, a charity registered in England with number
>>>>>> 1021457 and a company registered in England with number 2742969,
>>>>>> whose registered office is 215 Euston Road, London, NW1 2BE.
>>>>>
>>>>
>>>
>>
>
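
For illustration, Thomas's suggestion quoted above (treat a missing version attribute as "1.0") could look like this on the client side. This is only a sketch: the version attribute on CAPABILITY is his proposal rather than anything in the current spec, and capability_versions is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# The two CAPABILITY lines from the quoted proposal, minus one version
# attribute, wrapped in a VERSION element so the fragment parses on its own.
VERSION_XML = """
<VERSION uri="mydata">
  <CAPABILITY type="features" version="1.1" query_uri="http://noranti.derkholm.net/das/mydata/features" />
  <CAPABILITY type="maxbins" />
</VERSION>
"""


def capability_versions(version_element, default="1.0"):
    """Map each capability type to its version, falling back to the default."""
    return {
        cap.get("type"): cap.get("version", default)
        for cap in version_element.iter("CAPABILITY")
    }


caps = capability_versions(ET.fromstring(VERSION_XML))
print(caps)
# -> {'features': '1.1', 'maxbins': '1.0'}
```

Existing sources documents carry no version attribute at all, so under this reading they would all come out as "1.0", which is the backwards compatibility Thomas was after.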

Jonathan Warren
Senior Developer and DAS coordinator
jw12 at sanger.ac.uk
Ext: 2314
Telephone: 01223 492314







-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 


