[DAS] maxbins in DAS1.6?
Jonathan Warren
jw12 at sanger.ac.uk
Wed Sep 16 10:46:20 UTC 2009
Which takes us back to my very first point: it would need to be a
command URL in itself, and specified as such. Otherwise, how do you get
the info for a single source?
On 16 Sep 2009, at 11:28, Andy Jenkinson wrote:
> Taking aside the issue surrounding the paradigm I mentioned and
> Thomas expanded on, why do you actually need to have a URL for the
> "server" itself? Given you already have all the metadata and command
> URLs you can't learn anything more from it.
>
> On 16 Sep 2009, at 10:28, Jonathan Warren wrote:
>
>> I think Thomas is right in that we can't change the das1 base url
>> principle at least for 1.6 anyway, as it is supposed to be a
>> consolidation.
>>
>> As there have been no objections to using for example http://www.ebi.ac.uk/das-srv/genomicdas/das/eqtl_rat_cis_fat
>> as a single source request we can put that into 1.6. The only real
>> change would need to be in the registry. See explanation below. But
>> we can get around that.
>>
>>> What I meant was that the root URI isn't actually used for
>>> anything, at best it's just the location of the description you're
>>> already reading.
>> Except for the registry sources command, where there is then no link
>> back to the server you are talking about (as you are not at the
>> server) apart from the query_uris (example 1 below).
>>
>> das2 has "xml:base", but that is then for all sources so wouldn't
>> work for the registry see example 2 below. We could always add
>> another prop to the registry I guess ;)
>>
>>
>> example1 registry sources:
>> <SOURCES>
>> <SOURCE uri="DS_109" title="uniprot aristotle" doc_href="http://www.ebi.ac.uk/uniprot-das/"
>> description="This datasource (aristotle) is a legacy datasource
>> that comprises the new 'uniprot', 'ipi' and 'uniparc' datasources
>> that are available from the http://www.ebi.ac.uk/das-srv/uniprot/das
>> server. Despite being a legacy dsn, there are no plans to
>> remove this DAS datasource from service.">
>> <MAINTAINER email="rantunes at ebi.ac.uk" />
>> <VERSION uri="DS_109" created="2005-03-21T16:26:03+0000">
>> <COORDINATES uri="http://www.dasregistry.org/dasregistry/coordsys/CS_DS93"
>> source="Protein Sequence" authority="UniParc"
>> test_range="UPI00000017EA">UniParc,Protein Sequence</COORDINATES>
>> <COORDINATES uri="http://www.dasregistry.org/dasregistry/coordsys/CS_DS35"
>> source="Protein Sequence" authority="IPI"
>> test_range="IPI00000021">IPI,Protein Sequence</COORDINATES>
>> <COORDINATES uri="http://www.dasregistry.org/dasregistry/coordsys/CS_DS6"
>> source="Protein Sequence" authority="UniProt"
>> test_range="P00280">UniProt,Protein Sequence</COORDINATES>
>> <CAPABILITY type="das1:stylesheet" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/stylesheet" />
>> <CAPABILITY type="das1:features" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/features" />
>> <CAPABILITY type="das1:types" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/types" />
>> <CAPABILITY type="das1:sequence" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/sequence" />
>> <CAPABILITY type="das1:entry_points" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/entry_points" />
>> <CAPABILITY type="das1:unknown_segment" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/unknown_segment" />
>> <CAPABILITY type="das1:error_segment" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/error_segment" />
>> <PROP name="label" value="Predicted" />
>> <PROP name="label" value="Manually curated" />
>> <PROP name="label" value="ENSEMBL" />
>> <PROP name="leaseTime" value="2009-09-15T11:00:15+0000" />
>> <PROP name="projectHome" value="http://www.biosapiens.info" />
>> <PROP name="projectIcon" value="http://www.dasregistry.org/ProjectIcon?id=74" />
>> <PROP name="projectDesc" value="BioSapiens is a Network of
>> Excellence, funded by the European Union's 6th Framework Programme,
>> and made up of bioinformatics researchers from 25 institutions
>> based in 14 countries throughout Europe.
>>
>> The objective of the BioSapiens is to provide a large" />
>> <PROP name="projectName" value="BioSapiens" />
>> <PROP name="valid" value="stylesheet" />
>> <PROP name="valid" value="features" />
>> <PROP name="valid" value="types" />
>> <PROP name="valid" value="sequence" />
>> <PROP name="valid" value="entry_points" />
>> <PROP name="valid" value="error_segment" />
>> </VERSION>
>> </SOURCE>
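
The point that the query_uris are the only link back to the server can be sketched in code. This is only an illustration under stated assumptions, not part of any DAS library: the helper name infer_root is hypothetical, the XML is a trimmed copy of example 1, and it assumes the das1 convention that every query_uri is the source root followed by a command name.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of example 1 (registry sources); only two capabilities kept.
REGISTRY_XML = """
<SOURCES>
  <SOURCE uri="DS_109" title="uniprot aristotle">
    <VERSION uri="DS_109" created="2005-03-21T16:26:03+0000">
      <CAPABILITY type="das1:features" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/features" />
      <CAPABILITY type="das1:types" query_uri="http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/types" />
    </VERSION>
  </SOURCE>
</SOURCES>
"""


def infer_root(source):
    """Hypothetical helper: return the root shared by all query_uris of a
    SOURCE element, or None if there are none or they disagree."""
    roots = set()
    for cap in source.iter("CAPABILITY"):
        query_uri = cap.get("query_uri")
        if query_uri:
            # das1 assumption: the last path segment is the command name.
            roots.add(query_uri.strip().rsplit("/", 1)[0])
    return roots.pop() if len(roots) == 1 else None


source = ET.fromstring(REGISTRY_XML).find("SOURCE")
# The registry uri (DS_109) says nothing about where the server lives;
# the root only falls out of the query_uris.
print(source.get("uri"), "->", infer_root(source))
```

The inference only works while every query_uri shares one prefix. Once commands are served from different locations the helper returns None, which is why the root would then have to be stated explicitly.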
>>
>>
>>
>>
>>
>> das2 has xml:base, but that is then for all sources so wouldn't
>> work for the registry:
>>
>> xml:base="http://bioserver.hci.utah.edu:8080/DAS2/das2/" >
>> <MAINTAINER email="david.nix at hci.utah.edu" />
>> <SOURCE uri="H_sapiens" title="H_sapiens" >
>> <VERSION uri="H_sapiens_Mar_2006" title="H_sapiens_Mar_2006"
>> created="2008-01-03 14:39:44" >
>> <COORDINATES uri="http://www.ncbi.nlm.nih.gov/genome/H_sapiens/B36.1/"
>> authority="NCBI" taxid="9606" version="36" source="Chromosome" />
>> <CAPABILITY type="segments" query_uri="H_sapiens_Mar_2006/segments" />
>> <CAPABILITY type="types" query_uri="H_sapiens_Mar_2006/types" />
>> <CAPABILITY type="features" query_uri="H_sapiens_Mar_2006/features" />
>> </VERSION>
>> </SOURCE>
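
For comparison, here is a rough sketch of how a client might resolve the relative query_uris in the das2 example against xml:base. It uses urljoin from the Python standard library; the variable names are illustrative only and the values are copied from the example.

```python
from urllib.parse import urljoin

# Values copied from the das2 example.
XML_BASE = "http://bioserver.hci.utah.edu:8080/DAS2/das2/"
RELATIVE_QUERY_URIS = {
    "segments": "H_sapiens_Mar_2006/segments",
    "types": "H_sapiens_Mar_2006/types",
    "features": "H_sapiens_Mar_2006/features",
}

# Resolve each relative query_uri against the single document-wide base.
resolved = {cap: urljoin(XML_BASE, uri) for cap, uri in RELATIVE_QUERY_URIS.items()}

for capability, url in resolved.items():
    print(capability, url)
```

With the base set once at the top of the document, as here, every source resolves against the same server. A registry listing sources from many servers could not be described that way.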
>>
>> On 16 Sep 2009, at 09:25, Andy Jenkinson wrote:
>>
>>> What I meant was that the root URI isn't actually used for
>>> anything, at best it's just the location of the description you're
>>> already reading. That would mean that adding another field to
>>> capture it wouldn't be of particular benefit.
>>>
>>> Whether we can easily remove the 'paradigm' of
>>> server/das/source/command without confusing people is something else!
>>>
>>> On 15 Sep 2009, at 18:11, Jonathan Warren wrote:
>>>
>>>> Andy, I wasn't suggesting we get rid of query_uri, quite the
>>>> opposite in fact. Just that the single source uri would have to
>>>> be specified with a uri, as conceptually all other commands may
>>>> not contain the root uri. This also seems to me to mean we will
>>>> have to update das1 code to cope with multiple query uris.
>>>>
>>>> On 15 Sep 2009, at 17:56, Andy Jenkinson wrote:
>>>>
>>>>> On 15 Sep 2009, at 16:35, Jonathan Warren wrote:
>>>>>
>>>>>> I agree with Andy on both these (we talked about versioning
>>>>>> before).
>>>>>> The version numbers really have no meaning at the moment (no
>>>>>> web pages anywhere actually explain what a different version
>>>>>> means) and don't seem to be used at all in data sources (I'm
>>>>>> guessing people end up just copying the version numbers from
>>>>>> the examples given).
>>>>>>
>>>>>> I've always had an issue with the commands like this http://www.ebi.ac.uk/das-srv/genomicdas/das/eqtl_rat_cis_fat
>>>>>> not being a valid das command as it's the most natural request
>>>>>> for a person new to das to make. So giving it a specific
>>>>>> purpose and response is a good idea.
>>>>>>
>>>>>> My only concern is how to handle these if we start using the
>>>>>> power of multiple query_uris per das source (inherited from
>>>>>> DAS2, which we have started to talk about, rather than the das1
>>>>>> style where all urls have a root) as currently there is no
>>>>>> "root" url specified in the DAS2 spec in the sources
>>>>>> document...?? So this would have to be specified as another
>>>>>> capability? or you could infer it from the features command,
>>>>>> but obviously not the sources cmd!!!
>>>>>
>>>>> My take on this is that the root URI identifies the source. In a
>>>>> conceptual sense the definition of a source is merely a
>>>>> combination of commands acting on a common set of data. It is
>>>>> not really important where that information comes from (a
>>>>> registry, a server, a flat file...) because a server by itself
>>>>> does not really mean anything. So the URI http://www.ebi.ac.uk/das-srv/genomicdas/das/eqtl_rat_cis_fat
>>>>> is not actually meaningful, even less so given it is not even a
>>>>> resolvable URL.
>>>>>
>>>>> The query URI system inherited from DAS/2 has the potential to
>>>>> allow the commands to be served from different locations on the
>>>>> web. It is not something we have needed up to now (all query
>>>>> URIs start with the same path), and does add confusion but I can
>>>>> see it being used for stylesheets. For example a "sequence
>>>>> ontology stylesheet" served from a single location.
>>>>>
>>>>> But the biggest reason to have it is because of the registry.
>>>>> The registry assigns its own root URIs for a DAS source (e.g.
>>>>> DS_1234), which means it is necessary to provide another URI
>>>>> used to actually query it. Since we already have a way of doing
>>>>> it in the sources document, I don't really want to change it
>>>>> now. It seems we might as well just embrace the extra
>>>>> flexibility and merely describe it better.
>>>>>
>>>>>> On 15 Sep 2009, at 15:47, Andy Jenkinson wrote:
>>>>>>
>>>>>>> On 15 Sep 2009, at 15:19, Thomas Down wrote:
>>>>>>>> Capabilities are stated in the sources document:
>>>>>>>> <CAPABILITY type="das1:maxbins" />
>>>>>>>>
>>>>>>>> Ah, interesting. I'd seen that, of course, but hadn't
>>>>>>>> explicitly linked this with the idea of capabilities as
>>>>>>>> listed in the X-DAS-Capabilities header (although of course
>>>>>>>> it makes a lot more sense to have one set of capability
>>>>>>>> metadata, rather than two!). There are a couple of issues here:
>>>>>>>>
>>>>>>>> 1. The SOURCES examples all say "das command" in the
>>>>>>>> type attribute of the CAPABILITY element, whereas many of the
>>>>>>>> capabilities don't actually map to commands. I notice that
>>>>>>>> the latest DAS1.6 draft does give an example to clarify this.
>>>>>>>>
>>>>>>>> 2. X-DAS-Capabilities entries are versioned whereas
>>>>>>>> SOURCES capabilities aren't, which makes them look rather
>>>>>>>> different. (and I note that the 1.6 spec is bumping up the
>>>>>>>> version numbers on some of the existing capabilities...)
>>>>>>>>
>>>>>>>> How about versioning capabilities in SOURCES, e.g.:
>>>>>>>>
>>>>>>>> <CAPABILITY type="features" version="1.1" query_uri="http://noranti.derkholm.net/das/mydata/features" />
>>>>>>>> <CAPABILITY type="maxbins" version="1.0" />
>>>>>>>>
>>>>>>>> Assume any missing version attributes are "1.0" and
>>>>>>>> everything should be backwards compatible.
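
The defaulting rule in this proposal can be sketched as follows. This illustrates the suggestion only, not anything in the DAS spec: the helper name capability_versions is hypothetical, the VERSION wrapper is added just so the snippet parses, and the version attribute on maxbins is deliberately left out to show the default being applied.

```python
import xml.etree.ElementTree as ET

# The two CAPABILITY elements from the proposal, wrapped so they parse.
# maxbins has no version attribute here, to exercise the default.
SNIPPET = """
<VERSION>
  <CAPABILITY type="features" version="1.1" query_uri="http://noranti.derkholm.net/das/mydata/features" />
  <CAPABILITY type="maxbins" />
</VERSION>
"""


def capability_versions(version_element, default="1.0"):
    """Hypothetical helper: map each capability type to its version,
    treating a missing version attribute as the default."""
    return {
        cap.get("type"): cap.get("version", default)
        for cap in version_element.iter("CAPABILITY")
    }


versions = capability_versions(ET.fromstring(SNIPPET))
print(versions)
```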
>>>>>>>
>>>>>>> Indeed I did increment the version, just because it seemed the
>>>>>>> right thing to do. However as far as I am aware these
>>>>>>> per-capability versions are totally superfluous when taken in
>>>>>>> context with the X-DAS-Version header, i.e. we do NOT want to
>>>>>>> make it possible to implement DAS 1.6 and features 1.0, for
>>>>>>> example. This could create a whole world of pain!
>>>>>>>
>>>>>>> IMO the per-capability version is unnecessary and confusing.
>>>>>>> ProServer does use it internally, but that can be easily
>>>>>>> changed. Getting rid of it would make the spec less confusing,
>>>>>>> but will of course break things that depend on the current
>>>>>>> format (if there are any).
>>>>>>>
>>>>>>> What do others think?
>>>>>>>
>>>>>>>> The only snag is that right now you have to parse all
>>>>>>>> sources. Technically both the registry and proserver allow
>>>>>>>> you to do:
>>>>>>>> http://www.ebi.ac.uk/das-srv/genomicdas/das/sources/eqtl_rat_cis_fat
>>>>>>>>
>>>>>>>> But IIRC I didn't include this in the spec to keep things
>>>>>>>> simple.
>>>>>>>>
>>>>>>>> If this isn't specified yet, how about allowing:
>>>>>>>>
>>>>>>>> http://www.ebi.ac.uk/das-srv/genomicdas/das/eqtl_rat_cis_fat/sources
>>>>>>>>
>>>>>>>> ?
>>>>>>>>
>>>>>>>> Then it's possible to stick with the model of passing a
>>>>>>>> single URI around to refer to a "DAS datasource", and stick a
>>>>>>>> command on the end of it to get the data you're after.
>>>>>>>
>>>>>>> Well, the reason we didn't use this format is simply that it
>>>>>>> doesn't "read" well, if only because "sources" is plural. What
>>>>>>> would perhaps make sense, and which would allow for quickly
>>>>>>> 'pinging' a source for other similar uses, is to use this URL
>>>>>>> format:
>>>>>>> http://www.ebi.ac.uk/das-srv/genomicdas/das/eqtl_rat_cis_fat
>>>>>>>
>>>>>>> Again, this is what seems most 'sensible' to me but I am happy
>>>>>>> to go with the consensus.
>>>>>>> _______________________________________________
>>>>>>> DAS mailing list
>>>>>>> DAS at lists.open-bio.org
>>>>>>> http://lists.open-bio.org/mailman/listinfo/das
>>>>>>
>>>>>> Jonathan Warren
>>>>>> Senior Developer and DAS coordinator
>>>>>> jw12 at sanger.ac.uk
>>>>>> Ext: 2314
>>>>>> Telephone: 01223 492314
>>>>>>
>>>>>> --
>>>>>> The Wellcome Trust Sanger Institute is operated by Genome
>>>>>> Research Limited, a charity registered in England with number
>>>>>> 1021457 and a company registered in England with number 2742969,
>>>>>> whose registered office is 215 Euston Road, London, NW1 2BE.
>>>>>
>>>>
>>>
>>
>
Jonathan Warren
Senior Developer and DAS coordinator
jw12 at sanger.ac.uk
Ext: 2314
Telephone: 01223 492314
--
The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.