[DAS2] DAS2 source description

Andreas Prlic ap3 at sanger.ac.uk
Thu Dec 8 09:48:58 UTC 2005


The source description Andrew suggests already looks quite good to me.
Could we add a couple of things?

* Some people annotate on clones and scaffolds. As far as DAS is
  concerned, this is essentially the same as annotating in chromosomal
  coordinates, but the description needs a few other types of
  coordinate systems to cover it.

* A couple of sources can speak multiple "coordinate systems", so the
  <source> description should be able to deal with that.

* It would be good to have something like an "authority" field in the
  coordinate systems, i.e. the institution that defines a set of
  reference objects.

With this in mind, one could do something like:

<SOURCE
     id="myHomoSapiensAnnotation"
     description="serves annotations for human in chromosome and clone coordinates">

     <namespace
          taxon="http://www.ncbi.nlm.nih.gov/taxon-browser?id=9606"
          source_type="chromosome"
          authority="NCBI">
          <VERSION id="35" />
     </namespace>

     <namespace
          taxon="http://www.ncbi.nlm.nih.gov/taxon-browser?id=9606"
          source_type="clone"
          authority="EMBL" />

</SOURCE>


This would be the part needed to describe the actual data. It would
also be good to have some other meta information for the sources:

* which DAS commands a source understands
* a test code (per namespace) that can be used to validate responses
* some historical data such as "has been available since" and "was
  successfully validated the last time at"
* a link back to the homepage of the group that provides the source,
  for more detailed documentation about the data
* an email address to contact if there is a problem or question with
  the source
* a "nickname" for a source that a DAS client should use to label
  tracks coming from that source
* some optional properties that can be added, like "funded by ..." or
  "GO evidence code: "
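
A rough sketch of how this meta information might hang off the <SOURCE>
element. Every element and attribute name below (nickname, capability,
testcode, history, homepage, contact, property) is made up for
illustration and is not part of Andrew's proposal or of any DAS2 draft;
the values are placeholders.

<SOURCE
     id="myHomoSapiensAnnotation"
     nickname="myHuman">

     <!-- all element names, attribute names and values are hypothetical -->

     <!-- which DAS commands the source understands -->
     <capability command="features" />
     <capability command="types" />

     <!-- the test code is kept per namespace -->
     <namespace
          taxon="http://www.ncbi.nlm.nih.gov/taxon-browser?id=9606"
          source_type="chromosome"
          authority="NCBI"
          testcode="...">
          <VERSION id="35" />
     </namespace>

     <!-- historical data, homepage and contact address -->
     <history available_since="..." last_validated="..." />
     <homepage href="http://example.org/annotation/" />
     <contact email="das-admin@example.org" />

     <!-- optional free-form properties -->
     <property name="funded by" value="..." />

</SOURCE>

Keeping the optional properties as generic name/value pairs would let
providers add things like a GO evidence code without a change to the
spec, while the other fields stay fixed so that they can be validated.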



> That is, the SOURCES request returns information about genomic,
> protein sequence and structure databases.

Good, plus a couple of others. This should be a restricted list.


> If this occurs then there will need to be a few changes to the spec.
> For example, 'taxon' is probably only properly part of the genomic
> sources


Some people annotate protein sequences from a particular organism,
e.g. there is a DAS1 source that only annotates Fugu protein sequences,
so 'taxon' can be useful for protein sources as well.
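
Such a source could keep the taxon on a protein namespace. This is only
a sketch: the id, the source_type value "protein_sequence" and the
authority "UniProt" are assumptions for illustration. The taxon URL
follows the same pattern as the human example, using the NCBI taxonomy
id for Fugu (Takifugu rubripes, 31033).

<SOURCE
     id="myFuguProteinAnnotation"
     description="serves annotations for Fugu protein sequences">

     <!-- source_type and authority values are hypothetical -->
     <namespace
          taxon="http://www.ncbi.nlm.nih.gov/taxon-browser?id=31033"
          source_type="protein_sequence"
          authority="UniProt" />

</SOURCE>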


Cheers,
Andreas

-----------------------------------------------------------------------

Andreas Prlic      Wellcome Trust Sanger Institute
                               Hinxton, Cambridge CB10 1SA, UK
			 +44 (0) 1223 49 6891



