[DAS] Re: [Call to action] Retrieval of positions from feature identifiers

Lincoln Stein lstein@cshl.org
Tue, 27 Nov 2001 10:18:31 -0500


Hi Matthew,

I see this as just a hack for Omniview; completely optional, not to be
depended on.  It will give us some experience with the requirements
for this type of functionality so that we can answer the questions in
(2).
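
To make the request concrete, here is a rough Python sketch of how a
client might build the lookup URL quoted further down.  The function
name and the example DSN are made up for illustration; only the
feature_id and feature_class arguments, the ";" separator, and the
"Any" wildcard come from the proposal itself.

```python
from urllib.parse import quote


def feature_lookup_url(dsn, feature_id, feature_class="Any"):
    """Build the speculative feature-by-ID request for one data source.

    dsn is the base URL of the data source (hypothetical in this sketch).
    feature_class defaults to the proposed "Any" wildcard.
    """
    base = dsn.rstrip("/")
    # Percent-encode the values so odd identifiers cannot break the query.
    return (
        f"{base}/features?feature_id={quote(feature_id, safe='')}"
        f";feature_class={quote(feature_class, safe='')}"
    )


print(feature_lookup_url("http://example.org/das/mydsn", "ENSE0000012425", "Exon"))
```

A client that doesn't know which classes a server publishes would
simply leave feature_class at its default.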

There's obviously significant overlap between class and type.  However,
I think that it's worth making a distinction:

  - the feature type relates to an annotation, for example a
    similarity hit.  Even things that don't have IDs, like the
    zillionth ALU on chromosome 3, should have a feature type.

  - the feature class relates directly to a biological object, for
    example a named gene.  These are things that we want to have
    well known, published identifiers for.

As an aside, I'm thinking that our biological objects should be
identified using a triplet:

  (class, namespace, id)

  - The class is a small set of identifiers referring to shared data
  types.  My list would include Sequence, Protein, Map, Motif, Taxa,
  and Database.  The class is a contract that specifies the XML schema
  associated with the object.

  - The namespace qualifies the ID, making it globally unique.  For
  example, ncbi.nlm.nih.gov/entrez/protein

  - The ID is the unique identifier within the namespace.
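
As a very rough sketch of what I mean (Python, with hypothetical names
throughout; the namespace and ID values below are invented), the
triplet could be modelled as an immutable, hashable record, with the
class checked against the short list above:

```python
from dataclasses import dataclass

# The class list proposed above, treated here as a closed set (an assumption).
KNOWN_CLASSES = {"Sequence", "Protein", "Map", "Motif", "Taxa", "Database"}


@dataclass(frozen=True)
class ObjectRef:
    """A (class, namespace, id) triplet naming one biological object."""
    obj_class: str
    namespace: str
    id: str

    def __post_init__(self):
        # Reject classes outside the agreed list.
        if self.obj_class not in KNOWN_CLASSES:
            raise ValueError(f"unknown class: {self.obj_class!r}")


# Same namespace and ID, different class: two distinct objects.
a = ObjectRef("Sequence", "example.org/db", "ABC123")
b = ObjectRef("Map", "example.org/db", "ABC123")
print(a == b)
```

Because the record is frozen, it can be used directly as a dictionary
key, so two objects that share an ID but differ in class or namespace
never collide.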

Any thoughts on this?

On the subject of UDDI, I've read through the specs, but the language
that they use ("business contacts", "access points", "businessEntity",
"businessKey") is totally business oriented.  There is certainly a lot
that we can borrow from the specification, but implementing the 40
SOAP messages required by a fully compliant UDDI registry would be
overkill, and most of the messages are irrelevant to what we want to
do.

Lincoln
	

Matthew Pocock writes:
 > Hi Lincoln,
 > 
 > 1) Is this type of lookup optional? If so, how do we find out if a 
 > server supports it (in a machine-machine way)?
 > 
 > 2) How does feature_class relate to method, category and type? If it 
 > doesn't, then how do we know what feature_class to choose (how is it 
 > published)? If it is identical to one of these, can we rename it?
 > 
 > Matthew
 > 
 > ps I look forward to looking at your ideas for what should be in DAS2
 > 
 > Lincoln Stein wrote:
 > 
 > > Hi Thomas, Brian,
 > > 
 > >  >     myDsn + "/features?feature_id=ENSE0000012425"
 > > 
 > > I'm happy with this as a speculative feature if we can add a
 > > feature_class argument:
 > > 
 > >   myDsn + "/features?feature_id=ENSE0000012425;feature_class=Exon"
 > > 
 > > Otherwise namespace issues are going to kill us.  For example,
 > > Wormbase's clones, sequences, and PCR products all share an
 > > overlapping set of identifiers.
 > > 
 > > This begs the question of how you know in advance what classes are
 > > published by the server.  For now, let's let a feature class of "Any"
 > > act as a wildcard.  Attaching object classes to identifiers will help
 > > us down the road when we want to return "real" objects in response to
 > > link requests, rather than HTML pages.
 > > 
 > > If this is acceptable, I'll add it as an appendix to the DAS/1 spec,
 > > and modify the LDAS and wormbase servers appropriately.
 > > 
 > > I am working very hard on a synthesis of the various proposals for
 > > DAS/2, and will share the working document with you all in a couple of
 > > days.
 > > 
 > > Lincoln
 > > 
 > > 
 > 
 > 

-- 
========================================================================
Lincoln D. Stein                           Cold Spring Harbor Laboratory
lstein@cshl.org                                   Cold Spring Harbor, NY

NOW HIRING BIOINFORMATICS POSTDOCTORAL FELLOWS AND PROGRAMMERS. 
PLEASE WRITE FOR DETAILS.
========================================================================