[DAS] Re: Protein DAS and C. elegans

Lincoln Stein lstein at cshl.org
Sat May 24 12:33:49 EDT 2003


Hi Tony,

Fair enough.  FYI, the draft GFF3 standard that I'm working on with the
Sequence Ontology group provides two ways to link pieces of a feature together:

	1) all the pieces of a discontinuous feature like a protein domain or a 
		CDS are given the same feature ID.

	2) named subparts of a feature, such as exons belonging to a transcript,
		are grouped using the parent field (equivalent to the current group
		field).  Grouping is fully recursive.
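
A minimal sketch of the two linking styles, not taken from the draft itself. It assumes the `ID` and `Parent` attribute names used in the later published GFF3 format (the draft being discussed here may differ), and the sample rows and function names are made up for illustration:

```python
from collections import defaultdict

# Hypothetical GFF3-style rows: nine tab-separated columns, attributes last.
SAMPLE = "\n".join([
    "chr1\texample\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "chr1\texample\tmRNA\t100\t900\t.\t+\t.\tID=tx1;Parent=gene1",
    "chr1\texample\texon\t100\t300\t.\t+\t.\tID=ex1;Parent=tx1",
    "chr1\texample\texon\t600\t900\t.\t+\t.\tID=ex2;Parent=tx1",
    # One discontinuous CDS: two rows sharing the same ID.
    "chr1\texample\tCDS\t150\t300\t.\t+\t0\tID=cds1;Parent=tx1",
    "chr1\texample\tCDS\t600\t800\t.\t+\t0\tID=cds1;Parent=tx1",
])


def parse_attributes(column9):
    """Turn 'ID=x;Parent=y' into {'ID': 'x', 'Parent': 'y'}."""
    pairs = (item.split("=", 1) for item in column9.split(";") if "=" in item)
    return {key: value for key, value in pairs}


def link_features(text):
    """Return (pieces_by_id, children_by_parent) for GFF3-style text."""
    pieces_by_id = defaultdict(list)       # way 1: same ID means one feature
    children_by_parent = defaultdict(set)  # way 2: Parent groups subparts
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        attrs = parse_attributes(cols[8])
        feature_id = attrs.get("ID")
        if feature_id is None:
            continue  # this sketch ignores rows without an ID
        pieces_by_id[feature_id].append((int(cols[3]), int(cols[4])))
        for parent_id in attrs.get("Parent", "").split(","):
            if parent_id:
                children_by_parent[parent_id].add(feature_id)
    return dict(pieces_by_id), dict(children_by_parent)


def descendants(feature_id, children_by_parent):
    """Walk the Parent links recursively, depth first."""
    found = []
    for child in sorted(children_by_parent.get(feature_id, ())):
        found.append(child)
        found.extend(descendants(child, children_by_parent))
    return found


pieces, children = link_features(SAMPLE)
print(pieces["cds1"])                    # both pieces of the one CDS
print(descendants("gene1", children))    # transcript, then its subparts
```

Rows that share an ID collapse into one feature with several pieces, while the Parent links form a tree that can be walked to any depth.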

For DAS/2 I am thinking that the simplest way to handle this is to remove the 
<segment> concept completely and just serve a list of features:

	<feature id="id" display_name="name" type="type" version="version">
		<position href="url" start="start" end="end" />
		<position href="url" start="start" end="end" />
		<parent href="url" />
		<info>
			<type-specific tags/>
		</info>
	</feature>

Discontinuous features simply have multiple <position> tags.  Both grouping 
and the coordinate system itself are controlled by a <parent> href, which is 
simply a local or remote URL.  This lets people attach subfeatures to 
features referred to by other servers in a very RESTful manner.
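
To make the proposal concrete, here is a rough sketch of how a client might read such a feature list. The element and attribute names follow the example above; the enclosing `<features>` element, the example.org URLs, the sample values, and the `read_features` helper are assumptions made for illustration, not part of any DAS/2 draft:

```python
import xml.etree.ElementTree as ET

# Hypothetical response: one discontinuous CDS attached to a remote parent.
SAMPLE = """
<features>
  <feature id="cds1" display_name="example CDS" type="CDS" version="1">
    <position href="http://example.org/das/chr1" start="150" end="300" />
    <position href="http://example.org/das/chr1" start="600" end="800" />
    <parent href="http://example.org/das/feature/tx1" />
    <info />
  </feature>
</features>
"""


def read_features(xml_text):
    """Parse <feature> elements into plain dictionaries."""
    root = ET.fromstring(xml_text.strip())
    result = []
    for feature in root.iter("feature"):
        # Each <position> is one piece of the feature.
        positions = [
            (pos.get("href"), int(pos.get("start")), int(pos.get("end")))
            for pos in feature.findall("position")
        ]
        parent = feature.find("parent")
        result.append({
            "id": feature.get("id"),
            "type": feature.get("type"),
            "positions": positions,
            # The parent is just a URL, local or on another server.
            "parent": parent.get("href") if parent is not None else None,
            # More than one <position> means a discontinuous feature.
            "discontinuous": len(positions) > 1,
        })
    return result


features = read_features(SAMPLE)
for item in features:
    print(item["id"], item["positions"], item["parent"])
```

Because the parent is only a URL, the client does not need to know whether the parent feature lives on the same server or a different one.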

How does this sound to you?  To Julian?

Lincoln

On Saturday 24 May 2003 03:58 am, Tony Cox wrote:
> Lincoln,
>
> I think feature linking by ID is common because it is (a) the simplest
> thing to do from the drawing code point of view and (b) because it is not
> obvious/clear how to manipulate the <TARGET> and  <GROUP> tag contents and
> attributes when loading data into an LDAS server.
>
> Maybe it would be nice to add another data field to allow the explicit
> setting of the <GROUP> and/or <TARGET> tags - or indeed any of the other
> tags down there in the FEATURE object that may be useful like <NOTE>.
>
> > Lincoln,
> >
> > this is indeed what I am doing, and I very much welcome the move to
> > modify the spec to allow this explicitly (from what I can make out it
> > isn't disallowed anyway).
> > However I feel that I have to make absolutely clear what Tony said about
> > it because I think I have been ever so slightly misunderstood.
>
> I don't know how....
>
> > So far as
> > I can recall from what Tony said no other Protein DAS server is doing
> > this, only mine, but he said "We routinely use same IDs to 'link'
> > features into a structure.... If Ensembl encounters a list of exon
> > features all with the same ID it groups them into a transcript." I think
> > perhaps referring to non-protein DAS.
>
> Look Julian, I don't draw a distinction between "protein DAS", "non-protein
> DAS" or any other perceived 'flavour' of DAS.  It is all about serving
> annotations on a sequence. Proteins are not a special case. Grouping by
> feature ID is simply expedient whether it is to link exons in DNA or
> domains on a swissprot sequence. The main thing is that the spec is able to
> allow people to get their data out there easily and not bog them down in
> semantic quicksand.
>
> Tony
>
> > Anyway I'm sure you'll get the full story from him.
> >
> > Julian.
> >
> > On Sat, 2003-05-24 at 07:18, Lincoln Stein wrote:
> > > Hi Tony,
> > >
> > > Julian tells me that the various protein DAS servers are serving
> > > domains that are on discontinuous regions of the genome by giving
> > > each feature the same feature ID rather than linking them by using
> > > the same group.  This strategy makes sense, and is in fact what the
> > > GFF3 spec calls for, but isn't what the DAS/1 spec says.  The
> > > question is:
> > >
> > > - How widespread is this practice?  Should I modify the 1.X spec
> > > retrospectively to allow it?
> > >
> > > Lincoln
> > >
> > > --
> > > ========================================================================
> > > Lincoln D. Stein                           Cold Spring Harbor Laboratory
> > > lstein at cshl.org                         Cold Spring Harbor, NY
> > > ========================================================================
> >
> > --
> > ~~~~~~~~~~~~~~~~~~
> > Julian Gough
> > Department of Structural Biology
> > School of Medicine, Fairchild bldg D109
> > Stanford University, CA 94305-5126, U.S.A.
> > Tel: +1 650 7250754
> > Fax: +1 650 7238464

-- 
Lincoln Stein
lstein at cshl.org
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)


