[DAS2] DAS intro

Andrew Dalke dalke at dalkescientific.com
Tue Nov 29 00:16:17 UTC 2005


Ed Erwin:
> No.  The coordinate transformations are often more complicated than 
> simple offsets.  The coordinate space for features on one contig can 
> be 'backwards' with respect to a different contig, and the coordinate 
> space for a gene may skip over one or more gaps with respect to the 
> genomic sequence.

The /region entities in the DAS/2 spec are defined as

<REGION> (zero or more)
A top-level region on the genome (similar to the "entry points" of
the DAS/1 protocol).
     id – the URI of the sequence ID
     length – length of the sequence
     name (optional) – a human-readable label for use when referring
        to the region
     doc_href (optional) – a URL that gives additional information
        about this region

Here is an example:

    <REGION id="../sequence/ctg2" length="81918" name="VolvoxContig2" />
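
For reference, here is a minimal sketch of reading such an element with
Python's standard library.  The function name parse_region and the
layout of the returned dict are assumptions made for illustration; only
the four attribute names come from the spec text quoted above.

```python
import xml.etree.ElementTree as ET

# The example element from the spec excerpt above.
REGION_XML = '<REGION id="../sequence/ctg2" length="81918" name="VolvoxContig2" />'


def parse_region(xml_text):
    """Return the attributes of a single REGION element as a dict.

    Hypothetical helper: 'id' and 'length' are required, while
    'name' and 'doc_href' are optional and default to None.
    """
    elem = ET.fromstring(xml_text)
    return {
        "id": elem.attrib["id"],
        "length": int(elem.attrib["length"]),
        "name": elem.get("name"),
        "doc_href": elem.get("doc_href"),
    }


region = parse_region(REGION_XML)
```

An id, a length, and two optional labels are everything there is to
extract from the element.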

This is a very simple definition.  As far as I can tell it does not
capture the information needed for, say, skipping over gaps.

How would you represent "the coordinate space for a gene [that skips]
over one or more gaps with respect to the genomic sequence" using the
current DAS/2 object model?

Or one that runs backwards?  I don't see anything like that.
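
To make the question concrete, here is a rough sketch of the kind of
extra information such a mapping would have to carry.  Nothing in it
comes from the DAS/2 spec: the list of (start, length) blocks, the
reverse flag, and the function gene_to_genomic are all hypothetical,
and coordinates are assumed to be 0-based.

```python
def gene_to_genomic(pos, blocks, reverse=False):
    """Map a 0-based position in gene coordinates to genomic coordinates.

    Hypothetical model, not part of DAS/2:
      blocks  -- (genomic_start, length) pairs listed in gene order;
                 the space between blocks is a gap the gene skips.
      reverse -- True if the gene runs 'backwards' along the genome,
                 so each block is traversed from its high end down.
    """
    if pos < 0:
        raise IndexError("position lies outside the gene")
    offset = pos
    for start, length in blocks:
        if offset < length:
            if reverse:
                return start + length - 1 - offset
            return start + offset
        offset -= length
    raise IndexError("position lies outside the gene")


# Forward gene: two 50-base blocks with genomic positions 150..299 skipped.
forward_blocks = [(100, 50), (300, 50)]
# The same blocks for a gene on the opposite strand, listed in gene order.
reverse_blocks = [(300, 50), (100, 50)]

last_before_gap = gene_to_genomic(49, forward_blocks)
first_after_gap = gene_to_genomic(50, forward_blocks)
reverse_first = gene_to_genomic(0, reverse_blocks, reverse=True)
reverse_last = gene_to_genomic(99, reverse_blocks, reverse=True)
```

A single id plus a length, as in REGION, has nowhere to put either the
block list or the orientation.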

> Also, the term 'reference frame' bugs me a bit because 'frame' always
> makes me think of 'reading frame', which is not what you intend.

Oh, I agree.  It's a bad term.  Very very few genomics people use it,
according to Google.

There's a theory, popular on Usenet and in some wikis, that experts
rarely write down the details because, after all, they already know
the topic.  The best way to get a detailed explanation is to post
something in error and wait for the corrections.  :)

					Andrew
					dalke at dalkescientific.com




