[DAS] "inter-base" location format

Lincoln Stein lstein@cshl.org
Tue, 23 Jul 2002 10:28:16 -0700


Hi Brian,

Interbase coordinates, also known as "0-based" or "half-open interval"
coordinates, have the following form (best viewed in a monospaced font):

 0 1 2 3 4 5 6 7 8
  A B C D E F G H

Each coordinate indicates the space between two adjacent base pairs or residues.
The following representations apply:

  Range     Oligo

  [1,4]     BCD
  [1,2]     B
  [1,1]     The space between A and B
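
The table above lines up with Python slice semantics, which also count
the gaps between elements rather than the elements themselves.  A
minimal sketch, assuming the eight-residue sequence from the diagram;
the function name interbase_slice is hypothetical and not part of any
DAS or BioPerl API:

```python
def interbase_slice(seq, start, end):
    """Return the residues lying between interbase positions start and end."""
    if not 0 <= start <= end <= len(seq):
        raise ValueError("interbase range out of bounds")
    # Python slices already address the gaps between elements,
    # so no +1/-1 adjustment is needed here.
    return seq[start:end]


SEQ = "ABCDEFGH"  # the example sequence from the diagram

print(interbase_slice(SEQ, 1, 4))        # BCD
print(interbase_slice(SEQ, 1, 2))        # B
print(repr(interbase_slice(SEQ, 1, 1)))  # '' (the space between A and B)
```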

The nice feature of 0-based coordinates is that you can talk about
insertions and deletions unambiguously, and end-start == length.  You
can also refer to (-) strand features by reversing end and start,
although this probably isn't desirable.
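
To make the insertion, deletion, and length points concrete, here is a
rough sketch on a plain string.  The helper names (span_length,
insert_at, delete_range) are made up for illustration:

```python
def span_length(start, end):
    """Length of an interbase range is simply end - start."""
    return end - start


def insert_at(seq, pos, new):
    """Insert new residues at a single interbase position (a zero-length range)."""
    return seq[:pos] + new + seq[pos:]


def delete_range(seq, start, end):
    """Delete the residues between interbase positions start and end."""
    return seq[:start] + seq[end:]


SEQ = "ABCDEFGH"

print(span_length(1, 4))        # 3, the length of BCD
print(span_length(1, 1))        # 0, an insertion site has no length
print(insert_at(SEQ, 1, "xx"))  # AxxBCDEFGH, inserted between A and B
print(delete_range(SEQ, 1, 4))  # AEFGH, BCD removed
```

An insertion site is just a range whose start equals its end, so it
needs no special "before" or "after" convention.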

Code written to use interbase coordinates is notably free of the +1,
-1 adjustments that you typically find in 1-based coordinates.
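
For comparison, here is a sketch of the conversion between the two
conventions, which is where the +1/-1 adjustments end up living.  The
function names are hypothetical; this is not how any particular
library does it:

```python
def one_based_to_interbase(start, end):
    """Convert a 1-based closed range to an interbase (0-based, half-open) range."""
    if start < 1 or end < start:
        raise ValueError("invalid 1-based closed range")
    return start - 1, end


def interbase_to_one_based(start, end):
    """Convert an interbase range to a 1-based closed range.

    A zero-length interbase range (an insertion site) has no direct
    1-based closed equivalent, so it is rejected here.
    """
    if start < 0 or end <= start:
        raise ValueError("range is empty or invalid in 1-based closed form")
    return start + 1, end


def one_based_length(start, end):
    """Length in 1-based closed coordinates needs the extra +1."""
    return end - start + 1


# BCD in ABCDEFGH is bases 2..4 in 1-based closed coordinates.
print(one_based_to_interbase(2, 4))  # (1, 4)
print(interbase_to_one_based(1, 4))  # (2, 4)
print(one_based_length(2, 4))        # 3
```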

There's not much more to say than that.  BioPerl, BioJava, DAS, GenBank, EMBL 
and the biological literature all use 1-based coordinates and that produces a 
very strong legacy.  Jim Kent uses interbase coordinates in the UCSC browser.

Lincoln


On Monday 22 July 2002 02:05 pm, Brian King wrote:
> Lincoln,
>
> I'm interested in the "inter-base" method of
> specifying locations that you mentioned here and in
> the bioperl forum.  Is there any documentation on how
> it's done and its merits?  I'm giving a presentation
> on genomic XML at the I3C meeting next week.  I'll be
> talking about location representation requirements,
> since a common location format is a high priority for
> data exchange.  I'm also always looking for ways to
> improve AGAVE XML.  I appreciate any information you
> can give me.
>
> Regards,
> Brian
>
> > * Does a server have to deal in 1-based sequence coordinates to be
> > DAS1-compliant? I'm assuming yes, but it would be good to state so
> > explicitly in the spec. This is relevant to a recent thread on the
> > Bioperl list:
> >
> > http://bioperl.org/pipermail/bioperl-l/2002-July/008318.html
> >
> > Internally, we make much use of 0-based half-open intervals (or
> > "inter-base" coordinates as we call them, where the numbers refer to
> > the positions in-between the bases). It would be great if a server
> > could specify the coordinate system it uses. It would then be up to
> > the client to do the appropriate conversion if necessary. Perhaps
> > something for DAS2?
>
> You are right that 1-based closed intervals are used.  Possibly this
> is something that could be made optional in DAS2, but I would prefer
> to pick a single coordinate system convention and stick to it.
> I agree that Interbase coordinates are better than the current
> convention in Bioperl and elsewhere.
>
> Lincoln
>
> _______________________________________________
> DAS mailing list
> DAS@biodas.org
> http://biodas.org/mailman/listinfo/das