[DAS] "inter-base" location format
Lincoln Stein
lstein@cshl.org
Tue, 23 Jul 2002 10:28:16 -0700
Hi Brian,
Interbase coordinates, also known as "0-based" coordinates or "half-open
intervals", have the following form (best viewed in a monospaced font):
0 1 2 3 4 5 6 7 8
 A B C D E F G H
Each coordinate indicates a space between adjacent base pairs or residues.
The following representations apply:
Range    Oligo
[1,4]    BCD
[1,2]    B
[1,1]    The space between A and B
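
As a rough illustration of the table above, here is a small Python sketch.
Python is used only because its string slices happen to be 0-based and
half-open; the helper name oligo() is made up for this example and is not
part of any of the toolkits discussed here.

```python
# Hypothetical helper: map an interbase (0-based, half-open) range to an oligo.
SEQ = "ABCDEFGH"

def oligo(seq, start, end):
    """Return the residues lying between interbase positions start and end."""
    return seq[start:end]

print(oligo(SEQ, 1, 4))        # BCD
print(oligo(SEQ, 1, 2))        # B
print(repr(oligo(SEQ, 1, 1)))  # '' (zero-length: the space between A and B)
```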
The nice feature of 0-based coordinates is that you can talk about
insertions and deletions unambiguously, and end-start == length. You
can also refer to (-) strand features by reversing end and start,
although this probably isn't desirable.
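
A minimal sketch of the insertion/deletion point, again in Python and again
with a made-up helper name (splice): an insertion is an edit over a
zero-length range, a deletion is an edit with an empty replacement, and the
length of a range is simply end minus start.

```python
def splice(seq, start, end, replacement=""):
    """Replace the residues in interbase range [start, end) with replacement."""
    return seq[:start] + replacement + seq[end:]

seq = "ABCDEFGH"

# Insertion at [1,1]: the zero-length space between A and B.
inserted = splice(seq, 1, 1, "xx")

# Deletion of [1,4]: removes BCD.
deleted = splice(seq, 1, 4)

# end - start == length, with no +1 correction.
start, end = 1, 4
length = end - start

print(inserted)  # AxxBCDEFGH
print(deleted)   # AEFGH
print(length)    # 3
```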
Code written to use interbase coordinates is notably free of the +1,
-1 adjustments that you typically find in code written for 1-based
coordinates.
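
To make the +1/-1 bookkeeping concrete, here is a hedged sketch of converting
between 1-based closed ranges and interbase ranges. The function names are
hypothetical and this is not how any particular toolkit implements it; it only
shows where the adjustments end up.

```python
def one_based_to_interbase(start, end):
    """Convert a 1-based closed range [start, end] to an interbase range."""
    return start - 1, end

def interbase_to_one_based(start, end):
    """Convert an interbase range to a 1-based closed range [start, end]."""
    if start == end:
        # A zero-length range (an insertion point) covers no residue,
        # so a 1-based closed range cannot express it.
        raise ValueError("zero-length range has no 1-based closed equivalent")
    return start + 1, end

def length_one_based(start, end):
    return end - start + 1   # the familiar +1

def length_interbase(start, end):
    return end - start       # no adjustment needed

# BCD in ABCDEFGH is [2,4] in 1-based closed terms and [1,4] in interbase.
print(one_based_to_interbase(2, 4))  # (1, 4)
print(interbase_to_one_based(1, 4))  # (2, 4)
```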
There's not much more to say than that. BioPerl, BioJava, DAS, GenBank, EMBL
and the biological literature all use 1-based coordinates and that produces a
very strong legacy. Jim Kent uses interbase coordinates in the UCSC browser.
Lincoln
On Monday 22 July 2002 02:05 pm, Brian King wrote:
> Lincoln,
>
> I'm interested in the "inter-base" method of
> specifying locations that you mentioned here and in
> the bioperl forum. Is there any documentation on how
> it's done and its merits? I'm giving a presentation
> on genomic XML at the I3C meeting next week. I'll be
> talking about location representation requirements,
> since a common location format is a high priority for
> data exchange. I'm also always looking for ways to
> improve AGAVE XML. I appreciate any information you
> can give me.
>
> Regards,
> Brian
>
> > * Does a server have to deal in 1-based sequence coordinates to be
> > DAS1-compliant? I'm assuming yes, but it would be good to state so
> > explicitly in the spec. This is relevant to a recent thread on the
> > Bioperl list:
> >
> > http://bioperl.org/pipermail/bioperl-l/2002-July/008318.html
> >
> > Internally, we make much use of 0-based half-open intervals (or
> > "inter-base" coordinates as we call them, where the numbers refer to
> > the positions in-between the bases). It would be great if a server
> > could specify the coordinate system it uses. It would then be up to
> > the client to do the appropriate conversion if necessary. Perhaps
> > something for DAS2?
>
> You are right that 1-based closed intervals are used. Possibly this
> is something that could be made optional in DAS2, but I would prefer
> to pick a single coordinate system convention and stick to it.
> I agree that Interbase coordinates are better than the current
> convention in Bioperl and elsewhere.
>
> Lincoln