[Bioperl-l] [Gmod-schema] Circular genomes in Chado/BioPerl

Aaron Mackey ajmackey at gmail.com
Mon Sep 8 19:57:50 UTC 2008


How would you handle features that cross the origin more than once?
The modulus, though simple, seems to be only half the solution.  It
also makes it harder to place features in the genome "by eye" (I'd
have to do the modulus subtraction in my head) and complicates
sorting/filtering operations.
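
For concreteness, here is a minimal Python sketch of the modulus
arithmetic in the quoted proposal below.  It assumes 1-based, fully
closed coordinates as in GFF (interbase coordinates would need
different offsets).  The function names wrap_end and origin_crossings
and the 5000 bp genome length are made up for illustration; they are
not part of Chado, BioPerl, or the GFF3 spec.

```python
def wrap_end(end: int, parent_length: int) -> int:
    """Map a 1-based end coordinate that may exceed parent_length
    back onto the circular parent (assumption: closed 1-based coords)."""
    return (end - 1) % parent_length + 1


def origin_crossings(end: int, parent_length: int) -> int:
    """Count how many times a feature passes the origin, assuming its
    start lies within 1..parent_length and end >= start."""
    return (end - 1) // parent_length


if __name__ == "__main__":
    length = 5000  # hypothetical circular genome size
    for start, end in [(100, 200), (4900, 5200), (4900, 10200)]:
        print(start, end, "->", wrap_end(end, length),
              "crossings:", origin_crossings(end, length))
```

With a 5000 bp parent, an end of 5200 lands on position 200, which is
the subtraction that otherwise has to be done in one's head.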

Here is an alternative that you may already have considered: allow the
start/end to carry an additional "circular revolution" prefix:

a typical range tuple like: 100 200 -
is thus shorthand for: 0:100 0:200 -
(i.e. both the 100 and 200 are in the same "revolution" around the genome)

and is then distinguishable from an "around the genome + 100" feature of:
1:100 0:200 -
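
A rough Python sketch of how such prefixed coordinates might be parsed
and compared.  The helper names parse_coord and unroll are
hypothetical, as is the 5000 bp genome length, and "unrolling" a
coordinate to revolution * length + position is only one possible
reading of the prefix, not a settled semantics:

```python
from typing import Tuple


def parse_coord(text: str) -> Tuple[int, int]:
    """Parse 'rev:pos' or plain 'pos' into (revolution, position).
    A missing prefix is shorthand for revolution 0."""
    rev, sep, pos = text.rpartition(":")
    return (int(rev) if sep else 0, int(pos))


def unroll(coord: Tuple[int, int], parent_length: int) -> int:
    """Assumed interpretation: flatten (revolution, position) into a
    single coordinate that may exceed the parent length."""
    rev, pos = coord
    return rev * parent_length + pos


if __name__ == "__main__":
    length = 5000  # hypothetical circular genome size
    for text in ["100", "0:200", "1:100"]:
        coord = parse_coord(text)
        print(text, coord, unroll(coord, length))
```

Since the parsed form is a (revolution, position) tuple, sorting and
filtering can work on the tuples directly, and the position within the
genome stays readable without any subtraction.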

Just an alternative to consider (if you haven't already).  I'm not
wedded to the syntax, but I wouldn't want to see new columns in GFF
just for this.  Essentially, what you want is some form of compound
polar coordinates, it seems.

-Aaron

On Mon, Sep 8, 2008 at 2:44 PM, Jim Hu <jimhu at tamu.edu> wrote:
> In discussions with GMOD about Gbrowse, we've come up with a proposal for
> handling circular genomes and features that cross the origin in such
> genomes.  This applies to lots of prokaryotic and viral genomes, and might
> be valuable for some ways of representing terminally redundant linear
> genomes.
> 1) Keep the requirement that start < end.
> 2) Allow end > parent feature length.
> 3) Give the parent feature an is_circular boolean.
> 4) Use modular arithmetic to calculate the real position of end on the
> parent feature.
> We'd like to do this in a way that will be consistent with Chado and BioPerl
> representation of features as much as possible (realizing that there is the
> usual interbase or not coordinate issue).  What do people think?  Lincoln is
> on board for modifying the GFF3 spec.
> Thanks!
> Jim Hu
>
> =====================================
>
> Jim Hu
>
> Associate Professor
>
> Dept. of Biochemistry and Biophysics
>
> 2128 TAMU
>
> Texas A&M Univ.
>
> College Station, TX 77843-2128
>
> 979-862-4054
>
>


