[Bioperl-l] [Gmod-schema] Circular genomes in Chado/BioPerl

Jim Hu jimhu at tamu.edu
Tue Sep 9 16:05:59 UTC 2008


Hi Aaron,

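I was thinking this would be handled by making the end larger: for a
feature that crosses the origin twice, end = (parent feature length x 2)
+ end coord.  The integer part of end / parent length then gives the
number of times the feature crosses the origin.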
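
A minimal sketch of that arithmetic, assuming 1-based inclusive
coordinates as in GFF (an assumption; interbase coordinates would differ
by one). The function names and the 5000 bp parent length are
hypothetical, chosen only for illustration. With 1-based coordinates the
sketch uses (end - 1) rather than end in the division, which gives the
same answer as end / parent length except when end is an exact multiple
of the parent length.

```python
def extended_end(end_coord, parent_length, crossings):
    """Build an end coordinate for a feature that crosses the origin
    `crossings` times and stops at `end_coord` on the parent."""
    if not 1 <= end_coord <= parent_length:
        raise ValueError("end_coord must lie on the parent feature")
    return parent_length * crossings + end_coord


def origin_crossings(end, parent_length):
    """Number of times a feature ending at `end` passes the origin.

    Assumes 1-based inclusive coordinates and a start within the first
    lap (1..parent_length).
    """
    return (end - 1) // parent_length


def wrapped_end(end, parent_length):
    """Position of `end` on the parent, folded back into 1..parent_length."""
    return (end - 1) % parent_length + 1


if __name__ == "__main__":
    end = extended_end(200, 5000, 2)
    print(end, origin_crossings(end, 5000), wrapped_end(end, 5000))
```

A feature ending exactly on the last base of the parent (end equal to
the parent length) is counted as not crossing the origin here.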

Jim

On Sep 8, 2008, at 2:57 PM, Aaron Mackey wrote:

> How can you handle features that may cross the origin more than once?
> The modulus, though simple, seems to be only half the solution.  It
> also makes it difficult to place features in the genome "by eye"
> (having to do the modulus subtraction in my head), or in
> sorting/filtering operations.
>
> I have an alternative that I wondered if you considered: allow the
> start/end to have an additional "circular revolution" prefix:
>
> a typical range tuple like: 100 200 -
> is thus shorthand for: 0:100 0:200 -
> (i.e. both the 100 and 200 are in the same "revolution" around the genome)
>
> and is then distinguishable from an "around the genome + 100" feature of:
> 1:100 0:200 -
>
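
The prefix notation above could be read by a small parser along these
lines. This is only a sketch: the function names are hypothetical, the
syntax is taken from the two example tuples and nothing more, and the
conversion to a single extended coordinate (revolution x genome length +
position) is an assumption about how the prefix might map onto the
"end > parent length" scheme, not something either proposal specifies.

```python
def parse_coord(text):
    """Parse '100' or '1:100' into a (revolution, position) pair.

    A bare position is treated as revolution 0, as in the shorthand
    '100 200 -' == '0:100 0:200 -'.
    """
    revolution, sep, position = text.rpartition(":")
    return (int(revolution) if sep else 0, int(position))


def parse_range(line):
    """Parse a whitespace-separated 'start end strand' range tuple."""
    start, end, strand = line.split()
    return parse_coord(start), parse_coord(end), strand


def format_coord(coord):
    """Write a (revolution, position) pair back out in explicit form."""
    revolution, position = coord
    return f"{revolution}:{position}"


def to_extended(coord, genome_length):
    """Assumed mapping onto one integer: revolution * length + position."""
    revolution, position = coord
    return revolution * genome_length + position


if __name__ == "__main__":
    print(parse_range("100 200 -"))
    print(parse_range("1:100 0:200 -"))
```

Because each coordinate parses to a (revolution, position) pair, the
pairs sort by revolution first and position second, and the position
part can still be read off directly without any modulus.
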
> Just an alternative to consider (if you haven't already).  I'm not
> wedded to the syntax, but I wouldn't want to see new columns in GFF
> just for this.  Essentially, what you want is some form of compound
> polar coordinates, it seems.
>
> -Aaron
>
> On Mon, Sep 8, 2008 at 2:44 PM, Jim Hu <jimhu at tamu.edu> wrote:
>> In discussions with GMOD about Gbrowse, we've come up with a proposal for
>> handling circular genomes and features that cross the origin in such
>> genomes.  This applies to lots of prokaryotic and viral genomes, and might
>> be valuable for some ways of representing terminally redundant linear
>> genomes.
>> 1) Keep the requirement that start < end
>> 2) Allow end > parent feature length
>> 3) Parent feature gets an is_circular boolean
>> 4) Use modular arithmetic to calculate the real position of end on the
>>    parent feature.
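
The four points above can be sketched as follows, assuming 1-based
inclusive coordinates (GFF-style; Chado's interbase coordinates would
need the offsets adjusted). The Parent class and the segments function
are hypothetical names for illustration and are not part of Chado,
BioPerl or GFF3. The sketch rejects start > end but lets start == end
through, on the assumption that single-base features should stay legal.

```python
from dataclasses import dataclass


@dataclass
class Parent:
    length: int
    is_circular: bool = False  # point 3: flag on the parent feature


def segments(start, end, parent):
    """Split a feature into pieces that each lie within 1..parent.length.

    `end` may exceed parent.length only on a circular parent (point 2);
    the real positions are recovered with modular arithmetic (point 4).
    """
    if start > end:  # point 1
        raise ValueError("start must not exceed end")
    if not 1 <= start <= parent.length:
        raise ValueError("start must lie on the parent feature")
    if end > parent.length and not parent.is_circular:
        raise ValueError("end beyond the end of a linear parent")

    pieces = []
    pos = start
    while pos <= end:
        # Last extended coordinate of the lap that contains `pos`.
        lap_end = ((pos - 1) // parent.length + 1) * parent.length
        stop = min(end, lap_end)
        pieces.append(((pos - 1) % parent.length + 1,
                       (stop - 1) % parent.length + 1))
        pos = stop + 1
    return pieces


if __name__ == "__main__":
    plasmid = Parent(length=1000, is_circular=True)
    print(segments(900, 1100, plasmid))
```

For a hypothetical 1000 bp circular parent, a feature given as 900..1100
comes back as two pieces, 900..1000 and 1..100, which is the form a
linear renderer would need.
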
>> We'd like to do this in a way that will be consistent with Chado and BioPerl
>> representation of features as much as possible (realizing that there is the
>> usual issue of interbase versus base-oriented coordinates).  What do people
>> think?  Lincoln is on board for modifying the GFF3 spec.
>> Thanks!
>> Jim Hu
>>
>> =====================================
>> Jim Hu
>> Associate Professor
>> Dept. of Biochemistry and Biophysics
>> 2128 TAMU
>> Texas A&M Univ.
>> College Station, TX 77843-2128
>> 979-862-4054





