[Bioperl-l] [Gmod-schema] Circular genomes in Chado/BioPerl

Smithies, Russell Russell.Smithies at agresearch.co.nz
Tue Sep 9 20:46:26 UTC 2008


Excuse my ignorance (I'm not a biologist) but is it biologically possible/likely for a gene or feature to wrap more than once around a genome?
Anyone got an example?


Russell Smithies
Bioinformatics Applications Developer
Invermay Research Centre
Puddle Alley,
Mosgiel,
New Zealand

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Lincoln Stein
> Sent: Wednesday, 10 September 2008 5:53 a.m.
> To: Aaron Mackey
> Cc: GMOD Schema List; Jim Hu; Roy Welch; bioperl-l at bioperl.org; Mike Gribskov
> Subject: Re: [Bioperl-l] [Gmod-schema] Circular genomes in Chado/BioPerl
> 
> It seems to me that the proposed modulus syntax handles multiple
> revolutions. Consider a 100 bp genome (to make it simple) and a feature that
> starts at 50, goes around twice, and ends at position 60:
> 
>   start = 50
>   end  = 260
> 
> length = end - start + 1
> revolutions = int (length/genome)
> stop position = (end - 1) % genome + 1
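
A minimal Python sketch of the arithmetic in this example, assuming 1-based inclusive coordinates. The function name `circular_span` and its return shape are illustrative only and are not part of GFF3, Chado, or BioPerl. The stop position is taken as (end - 1) % genome + 1, which maps end = 260 on a 100 bp genome back to position 60.

```python
from typing import Tuple


def circular_span(start: int, end: int, genome_len: int) -> Tuple[int, int, int]:
    """Return (length, full_revolutions, stop_position) for a feature on a
    circular genome. Coordinates are assumed 1-based and inclusive, and
    end may exceed genome_len."""
    if genome_len <= 0:
        raise ValueError("genome_len must be positive")
    if not 1 <= start <= genome_len:
        raise ValueError("start must lie on the genome")
    if end < start:
        raise ValueError("end must not be smaller than start")
    length = end - start + 1                # bases covered by the feature
    revolutions = length // genome_len      # complete trips around the circle
    stop = (end - 1) % genome_len + 1       # end mapped back to 1..genome_len
    return length, revolutions, stop


print(circular_span(50, 260, 100))  # (211, 2, 60)
```

The subtraction and addition of 1 around the modulus keeps the result in the range 1..genome_len, so an end that is an exact multiple of the genome length maps to the last base, not to 0.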
> 
> Lincoln
> 
> On Mon, Sep 8, 2008 at 3:57 PM, Aaron Mackey <ajmackey at gmail.com> wrote:
> 
> > How can you handle features that may cross the origin more than once?
> > The modulus, though simple, seems to be only half the solution.  It
> > also makes it difficult to place features in the genome "by eye"
> > (having to do the modulus subtraction in my head), or in
> > sorting/filtering operations.
> >
> > I have an alternative that I wondered if you considered: allow the
> > start/end to have an additional "circular revolution" prefix:
> >
> > a typical range tuple like: 100 200 -
> > is thus shorthand for: 0:100 0:200 -
> > (i.e. both the 100 and 200 are in the same "revolution" around the genome)
> >
> > and is then distinguishable from an "around the genome + 100" feature of:
> > 1:100 0:200 -
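
As a rough illustration of the prefix idea, the Python sketch below parses a single coordinate token with an optional "revolution:" prefix and treats a bare number as revolution 0. The function name `parse_coord` is hypothetical, positions are assumed to be 1-based, and nothing here is part of the GFF3 specification.

```python
from typing import Tuple


def parse_coord(token: str) -> Tuple[int, int]:
    """Split 'rev:pos' into (revolution, position).
    A bare 'pos' is shorthand for revolution 0."""
    rev_text, sep, pos_text = token.rpartition(":")
    revolution = int(rev_text) if sep else 0   # no colon means revolution 0
    position = int(pos_text)
    if revolution < 0 or position < 1:         # assumption: 1-based positions
        raise ValueError(f"invalid coordinate: {token!r}")
    return revolution, position


print(parse_coord("100"), parse_coord("1:100"))  # (0, 100) (1, 100)
```

Keeping the revolution as a separate integer leaves the position directly readable "by eye", which is the property the prefix syntax is meant to preserve.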
> >
> > Just an alternative to consider (if you haven't already).  I'm not
> > wedded to the syntax, but I wouldn't want to see new columns in GFF
> > just for this.  Essentially, what you want is some form of compound
> > polar coordinates, it seems.
> >
> > -Aaron
> >
> > On Mon, Sep 8, 2008 at 2:44 PM, Jim Hu <jimhu at tamu.edu> wrote:
> > > In discussions with GMOD about Gbrowse, we've come up with a proposal for
> > > handling circular genomes and features that cross the origin in such
> > > genomes.  This applies to lots of prokaryotic and viral genomes, and might
> > > be valuable for some ways of representing terminally redundant linear
> > > genomes.
> > > 1) Keep the requirement that start < end
> > > 2) allow end > parent feature length
> > > 3) parent feature gets an is_circular boolean
> > > 4) use modular arithmetic to calculate the real position of end on the
> > > parent feature.
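
The four rules above can be sketched in Python as follows. The `Parent` class and `real_end` function are hypothetical names used for illustration, not Chado or BioPerl API. Coordinates are assumed 1-based and inclusive, and the sketch further assumes that rule 2 applies only when the parent is flagged as circular.

```python
from dataclasses import dataclass


@dataclass
class Parent:
    length: int
    is_circular: bool = False  # rule 3: boolean flag on the parent feature


def real_end(parent: Parent, start: int, end: int) -> int:
    """Return the position of `end` on the parent, wrapping if needed."""
    if not start < end:                        # rule 1: start < end
        raise ValueError("start must be less than end")
    if not 1 <= start <= parent.length:
        raise ValueError("start must lie on the parent feature")
    if end > parent.length:                    # rule 2: end may exceed length
        if not parent.is_circular:             # assumption: circular parents only
            raise ValueError("end exceeds the length of a linear parent")
        return (end - 1) % parent.length + 1   # rule 4: modular arithmetic
    return end


plasmid = Parent(length=100, is_circular=True)
print(real_end(plasmid, 90, 110))  # 10
```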
> > > We'd like to do this in a way that will be consistent with Chado and BioPerl
> > > representation of features as much as possible (realizing that there is the
> > > usual interbase or not coordinate issue).  What do people think?  Lincoln is
> > > on board for modifying the GFF3 spec.
> > > Thanks!
> > > Jim Hu
> > >
> > > =====================================
> > >
> > > Jim Hu
> > > Associate Professor
> > > Dept. of Biochemistry and Biophysics
> > > 2128 TAMU
> > > Texas A&M Univ.
> > > College Station, TX 77843-2128
> > > 979-862-4054
> > >
> > >
> > > _______________________________________________
> > > Gmod-schema mailing list
> > > Gmod-schema at lists.sourceforge.net
> > > https://lists.sourceforge.net/lists/listinfo/gmod-schema
> > >
> > >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> 
> 
> 
> --
> Lincoln D. Stein
> 
> Ontario Institute for Cancer Research
> 101 College St., Suite 800
> Toronto, ON, Canada M5G0A3
> 416 673-8514
> Assistant: Stacey Quinn <Stacey.Quinn at oicr.on.ca>
> 
> Cold Spring Harbor Laboratory
> 1 Bungtown Road
> Cold Spring Harbor, NY 11724 USA
> (516) 367-8380
> Assistant: Sandra Michelsen <michelse at cshl.edu>