[Bioperl-l] Problems with circular genomes

Jim Hu jimhu at TAMU.EDU
Wed Sep 14 21:31:49 EDT 2005


Most bacterial and many viral genomes are circular. Some, like T4, are
physically linear but circularly permuted, giving a circular genetic
map.

is_circular is a flag that is settable and gettable (I think) in
BioPerl's Bio::PrimarySeqI. Although Lincoln emailed me that he thinks
there is a workaround for this for GBrowse, I haven't been able to
figure out how to scroll the viewer around the circle.
Worse, I have no clue how to handle features that cross the
circle junction. It's not clear to me how that can be done within the
current GFF3 format.

It's not enough to be able to specify a circular genome; one also has to
be able to specify which way a feature goes around the circle. Note
that one can NOT assume that features will always go the short way
around: the late mRNAs for many viruses are very, very long, and a
transposon inserted in a small plasmid can be bigger than the parent
plasmid. It seems to me that either an additional column should be
used for direction (right or left relative to start), or the
constraints on start and end should be relaxed so that if start > end
and is_circular is true, the feature goes clockwise around the circle,
across the junction.
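
To make the second option concrete, here is a minimal sketch in Python of
how the relaxed start > end rule could be read. This is not BioPerl or
GBrowse code. The function names, the 1-based inclusive coordinates, and
the 5000 bp example genome are all assumptions made up for illustration.

```python
def circular_segments(start, end, genome_length, is_circular=True):
    """Split a feature into linear (start, end) pieces, 1-based inclusive.

    Assumed convention (the proposal above, not an existing standard):
    on a circular sequence, start > end means the feature runs clockwise
    from start, across the junction, and on to end.
    """
    if not (1 <= start <= genome_length and 1 <= end <= genome_length):
        raise ValueError("coordinates must lie within 1..genome_length")
    if start <= end:
        # Ordinary feature that does not cross the junction.
        return [(start, end)]
    if not is_circular:
        raise ValueError("start > end only makes sense on a circular sequence")
    # Junction-crossing feature: one piece up to the end of the sequence,
    # one piece from position 1 onward.
    return [(start, genome_length), (1, end)]


def feature_length(start, end, genome_length, is_circular=True):
    """Total number of bases covered by the feature."""
    pieces = circular_segments(start, end, genome_length, is_circular)
    return sum(e - s + 1 for s, e in pieces)


# Short feature across the junction of a hypothetical 5000 bp plasmid.
print(circular_segments(4800, 200, 5000))
# Feature that goes the long way around the same plasmid.
print(feature_length(200, 100, 5000))
```

Under this reading the direction around the circle is carried entirely by
the order of start and end, so a feature that goes the long way around
needs no extra column. The cost is that start > end can no longer be used
to mean anything else, such as strand.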

This is also a problem in the lightweight annotation file format for  
external uploads, where there are only three required columns and the  
start..end relationship is used to specify strand, not direction.
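
The clash can be shown with a small Python sketch, again purely
illustrative: both function names and the 5000 bp genome are invented,
and the strand reading is only my understanding of the upload format as
described above. The same pair of coordinates gives two very different
features depending on which meaning start > end is given.

```python
def read_as_strand(start, end):
    """Reading described above for the lightweight format: start > end
    marks the minus strand, and the feature covers the span between the
    two coordinates (1-based inclusive)."""
    low, high = min(start, end), max(start, end)
    strand = "+" if start <= end else "-"
    return strand, high - low + 1


def read_as_wrap(start, end, genome_length):
    """Hypothetical circular reading: start > end means the feature
    crosses the junction, running from start to the end of the sequence
    and then from position 1 to end."""
    if start <= end:
        return end - start + 1
    return (genome_length - start + 1) + end


# Same coordinates, hypothetical 5000 bp circular genome.
print(read_as_strand(4800, 200))      # minus-strand feature spanning most of the genome
print(read_as_wrap(4800, 200, 5000))  # short feature across the junction
```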

I suspect that this has been addressed before, but I wasn't able to
figure it out from the archives or the docs... but then I'm not a
programming wizard by a long shot. I did find this message in the Bio-SQL
archive (via Google) from Hilmar Lapp:

http://open-bio.org/pipermail/biosql-l/2005-June/000859.html

But I didn't see a resolution.
=====================================
Jim Hu
Associate Professor and Associate Head for Graduate Programs
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054



