[Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap start and stop coordinates??

Chris Fields cjfields at uiuc.edu
Mon May 21 20:56:50 UTC 2007


On May 21, 2007, at 3:35 PM, Chris Fields wrote:

> On May 16, 2007, at 2:11 PM, Mark Johnson wrote:
>
>> On 5/8/07, Chris Fields <cjfields at uiuc.edu> wrote:
>>> I believe all seqfeature location coordinates are designed to have
>>> start < stop for consistency; in cases where the strand matters (CDS,
>>> gene, etc.) then the strand is set to 1 or -1.  When start > stop,
>>> the two are reversed and the strand is flipped; at least that's the
>>> way locations are set up in BioPerl.
>>>
>>> chris
>>
>>     Oh yeah?  I always tend to ensure that (start < stop), regardless
>> of strand, when working with sequence features...the other day, I
>> caught Glimmer2 emitting a prediction on the plus strand with start >
>> stop.  I was going to work up a patch for the parser, but I wonder,
>> should I just force everything to start < stop?  Or only predictions
>> on the plus strand?  Should all the parsers for all the ab initio
>> predictors ensure they emit features with coordinates like this?
>
> Odd that it would predict a start > stop on the plus strand, though
> it may be corrected in Glimmer3.  Does the same prediction show up in
> Glimmer3?
>
> chris

... and I see that it does (per your bug report).  The next thing to
ask is how often these odd Glimmer hits occur and whether others have
seen the same thing.  Maybe there's an explanation (a bug, etc.), but I
can't immediately think of anything that makes sense, unless it's
running the reverse of the + strand as a control for some reason.
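
For reference, the convention quoted above (keep start <= stop; if start >
stop, swap the two and flip the strand) can be sketched roughly as below.
This is only an illustration in Python, not BioPerl's actual code; the
function name normalize_location is made up, and strand values of 1, -1
and 0 (unstranded) are an assumption.  Whether a parser should apply this
blindly to the odd Glimmer hits is exactly the open question here.

```python
def normalize_location(start, end, strand=1):
    """Return (start, end, strand) with start <= end.

    Hypothetical helper illustrating the convention described in this
    thread: when start > end, the coordinates are swapped and the
    strand is flipped. Strand is assumed to be 1, -1 or 0 (unstranded).
    """
    if strand not in (-1, 0, 1):
        raise ValueError("strand must be -1, 0 or 1")
    if start > end:
        # Swap the coordinates and flip the strand (0 stays 0).
        start, end = end, start
        strand = -strand
    return start, end, strand


if __name__ == "__main__":
    # A plus-strand prediction reported with start > stop.
    print(normalize_location(500, 100, 1))
    # An already well-formed minus-strand feature is left unchanged.
    print(normalize_location(100, 500, -1))
```

Note that under this rule a plus-strand hit with start > stop ends up on
the minus strand, which may or may not be what Glimmer meant.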

chris


