[Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap start and stop coordinates??
johnsonm at gmail.com
Mon May 21 20:48:52 UTC 2007
Check the test data for Glimmer2 and Glimmer3. Both predict one large
gene in frame +1 that, I'd guess, covers most of the sequence. That's
probably a bogus prediction, but that's not up to the parser to
decide. I hadn't noticed it until recently.
I sent a patch via Bugzilla that swaps the coordinates when
start > end and strand > 0.
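The rule is small enough to sketch. This is not the patch itself, just a minimal illustration in Python of the condition described above; the function name normalize_plus_strand and the bare (start, end, strand) tuple are made up for the example and are not part of BioPerl or the Glimmer parser.

```python
def normalize_plus_strand(start, end, strand):
    """Swap start and end only for plus-strand features with start > end.

    Hypothetical helper for illustration; the strand is never changed,
    and features on the minus strand (or with no strand) pass through as-is.
    """
    if strand > 0 and start > end:
        start, end = end, start
    return start, end, strand


# A plus-strand prediction emitted with start > end gets its coordinates swapped.
fixed = normalize_plus_strand(900, 100, 1)

# Already-ordered and minus-strand features are returned untouched.
unchanged = normalize_plus_strand(100, 900, 1)
minus = normalize_plus_strand(900, 100, -1)
```

Note that this only reorders the coordinates and leaves the strand alone, which is different from the convention Chris describes below, where a reversed pair also flips the strand.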
On 5/21/07, Chris Fields <cjfields at uiuc.edu> wrote:
> On May 16, 2007, at 2:11 PM, Mark Johnson wrote:
> > On 5/8/07, Chris Fields <cjfields at uiuc.edu> wrote:
> >> I believe all seqfeature location coordinates are designed to have
> >> start < stop for consistency; in cases where the strand matters (CDS,
> >> gene, etc.) then the strand is set to 1 or -1. When start > stop,
> >> the two are reversed and the strand is flipped; at least that's the
> >> way locations are set up in BioPerl.
> >> chris
> > Oh yeah? I always tend to ensure that (start < stop), regardless
> > of strand, when working with sequence features...the other day, I
> > caught Glimmer2 emitting a prediction on the plus strand with start >
> > stop. I was going to work up a patch for the parser, but I wonder,
> > should I just force everything to start < stop? Or only predictions
> > on the plus strand? Should all the parsers for all the ab initio
> > predictors ensure they emit features with coordinates like this?
> Odd that it would predict a start > stop on the plus strand, though
> it may be corrected in Glimmer3. Does the same prediction show up in