[Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap start and stop coordinates??

Mark Johnson johnsonm at gmail.com
Tue May 22 18:04:31 UTC 2007


Yes, Glimmer3 outputs the length of the input sequence.  I don't
believe Glimmer2 does.

> The most complete file format to parse seems to be the details file;
> it contains the sequence length:
>
>  >BCTDNA
> Sequence length = 29940

> Since the parser currently only parses predict files, you could
> optionally supply the parser with the seq length and emit a warning
> if seqfeatures requiring it are produced, such as the sporadic ones
> which wrap around.  If one were using the bioperl-run module this
> could be automated a bit by passing the seq length in to the parser
> object by adding the seq length to the constructor argument list.

I think we can spot wrap-around genes easily enough without knowing
the length of the input sequence.  Having it just means we can perform
a sanity check or two, such as making sure 'wraparound' genes are
within N bases of the end of the input sequence.  Any suggestions on a
good default value for N?
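
Here is a minimal sketch of both checks in plain Perl. It assumes that a forward-strand prediction normally has start < end and a reverse-strand one has start > end, so that the opposite ordering marks a gene crossing the origin. The sub names are made up for illustration. "Within N bases of the end" is read here as "the larger coordinate lies within N bases of the sequence end", which is only one possible reading. N is a required argument, since no default has been settled.

```perl
use strict;
use warnings;

# Assumption: forward-strand genes normally have start < end and
# reverse-strand genes start > end; the opposite ordering is taken to
# mean the gene wraps around the origin.  No sequence length needed.
sub is_wraparound {
    my ($start, $end, $strand) = @_;
    my $wraps = ($strand > 0 && $start > $end)
             || ($strand < 0 && $start < $end);
    return $wraps ? 1 : 0;
}

# Optional sanity check, only possible when the sequence length is known:
# the larger coordinate of a wrap-around gene should lie on the sequence
# and within $n bases of its end.
sub wraparound_is_plausible {
    my ($start, $end, $seq_len, $n) = @_;
    my $high = $start > $end ? $start : $end;
    return ($high <= $seq_len && $seq_len - $high <= $n) ? 1 : 0;
}
```

With the 29940-base sequence quoted above, a +1 prediction running from 29900 to 300 would be flagged by is_wraparound() and would pass wraparound_is_plausible() for any N of 40 or more.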

Parsing both output files for Glimmer3 will be a little tricky.  The
constructor for Bio::Tools::Glimmer calls $class->SUPER::new(@args),
which hits the constructor for Bio::Tools::AnalysisResult, which does
the same thing.  It all ends up in Bio::Root::IO::_initialize_io,
which grabs the -file arg and opens it.  So, either let Bio::Root::IO
handle -file and have Bio::Tools::Glimmer handle, say, -detail file, or
have Bio::Tools::Glimmer just implement its own _initialize_io() and
hope that will fly.
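
The first option could look roughly like the sketch below. To keep it self-contained it does not load BioPerl: My::MockIO is a made-up stand-in for the Bio::Root::IO behaviour described above (it consumes -file and opens it), and My::MockGlimmer, the -detail argument name and seq_length() are all hypothetical. The subclass removes its own argument before calling the parent constructor, then opens the detail file itself.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Root::IO: takes -file and opens it.
package My::MockIO;

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    $self->_initialize_io(%args);
    return $self;
}

sub _initialize_io {
    my ($self, %args) = @_;
    my $file = $args{-file};
    if (defined $file) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        $self->{_fh} = $fh;
    }
    return 1;
}

# Hypothetical stand-in for Bio::Tools::Glimmer.
package My::MockGlimmer;
our @ISA = ('My::MockIO');

sub new {
    my ($class, %args) = @_;
    # Take out the argument the parent does not know about ...
    my $detail = delete $args{-detail};
    # ... and let the parent handle -file as usual.
    my $self = $class->SUPER::new(%args);
    if (defined $detail) {
        open my $dfh, '<', $detail or die "Cannot open $detail: $!";
        $self->{_detail_fh} = $dfh;
    }
    return $self;
}

# Read the "Sequence length = N" line from the detail file, if one was given.
sub seq_length {
    my ($self) = @_;
    my $fh = $self->{_detail_fh} or return;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^\s*Sequence length\s*=\s*(\d+)/;
    }
    return;
}

package main;
```

A caller would then write something like My::MockGlimmer->new(-file => $predict, -detail => $detail) and ask the object for seq_length() before running the wrap-around sanity checks.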


