[Bioperl-l] Using bioperl to convert gene predictions to gff

Torsten Seemann torsten.seemann at infotech.monash.edu.au
Fri May 12 04:29:37 UTC 2006


Mark,

> I'd like to reformat gene predictions from several different programs
> (genscan, glimmerhmm, fgenesh) to gff format. I know bioperl can parse the
> output from these and other predictors and that it can export into GFF. But
> I'm not clear on how to string the two together.
> Can anyone point me at any example code?

The parser modules for gene predictions generally let you iterate 
through the predicted genes. Each prediction is usually returned as a 
Bio::SeqFeatureI-derived object, and those objects have a gff_string() 
method that returns the feature formatted as a GFF line.

So something as simple as this *may* work:

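use strict;
use warnings;
use Bio::Tools::Glimmer;

# Parse the Glimmer output and print one GFF line per predicted gene
my $parser = Bio::Tools::Glimmer->new(-file => 'glimmer.out');
while (my $gene = $parser->next_prediction) {
   # gff_string() does not append a newline, so add one
   print $gene->gff_string, "\n";
}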

If you want separate GFF lines for each exon, you'll have to do another 
loop over $gene->exons(). Luckily, each exon is also a Bio::SeqFeatureI 
object, so it has gff_string() too.
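
A minimal sketch of that nested loop follows. It assumes the parser 
returns gene objects that provide exons() (as Bio::Tools::Prediction::Gene 
does); prokaryotic Glimmer output may come back as plain features without 
exons, hence the can() check. The file name 'glimmer.out' is just a 
placeholder.

```perl
use strict;
use warnings;
use Bio::Tools::Glimmer;

my $parser = Bio::Tools::Glimmer->new(-file => 'glimmer.out');

while (my $gene = $parser->next_prediction) {
    # One GFF line for the gene itself
    print $gene->gff_string, "\n";

    # Skip the inner loop if this prediction type has no exons() method
    next unless $gene->can('exons');

    # One GFF line per exon; each exon is a SeqFeatureI object as well
    for my $exon ($gene->exons) {
        print $exon->gff_string, "\n";
    }
}
```

Redirect STDOUT to a file to keep the result, e.g. 
perl glimmer2gff.pl > glimmer.gff (the script name is made up).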

Or if you want to modify some of the GFF columns first, e.g. the source 
tag, just call $gene->source_tag('mynewtag') before printing it.
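
The same pattern should carry over to the other predictors you mention. 
Here is an untested sketch for genscan using Bio::Tools::Genscan, with the 
source tag changed before printing. The input file name 'genscan.out' and 
the tag 'mynewtag' are placeholders, and note that setting the tag on the 
gene does not change the tags of its exons, so those are set separately.

```perl
use strict;
use warnings;
use Bio::Tools::Genscan;

my $parser = Bio::Tools::Genscan->new(-file => 'genscan.out');

while (my $gene = $parser->next_prediction) {
    # Overwrite the source column (column 2 of the GFF line)
    $gene->source_tag('mynewtag');
    print $gene->gff_string, "\n";

    # Exons keep their own source tag unless it is set on each one
    for my $exon ($gene->exons) {
        $exon->source_tag('mynewtag');
        print $exon->gff_string, "\n";
    }
}

$parser->close;
```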

Hope this helps,

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia



