[Bioperl-l] Bio::Tools::Fgenesh bug? and fix?

Christopher Dwan chris at dwan.org
Tue Jul 11 02:06:41 UTC 2006


I'm not surprised that there are parts that don't work right. I copied
genscan.pm and made the absolute minimal changes required to get what
I needed working. I haven't touched it since.

Please feel free to do what needs to be done, and sorry about the mess.

-Chris Dwan

On Jul 10, 2006, at 8:25 PM, Cook, Malcolm wrote:

> I am finding the Bio::Tools::Fgenesh parser to incorrectly handle the
> feature coordinates on - strand predictions.
>
> In particular, start & end are deliberately reversed if the strand is
> '-'.
>
> I guess this was a holdover from Genscan.pm and wasn't really tested
> !?!?!
>
> Or, perhaps fgenesh v 2.4, which I am running, has different output in
> this respect compared to version 2.0, against which this module was
> written.
>
> Or, perhaps my understanding is blotto (known to happen).
>
> Does anyone know for sure?
>
> If I comment out selected lines...
>
> #	    if($predobj->strand() == 1) {
> 		$predobj->start($start);
> 		$predobj->end($end);
> #	    } else {
> #		$predobj->end($start);
> #		$predobj->start($end);
> #	    }
>
> ... then GFF produced by my naive fgenesh2gff script below is correct
> (at least w.r.t. strand and coordinates - GFF compatibility purists
> might wince).
>
> Should I commit this change to head?
>
>
> Malcolm Cook
> Database Applications Manager, Bioinformatics
> Stowers Institute for Medical Research
>
>
>
> #!/usr/bin/env perl
>
> # fgenesh2gff
> # PURPOSE: parse fgenesh output into gff
> # USAGE: fgenesh fish somefish.dna | fgenesh2gff > somefish.dna.fgenesh.gff
>
> use strict;
> use warnings;
> use Bio::Tools::Fgenesh;
> use Bio::Tools::GFF;    # provides the GFF writer used below
>
> # Remaining options should name files to process, but if none, process
> # standard input:
> @ARGV = ('-') unless @ARGV;
> my $fgenesh = Bio::Tools::Fgenesh->new(-fh => \*ARGV);
>
> # No -fh or -file is given, so features are written to standard output.
> my $featureout = Bio::Tools::GFF->new(
>     -gff_version => 2,    # whatever ;)
> );
> my $IDNUM = 0;
> while (my $gene = $fgenesh->next_prediction()) {
>   my $ID =  "fgenesh" . ++ $IDNUM;
>   $gene->add_tag_value('ID', $ID);
>   $featureout->write_feature($gene);
>   foreach ($gene->exons()) {
>     $_->add_tag_value('Parent', $ID);
>     $_->seq_id($gene->seq_id);
>     $featureout->write_feature($_);
>   }
> }
> $fgenesh->close();
>
> exit 0;
>
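
For reference, the convention at issue: GFF requires start <= end on both
strands, with orientation carried only by the strand field, and BioPerl
ranges follow the same rule. If fgenesh already reports minus-strand
features low-to-high (as the observation above suggests for v 2.4),
swapping them yields start > end. The sketch below is only an illustration
of that rule; ordered_coords is a hypothetical helper, not part of BioPerl
or of Bio::Tools::Fgenesh, and the coordinates are made up.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): return the coordinates
# ordered so that start <= end, leaving the strand value untouched.
sub ordered_coords {
    my ($start, $end, $strand) = @_;
    ($start, $end) = ($end, $start) if $start > $end;
    return ($start, $end, $strand);
}

# Made-up minus-strand exon, already reported low-to-high:
# the coordinates pass through unchanged and only the strand says '-'.
my @exon = ordered_coords(1200, 1350, -1);
printf "start=%d end=%d strand=%d\n", @exon;
```

With the values above this prints "start=1200 end=1350 strand=-1". Whether
the parser should pass coordinates through or reorder them depends on what
each fgenesh version actually emits, which the thread leaves open.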



