[Bioperl-l] methods, etc. for Bio::SearchIO on exonerate output

Gathman, Allen agathman at semo.edu
Fri Apr 15 15:48:06 EDT 2005


Hi all -- 

 

I'm using exonerate to align ESTs with a set of genomic contigs, and I'm
trying to figure out the best way to pull information out of the output.  I
wrote a little test script to see what Bio::SearchIO would get me:

use Bio::SearchIO;
open (OUT, ">$ARGV[1]");

my $searchobj=new Bio::SearchIO ( -format => 'exonerate',
                                  -file => $ARGV[0]);

while (my $result=$searchobj->next_result() ) {
     print OUT "query: " . $result->query_name(). "\n";
     my @params=$result->available_parameters; 
     print OUT "params:@params\n";
     my @stats=$result->available_statistics; 
     print OUT "stats:@stats\n";
     while (my $hit=$result->next_hit() ) {
          print OUT "hitstart: " . $hit->start('hit') . "\n";
          while (my $hsp=$hit->next_hsp() ) {
               print OUT "hspsstart: " . $hsp->start('hit') . "\n";
          } # end hsp
     } # end hit
} # end result
close OUT;
 

There aren't any parameters or stats returned.  $hit->start works fine, and
$hsp->start works, but the hsps are the individual matches; if there's a
one-nucleotide gap, that separates two hsps, just as a real intron would.  

 

It appears that there should be some methods or arguments applicable here
beyond those for the generic Hit and HSP objects, but I don't know what they
are.  I've gone through the documentation for Bio::SearchIO::exonerate, but
I don't see what I'm looking for.  For instance, the VULGAR line in the
exonerate output distinguishes between introns and gaps - is there some way
to pull them out separately in Bio::SearchIO?  

 

In short, I want to be able to ignore small gaps and define start and end
points of exons, marking 5' and 3' splice junctions.  I'd appreciate any
help on how to get at these. 

 

Thanks --

 

Allen

 

Allen Gathman

http://cstl-csm.semo.edu/gathman

 



More information about the Bioperl-l mailing list