[Bioperl-l] methods, etc. for Bio::SearchIO on exonerate output
Gathman, Allen
agathman at semo.edu
Fri Apr 15 15:48:06 EDT 2005
Hi all --
I'm using exonerate to align ESTs with a set of genomic contigs, and I'm
trying to figure out the best way to pull information out of the output. I
wrote a little test script to see what Bio::SearchIO would get me:
    use strict;
    use warnings;
    use Bio::SearchIO;

    # Arguments: <exonerate output file> <report file to write>
    open(my $out, '>', $ARGV[1]) or die "Cannot write $ARGV[1]: $!";
    my $searchobj = Bio::SearchIO->new(-format => 'exonerate',
                                       -file   => $ARGV[0]);
    while (my $result = $searchobj->next_result()) {
        print $out "query: " . $result->query_name() . "\n";
        my @params = $result->available_parameters;
        print $out "params:@params\n";
        my @stats = $result->available_statistics;
        print $out "stats:@stats\n";
        while (my $hit = $result->next_hit()) {
            print $out "hitstart: " . $hit->start('hit') . "\n";
            while (my $hsp = $hit->next_hsp()) {
                print $out "hspstart: " . $hsp->start('hit') . "\n";
            } # end hsp
        } # end hit
    } # end result
    close $out;
No parameters or statistics are returned. $hit->start works fine, and
$hsp->start works too, but the HSPs are the individual matches, so a
one-nucleotide gap separates two HSPs just as a real intron would.
It appears that there should be some methods or arguments applicable here
beyond those for the generic Hit and HSP objects, but I don't know what they
are. I've gone through the documentation for Bio::SearchIO::exonerate, but
I don't see what I'm looking for. For instance, the VULGAR line in the
exonerate output distinguishes between introns and gaps - is there some way
to pull them out separately in Bio::SearchIO?
In short, I want to be able to ignore small gaps and define start and end
points of exons, marking 5' and 3' splice junctions. I'd appreciate any
help on how to get at these.
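Failing a Bio::SearchIO method for this, one fallback would be to parse the VULGAR line by hand. What follows is only a minimal sketch, not anything taken from Bio::SearchIO: the sub name exons_from_vulgar and the sample line (IDs, coordinates and score) are made up for illustration. It assumes the usual vulgar layout of nine header fields followed by (label, query length, target length) triplets, and that introns are marked by the labels 5, I and 3. Every other label, including G for gaps, is kept inside the current exon, so small gaps do not split it. Coordinates are left in exonerate's own in-between numbering, exactly as they appear on the line.

```perl
use strict;
use warnings;

# Hypothetical helper: split one exonerate "vulgar:" line into exons on the
# target (genomic) sequence. Returns a list of { start, end } hashrefs.
sub exons_from_vulgar {
    my ($line) = @_;
    my @f = split ' ', $line;
    shift @f if @f && $f[0] eq 'vulgar:';
    my ($qid, $qstart, $qend, $qstrand,
        $tid, $tstart, $tend, $tstrand, $score, @ops) = @f;
    die "malformed vulgar line\n" if !defined $score || @ops % 3;

    # Target coordinates run downwards when the target strand is '-'.
    my $step = $tstrand eq '-' ? -1 : 1;
    my $tpos = $tstart;
    my (@exons, $exon_start);
    while (my ($label, $qlen, $tlen) = splice @ops, 0, 3) {
        if ($label =~ /^[53I]$/) {
            # Splice site or intron body: close the exon that was open.
            if (defined $exon_start) {
                push @exons, { start => $exon_start, end => $tpos };
                undef $exon_start;
            }
        }
        else {
            # M, G and any other label: part of an exon (assumption).
            $exon_start = $tpos unless defined $exon_start;
        }
        $tpos += $step * $tlen;
    }
    push @exons, { start => $exon_start, end => $tpos } if defined $exon_start;
    return @exons;
}

# Made-up example: two exons, the first containing a one-nucleotide gap.
my $vulgar = 'vulgar: est1 0 50 + contig1 100 251 + 230 '
           . 'M 20 20 G 0 1 M 10 10 5 0 2 I 0 96 3 0 2 M 20 20';
my @exons = exons_from_vulgar($vulgar);
my $n = 0;
printf "exon %d: %d..%d\n", ++$n, $_->{start}, $_->{end} for @exons;
# prints:
# exon 1: 100..131
# exon 2: 231..251
```

With these assumptions, each exon boundary that borders an intron is a splice junction. In the made-up example, where the intron runs in the forward orientation (5 ... I ... 3), the 5' splice site is at 131 and the 3' splice site is at 231.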
Thanks --
Allen
Allen Gathman
http://cstl-csm.semo.edu/gathman