[Bioperl-l] help using SEARCH IO to parse blast results
Dave Messina
David.Messina at sbc.su.se
Tue Nov 27 17:55:44 UTC 2007
Hi Alison,
As Sendu mentioned, the key bit is adding a condition to the hit loop to
limit the number of hits that are printed. I didn't test the below
extensively, but give it a try...
Dave
#!/usr/local/bin/perl -w
# Parsing BLAST reports with BioPerl's Bio::SearchIO module
# alison waller November 2007
use strict;
use warnings;
use Bio::SearchIO;
my $usage = "to run type: blast_parse_aw.pl <blast report> <# of hits>\n";
if (@ARGV != 2) { die $usage; }
my $infile = $ARGV[0];
my $outfile = $infile . '.parsed';
my $tophit = $ARGV[1]; # to specify in the command line how many hits
# to report for each query
#open( IN, $infile ) || die "Can't open inputfile $infile! $!\n";
open( OUT, ">$outfile" ) || die "Can't open outputfile $outfile! $!\n";
my $report = new Bio::SearchIO(
-file => "$infile",
-format => "blast"
);
print OUT
"Query\tHitDesc\tHitSignif\tHitBits\tEvalue\t%id\tAlignLen\tNumIdent\tgaps\t
Qstrand\tHstrand\n";
# Go through BLAST reports one by one
while ( my $result = $report->next_result ) {
my $i = 0;
while ( ( $i++ < $tophit ) && (my $hit = $result->next_hit) ) {
while ( my $hsp = $hit->next_hsp ) {
# Print some tab-delimited data about this hit
print OUT $result->query_name, "\t";
print OUT $hit->name, "\t";
print OUT $hit->significance, "\t";
print OUT $hit->bits, "\t";
print OUT $hsp->evalue, "\t";
print OUT $hsp->percent_identity, "\t";
print OUT $hsp->length('total'), "\t";
print OUT $hsp->num_identical, "\t";
print OUT $hsp->gaps, "\t";
print OUT $hsp->strand('query'), "\t";
print OUT $hsp->strand('hit'), "\n";
}
}
if ($i == 0) { print OUT "no hits\n"; }
}
More information about the Bioperl-l
mailing list