[Bioperl-l] help using SEARCH IO to parse blast results

Dave Messina David.Messina at sbc.su.se
Tue Nov 27 17:55:44 UTC 2007


Hi Alison,
As Sendu mentioned, the key bit is adding a condition to the hit loop to
limit the number of hits that are printed. I didn't test the below
extensively, but give it a try...


Dave



#!/usr/local/bin/perl -w

# Parsing BLAST reports with BioPerl's Bio::SearchIO module
# alison waller November 2007

use strict;
use warnings;
use Bio::SearchIO;

my $usage = "to run type: blast_parse_aw.pl <blast report> <# of hits>\n";
if (@ARGV != 2) { die $usage; }

my $infile  = $ARGV[0];
my $outfile = $infile . '.parsed';
my $tophit  = $ARGV[1]; # to specify in the command line how many hits
                        # to report for each query

#open(  IN, $infile )     || die "Can't open inputfile $infile! $!\n";
open( OUT, ">$outfile" ) || die "Can't open outputfile $outfile! $!\n";

my $report = new Bio::SearchIO(
    -file   => "$infile",
    -format => "blast"
);

print OUT
  "Query\tHitDesc\tHitSignif\tHitBits\tEvalue\t%id\tAlignLen\tNumIdent\tgaps\t
Qstrand\tHstrand\n";

# Go through BLAST reports one by one
while ( my $result = $report->next_result ) {
    my $i = 0;
    while (  ( $i++ < $tophit ) && (my $hit = $result->next_hit) ) {
        while ( my $hsp = $hit->next_hsp ) {

            # Print some tab-delimited data about this hit
            print OUT $result->query_name,     "\t";
            print OUT $hit->name,              "\t";
            print OUT $hit->significance,      "\t";
            print OUT $hit->bits,              "\t";
            print OUT $hsp->evalue,            "\t";
            print OUT $hsp->percent_identity,  "\t";
            print OUT $hsp->length('total'),   "\t";
            print OUT $hsp->num_identical,     "\t";
            print OUT $hsp->gaps,              "\t";
            print OUT $hsp->strand('query'),   "\t";
            print OUT $hsp->strand('hit'),     "\n";
        }
    }

    if ($i == 0) { print OUT "no hits\n"; }
}



More information about the Bioperl-l mailing list