[Bioperl-l] help using SEARCH IO to parse blast results
alison waller
alison.waller at utoronto.ca
Mon Nov 26 21:06:35 UTC 2007
Hello all,
It's the usual story, I'm an engineer turned biologist who now needs help
with bioinformatics so I can analyze huge amounts of data to finish my
thesis.
I am trying to write a script that will parse large blast files (usually
blastx) I also want to be able to specify how many hits I want to report
information on.
Most of the time I will only want information on the top hit, but I want to
have the flexibility to obtain information on say the top5. I am pretty
sure I have done this wrong, any advice on how to correct my script to do
this, would be great.
Thanks so much,
Alison
#!/usr/local/bin/perl -w
# Parsing BLAST reports with BioPerl's Bio::SearchIO module
# alison waller November 2007
use strict;
use warnings;
use Bio::SearchIO;
# to run type: blast_parse_aw.pl input.txt #of hits
my $infile =shift(@ARGV);
my $outfile ="$ARGV[0].parsed";
my $tophit = $ARGV[1]; # I want to specify in the command line how many hits
to report for each query
open (IN,"$ARGV[0]") || die "Can't open inputfile $ARGV[0]! $!\n";
open (OUT,">$outfile");
$report = new Bio::SearchIO(
-file=>"$inFile",
-format => "blast");
print
"Query\tHitDesc\tHitSignif\tHitBits\tEvalue\t%id\tAlignLen\tNumIdent\tgaps\t
Qstrand\tHstrand\n";
# Go through BLAST reports one by one
while($result = $report->next_result)
{
if ($top_hit=$result->next_hit) # this might be wrong - I want to
specify how many hits to print results for
# Print some tab-delimited data about this hit
{
print $result->query_name, "\t";
print $hit->description, "\t";
print $hit->significance, "\t";
print $hit->bits,"\t";
print $hsp->evalue, "\t";
print $hsp->percent_identity, "\t";
print $hsp->length('total'),"\t";
print $hsp->num_identical,"\t";
print $hsp->gaps,"\t";
print $hsp->strand('query'),"\t";
print $hsp->strand('hit'), "\n";
}
else print "no hits\n";
}
******************************************
Alison S. Waller M.A.Sc.
Doctoral Candidate
awaller at chem-eng.utoronto.ca
416-978-4222 (lab)
Department of Chemical Engineering
Wallberg Building
200 College st.
Toronto, ON
M5S 3E5
More information about the Bioperl-l
mailing list