[Bioperl-l] parsing blast to GFF
Charles Hauser
chauser@duke.edu
21 Nov 2002 17:13:51 -0500
All,
Trying to learn how to use bioperl and at the same time parse blast data
to GFF.
I can't quite get my head around how to output ONLY the features I want.
Modifying the script Jason posted a couple of days ago (see below) I
generate the following output:
20021010.8247.2 BLASTX_sc similarity 825 1145 . + . Target "Protein:NR_SC:PIR-S55962"
Questions:
(1) I can't seem to recover the frame,If I uncomment the line:
-add_tag_value => {'Frame', $hsp->frame()}
I get the error:
Can't locate object method "report_type" via package
"Bio::Search::HSP::GenericHSP" (perhaps you forgot to load
"Bio::Search::HSP::GenericHSP"?) at
/usr/lib/perl5/site_perl/5.6.1/Bio/Search/HSP/GenericHSP.pm line 633,
<FH> line 134.
(2) none of the 'add_tag_value' features are printing?
(3) not clear what the decimal in: ' . + . ' represent. The '+' is
reading frame, I believe.
regards,
Charles
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SearchIO;
use Bio::Tools::GFF;
use Math::BigFloat;
my $filename = shift;
if( $filename =~ /\.gz/ ) {
open(FH, "zcat $filename |") || die("cannot run zcat on $filename:
$!");
} else {
open(FH, $filename) || die("cannot open $filename: $!");
}
my $in = new Bio::SearchIO(-fh => \*FH,
-format => 'blast');
my $out = new Bio::Tools::GFF(-file => ">nr_blastx.gff");
my $evalue_cutoff = Math::BigFloat->new('1e-15');
while( my $r = $in->next_result ) {
while( my $hit = $r->next_hit ) {
while( my $hsp = $hit->next_hsp ) {
my $evalue = $hsp->evalue;
# deal with time when NCBI blast doesn't report 1e-200 but e-200
$evalue =~ s/^e/1e/;
next if Math::BigFloat->new($evalue) > $evalue_cutoff;
my $feature = new Bio::SeqFeature::Generic(
-seqname => $r->query_name(),
-source => 'BLASTX_sc',
-primary => 'similarity',
-start => $hsp->query->start,
-end => $hsp->query->end,
-strand => $hsp->query->strand,
-tag => {'Target', sprintf("Protein:%s",$hit->accession || $hit->name)},
-add_tag_value => {'Target', sprintf("%.2f",$hsp->percent_identity)},
-add_tag_value => {'Target', $hsp->hit->start},
-add_tag_value => {'Target', $hsp->hit->end},
# -add_tag_value => {'Frame', $hsp->frame()},
-add_tag_value => {'note', $hit->description}
);
$out->write_feature($feature);
}
}
}