[Bioperl-l] Parse BLAST to find mismatches
Ken Graham
kjgraham@ucdavis.edu
Wed, 29 May 2002 15:03:05 -0700
Thanks for the help Jason . That does what I wanted.
I'm still learning Perl and BioPerl so I tried to use the "seq_inds" method
of the high scoring pair to develop a better feel for BioPerl objects.
Below is a code snippet that is giving me problems. I get the following error:
Can't locate object method "seq_inds" via package
"Bio::Search::HSP::GenericHSP"
(perhaps you forgot to load "Bio::Search::HSP::GenericHSP"?) at
blast_parser_sm line 16,
<GEN1> line 2661.
I tried this code on another computer and got the same results.
Am I forgetting something?
Thanks in advance,
#!/usr/bin/perl -w
use Bio::SearchIO;
$blastfile ='/home/ken/blastFile';
$searchio = new Bio::SearchIO ('-format' => 'blast',
'-file' => $blastfile );
$result = $searchio->next_result;
$result->database_name;
while( $hit = $result->next_hit()) {
$hit_name = $hit->name ;
while ($hsp = $hit->next_hsp()) {
$hsp_percent=$hsp->percent_identity;
$hsp_len=$hsp->length();
$hsp_start = $hsp->start();
@h_ind = $hsp->seq_inds('hit', 'conserved', 1);
if ($hsp_percent > 85 and $hsp_len > 25) {
print "Hit Name is $hit_name\n";
print "Length is $hsp_len\n";
print "Starting at $hsp_start\n";
print "Matches at @h_ind\n";
print "Percent equals $hsp_percent\n";
}
}
}
> >> I'm stuck on what might be a simple problem. Using a BLAST report I
> want to
> >> find the position, on the query sequence, of each mismatch in a HSP. I can
> >> read in the BLAST report, and get properties on the HSP (such as start,
> >> percent similarity etc..) but I don't know where to find any info
> about the
> >> mismatches. I tried to use the SimpleAlign object but I just get errors.
>
> >You can call $hsp->homology_seq to determine where the mismatches are.
> >
> >The mismatches will be space so you can use the index method to locate
> >where the mismatches are. The following might get you what you want.
> >
> >my $in = new Bio::SearchIO(-file => 'BLASTFILE',
> > -format => 'blast');
> >
> >while( my $result = $in->next_result ) {
> > while( my $hit = $result->next_hit ) {
> > while( my $hsp = $hit->next_hsp ) {
> > my $hseq = $hsp->homology_string;
> > $last = 0;
> > my @a;
> > while( ($last = index($hseq, ' ', $last)) > 0 ) {
> > # last is the location of the mismatch
> > # NOTE: $last is index in string coordinates
> > # but Bio::Seq objects start with 1 not 0.
> > # What you'll probably want to do is:
> > print "base is ", substr($hsp->query_string, $last,1),
> > " index is $last\n";
> > $last++;
> > }
> > print $hit->accession, " ", join(",",@a), "\n" if(@a);
> > }
> > }
> > last;
> >}