[Bioperl-l] BLAST Parsing
Paul Boutros
pcboutro@engmail.uwaterloo.ca
Fri, 20 Sep 2002 14:16:11 -0400 (EDT)
Hi all,
Another potential bug in BLAST parsing (SearchIO\blast.pm).
My setup:
BioPerl 1.02
Perl 5.6.1 (ActiveState)
WinXP SP1
The parser doesn't seem to be recognizing one of the lines in my blast
output file. The error is:
------------- EXCEPTION -------------
MSG: no data for midline Lambda K H
STACK Bio::SearchIO::blast::next_result C:/Perl/site/lib/Bio/SearchIO/blast.pm:567
STACK toplevel parseb~1.pl:55
--------------------------------------
The offending part of the blast output file looks like this:
=========================
Sbjct: 564 cctggg 569
Lambda K H
1.37 0.711 1.31
Gapped
Lambda K H
1.37 0.711 1.31
Matrix: blastn matrix:1 -3
==========================
BLAST parameters were:
-p blastn
-d est_others
-e 0.001
-v 10
-b 10
-l Rn_GI
Minimal code is:
use Bio::SearchIO;
my $infile = $ARGV[0];
my $searchio = new Bio::SearchIO(
'-format' => 'blast',
'-file' => $infile,
);
while (my $result = $searchio->next_result()) { }
The offending part of the blast.pm file looks like this:
if( /^((Query|Sbjct):\s+(\d+)\s*)(\S+)\s+(\d+)/ ) {
$data{$2} = $4;
$len = length($1);
$self->{"\_$2"}->{'begin'} = $3 unless $self->{"_$2"}->{'be
$self->{"\_$2"}->{'end'} = $5;
} else {
$self->throw("no data for midline $_")
unless (defined $_ && defined $len);
$data{'Mid'} = substr($_,$len);
}
Removing the $self->throw and replacing the unless with:
if (defined $_ && defined $len) {
$data{'Mid'} = substr($_,$len);
}
seems to parse correctly, but at the cost of an awful lot of warnings.
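One possible way to quiet the warnings, sketched here under an assumption the post does not confirm: that they come from substr being called on lines shorter than $len (Perl warns "substr outside of string" in that case). The helper name midline_after is hypothetical and is not part of blast.pm; it only illustrates the length check.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of blast.pm): return the text that follows
# the first $len columns of a line, or undef when there is nothing to slice.
sub midline_after {
    my ($line, $len) = @_;

    # Same guard as the modified blast.pm code: both values must be defined.
    return unless defined $line && defined $len;

    # Assumption: the warnings come from lines shorter than $len, where
    # substr() would warn and return undef. Skip those lines explicitly.
    return if $len > length $line;

    return substr($line, $len);
}
```

In the patched branch this would be used as something like $data{'Mid'} = midline_after($_, $len), with the caller deciding what to do when the result is undef. If the warnings turn out to have a different source, this check will not help.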
I can preparse out the
Lambda K H
lines, but I'm not sure which one should be removed, or if I will also
need to remove the blank lines.
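A rough sketch of the preparse idea, for discussion only. It assumes that every "Lambda K H" header, the line of three values after it, and a "Gapped" label directly before a header can all be dropped. Whether the parser is happy with a report that lacks these lines is untested, and blank lines are left alone. The name strip_lambda_blocks is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: remove each "Lambda K H" header, the values line
# that follows it, and a "Gapped" label that directly precedes a header.
sub strip_lambda_blocks {
    my @in = @_;
    my @out;
    my $i = 0;
    while ($i < @in) {
        my $line = $in[$i];
        if ($line =~ /^\s*Lambda\s+K\s+H\s*$/) {
            # Drop a "Gapped" label that introduced this block.
            pop @out if @out && $out[-1] =~ /^\s*Gapped\s*$/;
            $i++;    # skip the header itself
            # Skip the following line only if it holds three numeric values.
            $i++ if $i < @in
                && $in[$i] =~ /^\s*[\d.eE+-]+(?:\s+[\d.eE+-]+){2}\s*$/;
            next;
        }
        push @out, $line;
        $i++;
    }
    return @out;
}

# Optional command-line use: perl strip.pl in.blast out.blast
if (@ARGV == 2) {
    my ($src, $dst) = @ARGV;
    open my $in_fh,  '<', $src or die "Cannot read $src: $!";
    open my $out_fh, '>', $dst or die "Cannot write $dst: $!";
    print {$out_fh} strip_lambda_blocks(<$in_fh>);
    close $out_fh or die "Cannot close $dst: $!";
}
```

The cleaned file would then be passed to Bio::SearchIO in place of the original. Both "Lambda K H" blocks are removed here because it is not clear which one trips the parser; keeping one of them would be an easy variation to try.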
Any ideas/comments/criticism welcome.
Paul