[Bioperl-l] RemoteBlast problem
Thomas J Keller
kellert at ohsu.edu
Fri Oct 28 15:55:02 EDT 2005
Greetings,
I'm using perl 5.8.6 and the fink installation of bioperl 1.4.5 on an
Apple G5 running OS X 10.4.2
running my script with a simple fasta dna sequence file:
$ bp_remote_blast2.pl -p blastx -d nr -i test.fa
Here's the error:
------------- EXCEPTION -------------
MSG: no data for midline Query 78
HRRPSFSACRCVLSASSVFPSRLGNNYITAAGAQVLAEGLRGNTSLQFLG 227
STACK Bio::SearchIO::blast::next_result /sw/lib/perl5/5.8.6/Bio/
SearchIO/blast.pm:1151
STACK toplevel /Users/kellert/Sandbox/Perlscripts/bp_remote_blast2.pl:90
It looks like Bio::SearchIO::blast
is choking on the result from Bio::Tools::Run::RemoteBlast
I lifted this from Jason's exampleL bp_remote_blast but modified it
to use SearchIO method instead of BPLite:
use warnings;
use strict;
use vars qw($USAGE);
use Bio::Tools::Run::RemoteBlast;
use Bio::SeqIO;
use Bio::SearchIO;
use Getopt::Long;
$USAGE = "remote_blast.pl [-h] [-p prog] [-d db] [-e expect] [-f
seqformat] -i seqfile\n";
my ($prog, $db, $expect );
my ($sequencefile,$sequenceformat,$help) = (undef, 'fasta',undef);
&GetOptions('prog|p=s' => \$prog,
'db|d=s' => \$db,
'expect|e=s' => \$expect,
'input|i=s' => \$sequencefile,
'format|f=s' => \$sequenceformat,
'help|h' => \$help,
);
if( $help ) {
exec('perldoc', $0);
die;
}
if( !defined $prog ) {
die($USAGE . "\n\tMust specify a valid program name ([t]blast
[pxn])\n");
}
if( !defined $db ) {
die($USAGE . "\n\tMust specify a db (e.g. \'nr\') to search\n");
}
if( !defined $sequencefile ) {
die($USAGE . "\n\tMust specify a path to a sequence file.\n");
}
my $factory = new Bio::Tools::Run::RemoteBlast ('-prog' => $prog,
'-data' => $db,
'-expect' => $expect,
'readmethod' => 'SearchIO', #use
SearchIO to parse
);
# submit_blast can only currenly handle fasta format files so I'll
# preprocess outside of the module but I'd rather be sure here
my $input;
if( $sequenceformat !~ /fasta/ ) {
my @seqs;
my $seqio = new Bio::SeqIO('-format' => $sequenceformat,
'-file' => $sequencefile );
while( my $seq = $seqio->next_seq() ) {
push @seqs, $seq;
}
$input = \@seqs;
} else {
$input = $sequencefile;
}
my $r = $factory->submit_blast($input);
my $v = 1;
## set to 0 to turn off Sanity check messages
print STDERR "retrieving blasts for $input ...\n" if( $v > 0);
while ( my @rids = $factory->each_rid ) {
foreach my $rid ( @rids ) {
my $rc = $factory->retrieve_blast($rid);
if( !ref($rc) ) {
if( $rc < 0 ) {
$factory->remove_rid($rid);
}
print STDERR "." if ( $v > 0 );
sleep 5;
} else {
my $result = $rc->next_result();
next unless ($result);
print "\nQuery Name: ", $result->query_name(), "\n";
#save the output
my $filename = $result->query_name()."\.out";
print STDERR "Saving result to $filename.\n" if( $v
> 0);
$factory->save_output($filename);
$factory->remove_rid($rid);
print "\nQuery Name: ", $result->query_name(), "\n";
while ( my $hit = $result->next_hit ) {
next unless ( $v > 0);
print "\thit name is ", $hit->name, "\n";
while( my $hsp = $hit->next_hsp ) {
print "\t\tscore is ", $hsp-
>score, "\n";
}
}
}
}
}
I'm guessing either NCBI has changed it blast format and bioperl 1.4
no longer works, or I'm missing something that should be obvious.
Help much apprecieated.
Tom K
Thomas J. Keller, Ph.D.
Director, MMI Core Facility
Oregon Health & Science University
3181 SW Sam Jackson Park Rd.
Portland, OR, USA, 97239
http://www.ohsu.edu/research/core
More information about the Bioperl-l
mailing list