[Bioperl-l] RemoteBlast problem

Thomas J Keller kellert at ohsu.edu
Fri Oct 28 15:55:02 EDT 2005


Greetings,
I'm using perl 5.8.6 and the fink installation of bioperl 1.4.5 on an  
Apple G5 running OS X 10.4.2
running my script with a simple fasta dna sequence file:
$ bp_remote_blast2.pl -p blastx -d nr -i test.fa

Here's the error:

------------- EXCEPTION  -------------
MSG: no data for midline Query  78    
HRRPSFSACRCVLSASSVFPSRLGNNYITAAGAQVLAEGLRGNTSLQFLG  227
STACK Bio::SearchIO::blast::next_result /sw/lib/perl5/5.8.6/Bio/ 
SearchIO/blast.pm:1151
STACK toplevel /Users/kellert/Sandbox/Perlscripts/bp_remote_blast2.pl:90


It looks like Bio::SearchIO::blast
is choking on the result from Bio::Tools::Run::RemoteBlast

I lifted this from Jason's exampleL bp_remote_blast but modified it  
to use SearchIO method instead of BPLite:

use warnings;
use strict;
use vars qw($USAGE);

use Bio::Tools::Run::RemoteBlast;
use Bio::SeqIO;
use Bio::SearchIO;
use Getopt::Long;

$USAGE = "remote_blast.pl [-h] [-p prog] [-d db] [-e expect]  [-f  
seqformat] -i seqfile\n";

my ($prog, $db, $expect );

my ($sequencefile,$sequenceformat,$help) = (undef, 'fasta',undef);

&GetOptions('prog|p=s'               => \$prog,
         'db|d=s'                 => \$db,
         'expect|e=s'             => \$expect,
         'input|i=s'              => \$sequencefile,
         'format|f=s'             => \$sequenceformat,
         'help|h'                 => \$help,
         );

if( $help ) {
     exec('perldoc', $0);
     die;
}

if( !defined $prog ) {
     die($USAGE . "\n\tMust specify a valid program name ([t]blast 
[pxn])\n");
}
if( !defined $db ) {
     die($USAGE . "\n\tMust specify a db (e.g. \'nr\') to search\n");
}
if( !defined $sequencefile ) {
     die($USAGE . "\n\tMust specify a path to a sequence file.\n");
}

my $factory = new Bio::Tools::Run::RemoteBlast ('-prog'      => $prog,
                              '-data'      => $db,
                              '-expect'    => $expect,
                              'readmethod' => 'SearchIO',        #use  
SearchIO to parse
                              );

# submit_blast can only currenly handle fasta format files so I'll
# preprocess outside of the module but I'd rather be sure here

my $input;
if( $sequenceformat !~ /fasta/ ) {
     my @seqs;
     my $seqio = new Bio::SeqIO('-format' => $sequenceformat,
                    '-file'   => $sequencefile );
     while( my $seq = $seqio->next_seq() ) {
     push @seqs, $seq;
     }
     $input = \@seqs;
} else {
     $input = $sequencefile;
}
my $r = $factory->submit_blast($input);

my $v = 1;
## set to 0 to turn off Sanity check messages
print STDERR  "retrieving blasts for $input ...\n" if( $v > 0);

while ( my @rids = $factory->each_rid ) {
     foreach my $rid ( @rids ) {
         my $rc = $factory->retrieve_blast($rid);
         if( !ref($rc) ) {
             if( $rc < 0 ) {
                 $factory->remove_rid($rid);
             }
             print STDERR "." if ( $v > 0 );
             sleep 5;
             } else {
                 my $result = $rc->next_result();
                 next unless ($result);
                  print "\nQuery Name: ", $result->query_name(), "\n";
                 #save the output
                 my $filename = $result->query_name()."\.out";
                 print STDERR  "Saving result to $filename.\n" if( $v  
 > 0);
                 $factory->save_output($filename);
                 $factory->remove_rid($rid);
                  print "\nQuery Name: ", $result->query_name(), "\n";
                 while ( my $hit = $result->next_hit ) {
                     next unless ( $v > 0);
                             print "\thit name is ", $hit->name, "\n";
                             while( my $hsp = $hit->next_hsp ) {
                                   print "\t\tscore is ", $hsp- 
 >score, "\n";
                               }
                         }
                   }
             }
           }

I'm guessing either NCBI has changed it blast format and bioperl 1.4  
no longer works, or I'm missing something that should be obvious.

Help much apprecieated.

Tom K


Thomas J. Keller, Ph.D.
Director, MMI Core Facility
Oregon Health & Science University
3181 SW Sam Jackson Park Rd.
Portland, OR, USA,   97239

http://www.ohsu.edu/research/core



More information about the Bioperl-l mailing list