[Bioperl-l] Trouble using RemoteBlast.pm

Chris Fields cjfields at uiuc.edu
Wed Jan 18 16:17:49 EST 2006


I have had the same problem with a script I wrote; it worked until about 4
days ago.  Luckily, I had saved copies of some of my old searches in a temp
folder, so I can compare them with the new output.

I noticed that if I just save the output using:

$factory->save_output('temp.out');

it works (just like Barry's script), but if I have the following in a loop
(as in the RemoteBlast POD), it fails:

while ( my @rids = $factory->each_rid ) {
	foreach my $rid ( @rids ) {
		my $rc = $factory->retrieve_blast($rid);	
		# if RID is not present
		if( !ref($rc) ) {
			# remove if RID is bad (error)
			if( $rc < 0 ) {
				$factory->remove_rid($rid);
			}
			print STDERR "." if ( $v > 0 ); 
			sleep 5; 
		} else { # RID is returned
			my $result = $rc->next_result();
			# save the output
			my $filename = $result->query_name()."\.blastp";
			$factory->save_output($filename);
			# remove RID from list
			$factory->remove_rid($rid);
			...



When I change the following:

my $filename = $result->query_name()."\.blastp";

to 

my $filename = "temp.blastp";

and comment out the 'my $result = $rc->next_result()' line, it works again,
so the problem may lie in SearchIO.
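
A minimal sketch of one way to keep the raw report regardless of what the
parser does: save the output under a name that does not depend on the parsed
result, and only then call next_result().  The helper name save_then_parse
and the "<RID>.blastp" naming scheme are hypothetical; $factory is assumed to
be the Bio::Tools::Run::RemoteBlast object and $rc the object returned by
retrieve_blast($rid).  This does not fix the parser; it only decouples saving
from parsing.

```perl
use strict;
use warnings;

# Save the raw report for one RID first, then attempt to parse it.
# Assumptions: $factory provides save_output() and $rc provides
# next_result(), as in the loop shown above.
sub save_then_parse {
    my ( $factory, $rid, $rc ) = @_;

    # Hypothetical naming scheme: the RID is known before any parsing.
    my $filename = "$rid.blastp";
    $factory->save_output($filename);

    # Parse afterwards; if the parser dies, the saved report is still there.
    my $result = eval { $rc->next_result };
    warn "Could not parse report for RID $rid: $@" if $@;

    return ( $filename, $result );
}
```

Inside the loop this would replace the next_result/save_output pair, e.g.
my ( $file, $result ) = save_then_parse( $factory, $rid, $rc );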

The only difference I noticed is that older output has this:
_______________________________________________________________________

BLASTP 2.2.12 [Aug-07-2005]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman 
(1997), "Gapped BLAST and PSI-BLAST: a new generation of 
protein database search programs", Nucleic Acids Res. 25:3389-3402.

RID: 1131470802-26518-118666159798.BLASTQ3


Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples 
           3,023,944 sequences; 1,040,428,944 total letters
Query=  NP_249094 transcriptional regulator PyrR [Pseudomonas aeruginosa
PAO1].
          (170 letters)
....

_______________________________________________________________________

And new output has this:
_______________________________________________________________________
BLASTP 2.2.13 [Nov-27-2005]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman 
(1997), "Gapped BLAST and PSI-BLAST: a new generation of 
protein database search programs", Nucleic Acids Res. 25:3389-3402.

RID: 1137614458-7828-16730336973.BLASTQ4


Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
           3,228,386 sequences; 1,108,137,318 total letters
Query=  NP_249094 pyrimidine regulatory protein PyrR [Pseudomonas aeruginosa
PAO1].
Length=170
....
_______________________________________________________________________


The line giving the query length has changed from "(170 letters)" to
"Length=170".  Is this enough to break SearchIO::Blast?
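
To illustrate the difference, here is a small sketch of a check that accepts
both forms of the length line.  The function name query_length is made up for
this example, and this is not the code SearchIO uses; it only shows that a
pattern written for the old "(170 letters)" form will not match the new
"Length=170" form unless it is extended.

```perl
use strict;
use warnings;

# Return the query length from either header style, or undef if the line
# matches neither.  Illustrative only; not taken from Bio::SearchIO.
sub query_length {
    my ($line) = @_;

    # Old style (BLASTP 2.2.12): "          (170 letters)"
    if ( $line =~ /^\s*\(([\d,]+)\s+letters\)/ ) {
        ( my $n = $1 ) =~ tr/,//d;    # drop thousands separators
        return $n;
    }

    # New style (BLASTP 2.2.13): "Length=170"
    if ( $line =~ /^\s*Length\s*=\s*([\d,]+)/ ) {
        ( my $n = $1 ) =~ tr/,//d;
        return $n;
    }

    return;
}

for my $line ( '          (170 letters)', 'Length=170' ) {
    my $len = query_length($line);
    print defined $len ? "$len\n" : "no match\n";
}
```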

I think Jason is right; NCBI may have changed the text output format, and the
change is now breaking the BLAST parser:

http://portal.open-bio.org/pipermail/bioperl-l/2005-November/020067.html

I may try switching over to XML output to see what happens.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org [mailto:bioperl-l-
> bounces at portal.open-bio.org] On Behalf Of Keith Boroevich
> Sent: Wednesday, January 18, 2006 11:55 AM
> To: kaboroev at sfu.ca
> Cc: bioperl-l at portal.open-bio.org
> Subject: RE: [Bioperl-l] Trouble using RemoteBlast.pm
> 
> I'm not sure if this is related, but in the last 3 days my remote BLAST
> scripts have stopped working.  I have not modified the code in any way.
> retrieve_blast() returns successfully, and next_result() does return a
> "Bio::Search::Result::BlastResult=HASH(0x15ad8d0)" object but takes a
> long time to do so.  However, next_hit returns undef.  I'm not really
> sure how to approach this problem.  Until 3 days ago the scripts
> worked perfectly, returning a list of hits with their accessions and
> significance.
> 
> Keith
> 
> 
> On Tue, 2006-01-17 at 11:34 -0700, Barry Moore wrote:
> > Nagesh-
> >
> > Did you get this figured out?  Your script works as is on my system.
> > You say temp.out is empty?  What does your input sequence
> > (blastInput.txt) look like?
> >
> > Barry
> >
> > > -----Original Message-----
> > > From: bioperl-l-bounces at portal.open-bio.org [mailto:bioperl-l-
> > > bounces at portal.open-bio.org] On Behalf Of Hubert Prielinger
> > > Sent: Monday, January 16, 2006 2:54 PM
> > > To: Nagesh Chakka; bioperl-l at portal.open-bio.org
> > > Subject: Re: [Bioperl-l] Trouble using RemoteBlast.pm
> > >
> > > Nagesh Chakka wrote:
> > >
> > > >Hi All,
> > > >I was trying to set up a system to perform a remote BLAST on a regular
> > > >basis. I thought this could best be achieved by using a BioPerl module
> > > >and came across RemoteBlast.pm.
> > > >I modified the sample script "bp_remote_blast.pl", which takes a file
> > > >containing a single FASTA sequence as input. I also wanted the BLAST
> > > >report to be saved in a file for later use, and modified the code as
> > > >follows.
> > > >I am using the latest version of BioPerl (1.5) on a Fedora platform.
> > >
> > > >#######################################################################
> > > >print "$Bio::Root::Version::VERSION\n";
> > > >use Bio::Tools::Run::RemoteBlast;
> > > >use strict;
> > > >my $prog = 'blastp';
> > > >my $db   = 'swissprot';
> > > >my $e_val= '1e-10';
> > > >
> > > >my @params = ( '-prog' => $prog,
> > > >       '-data' => $db,
> > > >       '-expect' => $e_val,
> > > >       '-readmethod' => 'SearchIO' );
> > > >
> > > >my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> > > >
> > > >#change a parameter
> > > >$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
> > > >
> > > >#remove a parameter
> > > >delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
> > > >
> > > >my $v = 1;
> > > >#$v is just to turn on and off the messages
> > > >
> > > >my $r = $factory->submit_blast('blastInput.txt');
> > > >
> > > >print STDERR "waiting..." if( $v > 0 );
> > > >while ( my @rids = $factory->each_rid )
> > > >{
> > > >        foreach my $rid ( @rids )
> > > >        {
> > > >                my $rc = $factory->retrieve_blast($rid);
> > > >                if( !ref($rc) )
> > > >                {
> > > >                        if( $rc < 0 )
> > > >                        {
> > > >                                $factory->remove_rid($rid);
> > > >                        }
> > > >                        print STDERR "." if ( $v > 0 );
> > > >                        sleep 5;
> > > >                }
> > > >                else
> > > >                {
> > > >                        print "RID $rid\n";
> > > >                        $factory->save_output('temp.out');
> > > >                        $factory->remove_rid($rid);
> > > >                }
> > > >        }
> > > >}
> > > >
> > >
> > > >#################################################################################
> > > >
> > > >This script prints the RID and terminates immediately. Obviously the
> > > >output file created is empty, as the program did not wait to get the
> > > >BLAST results for the RID.
> > > >Is there something I am doing wrong, and what can I do to make the
> > > >program wait until the results are ready to be printed to the output
> > > >file? I could not get much information from the documentation and have
> > > >no prior experience with BioPerl.
> > > >Thanks very much for  your attention.
> > > >Regards
> > > >Nageshbi
> > > >_______________________________________________
> > > >Bioperl-l mailing list
> > > >Bioperl-l at portal.open-bio.org
> > > >http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > > >
> > > >
> > > >
> > > >
> > > hi nagesh,
> > > try this, should work, I had the same problem:
> > >
> > > .......................
> > > .......................
> > >
> > > else
> > >                 {
> > >                         print "RID $rid\n";
> > >                         $factory->save_output('temp.out');
> > >
> > > 			my $checkinput = $factory->file;
> > >               		open(my $fh,"<$checkinput") or die $!;
> > >               		while(<$fh>){
> > >                 		print;
> > >               		}
> > >               		close $fh;
> > >
> > >
> > > 			$factory->remove_rid($rid);
> > >                 }
> > >         }
> > > }
> > >
> > > regards
> > > Hubert
> > >
> > > PS: are you using the composition based statistics parameter with your
> > > blast search?
> > > if yes, is it working?
> > >



