[Bioperl-l] Trouble using RemoteBlast.pm

Jason Stajich jason.stajich at duke.edu
Wed Jan 18 16:30:02 EST 2006


You may need to start requesting XML instead of plain text - NCBI may
have finally done what they warned about
(http://bioperl.org/pipermail/bioperl-l/2005-September/019687.html).

You can see information here about getting XML.

http://bioperl.open-bio.org/news/2005/11/06/getting-blastxml-using-remoteblast/
http://bioperl.open-bio.org/wiki/Module:Bio::Tools::Run::RemoteBlast
http://bioperl.open-bio.org/wiki/NCBI_Blast_email
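
As a rough, untested sketch of the change those pages describe: it
assumes your copy of RemoteBlast.pm accepts 'xml' as a -readmethod and
uses the %RETRIEVALHEADER package hash for retrieval parameters, so
check the POD of your installed version first. The program, database
and e-value are simply the ones from the script quoted below.

```perl
use strict;
use warnings;
use Bio::Tools::Run::RemoteBlast;

# Assumption: this RemoteBlast version hands 'xml' reports to the
# SearchIO blastxml parser. Older releases may only know the text parser.
my $factory = Bio::Tools::Run::RemoteBlast->new(
    -prog       => 'blastp',
    -data       => 'swissprot',
    -expect     => '1e-10',
    -readmethod => 'xml',
);

# Ask NCBI to return XML rather than the default plain text report.
$Bio::Tools::Run::RemoteBlast::RETRIEVALHEADER{'FORMAT_TYPE'} = 'XML';
```

The submit_blast/retrieve_blast loop should not need to change; only
the report format and the parser behind retrieve_blast() differ.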

We'll officially announce the new news and wiki site at the end
of the month when we switch to the permanent URL, but I suspect this
question needs a pointer now.  Feel free to add this question and answer
to the FAQ as well: http://bioperl.open-bio.org/wiki/FAQ
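
On the header change Chris points out below ("(170 letters)" in the
older report versus "Length=170" in the newer one), here is a small
illustration of why a text parser keyed to one form can miss the
other. This is not the SearchIO code; query_length is a made-up helper
name, and a real parser would also have to track context, since hit
lines in text reports carry their own "Length = N" field.

```perl
use strict;
use warnings;

# Hypothetical helper: return the query length from either header
# style, or undef if the line matches neither.
sub query_length {
    my ($line) = @_;

    # Older reports: "          (170 letters)"
    if ( $line =~ /^\s*\(([\d,]+)\s+letters\)/ ) {
        ( my $len = $1 ) =~ tr/,//d;    # drop thousands separators
        return $len;
    }

    # Newer reports (seen with BLASTP 2.2.13): "Length=170"
    if ( $line =~ /^\s*Length\s*=\s*([\d,]+)/ ) {
        ( my $len = $1 ) =~ tr/,//d;
        return $len;
    }

    return;
}

for my $line ( '          (170 letters)', 'Length=170' ) {
    my $len = query_length($line);
    print defined $len ? "length $len\n" : "no length found\n";
}
```

A parser that only knows the first pattern would get no length from
the new reports, which is the kind of silent mismatch that could leave
next_hit with nothing to return.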

-jason
On Jan 18, 2006, at 4:17 PM, Chris Fields wrote:

> I have had the same problem using a script I wrote.  It worked  
> until ~4 days
> ago.  Luckily, I had saved a copy of some of my old searches in a temp
> folder so I can compare them.
>
> I noticed that if I just save the output using:
>
> $factory->save_output('temp.out');
>
> it works (just like Barry's script), but if I have the following in  
> a loop
> (like in RemoteBlast POD), it craps out:
>
> while ( my @rids = $factory->each_rid ) {
> 	foreach my $rid ( @rids ) {
> 		my $rc = $factory->retrieve_blast($rid);	
> 		# if RID is not present
> 		if( !ref($rc) ) {
> 			# remove if RID is bad (error)
> 			if( $rc < 0 ) {
> 				$factory->remove_rid($rid);
> 			}
> 			print STDERR "." if ( $v > 0 );
> 			sleep 5;
> 		} else { # RID is returned
> 			my $result = $rc->next_result();
> 			# save the output
> 			my $filename = $result->query_name()."\.blastp";
> 			$factory->save_output($filename);
> 			# remove RID from list
> 			$factory->remove_rid($rid);
> 			...
>
>
>
> When I change the following:
>
> my $filename = $result->query_name()."\.blastp";
>
> to
>
> my $filename = "temp.blastp";
>
> and comment out the 'my $result = $rc->next_result()' line, it  
> works again,
> so possibly SearchIO?
>
> The only difference I noticed is that older output has this:
> _______________________________________________________________________
>
> BLASTP 2.2.12 [Aug-07-2005]
> Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A.  
> Schäffer,
> Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman
> (1997), "Gapped BLAST and PSI-BLAST: a new generation of
> protein database search programs", Nucleic Acids Res. 25:3389-3402.
>
> RID: 1131470802-26518-118666159798.BLASTQ3
>
>
> Database: All non-redundant GenBank CDS
> translations+PDB+SwissProt+PIR+PRF excluding environmental samples
>            3,023,944 sequences; 1,040,428,944 total letters
> Query=  NP_249094 transcriptional regulator PyrR [Pseudomonas  
> aeruginosa
> PAO1].
>           (170 letters)
> ....
>
> _______________________________________________________________________
>
> And new output has this:
> _______________________________________________________________________
> BLASTP 2.2.13 [Nov-27-2005]
> Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A.
> Schäffer,
> Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman
> (1997), "Gapped BLAST and PSI-BLAST: a new generation of
> protein database search programs", Nucleic Acids Res. 25:3389-3402.
>
> RID: 1137614458-7828-16730336973.BLASTQ4
>
>
> Database: All non-redundant GenBank CDS
> translations+PDB+SwissProt+PIR+PRF excluding environmental samples
>            3,228,386 sequences; 1,108,137,318 total letters
> Query=  NP_249094 pyrimidine regulatory protein PyrR [Pseudomonas  
> aeruginosa
> PAO1].
> Length=170
> ....
> _______________________________________________________________________
>
>
> There is a change in the line for the length.  Is this enough to break
> SearchIO::Blast?
>
> I think Jason is right; maybe NCBI has messed with text output and  
> it's now
> breaking the BLAST parser:
>
> http://portal.open-bio.org/pipermail/bioperl-l/2005-November/020067.html
>
> I may try switching over to XML output to see what happens.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>> -----Original Message-----
>> From: bioperl-l-bounces at portal.open-bio.org [mailto:bioperl-l-
>> bounces at portal.open-bio.org] On Behalf Of Keith Boroevich
>> Sent: Wednesday, January 18, 2006 11:55 AM
>> To: kaboroev at sfu.ca
>> Cc: bioperl-l at portal.open-bio.org
>> Subject: RE: [Bioperl-l] Trouble using RemoteBlast.pm
>>
>> I'm not sure if this is related, but in the last 3 days my remote BLAST
>> scripts have stopped working.  I have not modified the code in any way.
>> retrieve_blast() returns successfully, and next_result() does return a
>> "Bio::Search::Result::BlastResult=HASH(0x15ad8d0)" object but takes a
>> long time to do so.  However, next_hit returns undef.  I'm not really
>> sure how to approach this problem.  Prior to 3 days ago the scripts
>> worked perfectly returning a list of hits, their accession and
>> significance.
>>
>> Keith
>>
>>
>> On Tue, 2006-01-17 at 11:34 -0700, Barry Moore wrote:
>>> Nagesh-
>>>
>>> Did you get this figured out?  Your script works as is on my system.
>>> You say temp.out is empty?  What does your input sequence
>>> (blastInput.txt) look like?
>>>
>>> Barry
>>>
>>>> -----Original Message-----
>>>> From: bioperl-l-bounces at portal.open-bio.org [mailto:bioperl-l-
>>>> bounces at portal.open-bio.org] On Behalf Of Hubert Prielinger
>>>> Sent: Monday, January 16, 2006 2:54 PM
>>>> To: Nagesh Chakka; bioperl-l at portal.open-bio.org
>>>> Subject: Re: [Bioperl-l] Trouble using RemoteBlast.pm
>>>>
>>>> Nagesh Chakka wrote:
>>>>
>>>>> Hi All,
>>>>> I was trying to set up a system to perform a remote blast on a regular
>>>>> basis. I thought this could be best achieved by using a BioPerl module
>>>>> and came across RemoteBlast.pm.
>>>>> I modified the sample script "bp_remote_blast.pl", which takes a file
>>>>> containing a single FASTA sequence as input. I also wanted the blast
>>>>> report to be saved in a file for later use, and modified the code as
>>>>> follows.
>>>>> I am using the latest version of Bioperl (1.5) on a Fedora platform.
>>>>
>>>>> #######################################################################
>>>>> print "$Bio::Root::Version::VERSION\n";
>>>>> use Bio::Tools::Run::RemoteBlast;
>>>>> use strict;
>>>>> my $prog = 'blastp';
>>>>> my $db   = 'swissprot';
>>>>> my $e_val= '1e-10';
>>>>>
>>>>> my @params = ( '-prog' => $prog,
>>>>>       '-data' => $db,
>>>>>       '-expect' => $e_val,
>>>>>       '-readmethod' => 'SearchIO' );
>>>>>
>>>>> my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
>>>>>
>>>>> #change a parameter
>>>>> $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
>>>>>
>>>>> #remove a parameter
>>>>> delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
>>>>>
>>>>> my $v = 1;
>>>>> #$v is just to turn on and off the messages
>>>>>
>>>>> my $r = $factory->submit_blast('blastInput.txt');
>>>>>
>>>>> print STDERR "waiting..." if( $v > 0 );
>>>>> while ( my @rids = $factory->each_rid )
>>>>> {
>>>>>        foreach my $rid ( @rids )
>>>>>        {
>>>>>                my $rc = $factory->retrieve_blast($rid);
>>>>>                if( !ref($rc) )
>>>>>                {
>>>>>                        if( $rc < 0 )
>>>>>                        {
>>>>>                                $factory->remove_rid($rid);
>>>>>                        }
>>>>>                        print STDERR "." if ( $v > 0 );
>>>>>                        sleep 5;
>>>>>                }
>>>>>                else
>>>>>                {
>>>>>                        print "RID $rid\n";
>>>>>                        $factory->save_output('temp.out');
>>>>>                        $factory->remove_rid($rid);
>>>>>                }
>>>>>        }
>>>>> }
>>>>>
>>>>
>>>>> #######################################################################
>>>>>
>>>>> This script prints the RID and terminates immediately. Obviously the
>>>>> output file created is empty, as the program did not wait to get the
>>>>> blast results for the RID.
>>>>> Is there something I am doing wrong, and what can I do to make the
>>>>> program wait until the results are ready to be printed to the output
>>>>> file? I could not get much information from the documentation and have
>>>>> no prior experience with Bioperl.
>>>>> Thanks very much for your attention.
>>>>> Regards
>>>>> Nageshbi
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at portal.open-bio.org
>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>>
>>>>>
>>>>>
>>>> hi nagesh,
>>>> try this, should work, I had the same problem:
>>>>
>>>> .......................
>>>> .......................
>>>>
>>>> else
>>>>                 {
>>>>                         print "RID $rid\n";
>>>>                         $factory->save_output('temp.out');
>>>>
>>>>                         # print the saved report back out to check it
>>>>                         my $checkinput = $factory->file;
>>>>                         open(my $fh, '<', $checkinput) or die $!;
>>>>                         while (<$fh>) {
>>>>                                 print;
>>>>                         }
>>>>                         close $fh;
>>>>
>>>>                         $factory->remove_rid($rid);
>>>>                 }
>>>>         }
>>>> }
>>>>
>>>> regards
>>>> Hubert
>>>>
>>>> PS: are you using the composition based statistics parameter  
>>>> with your
>>>> blast search?
>>>> if yes, is it working?

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12




