[Bioperl-l] Trouble using RemoteBlast.pm
Jason Stajich
jason.stajich at duke.edu
Wed Jan 18 16:30:02 EST 2006
You may need to start requesting XML instead of plain text - NCBI may
have finally done what they warned about
(http://bioperl.org/pipermail/bioperl-l/2005-September/019687.html).
You can see information here about getting XML.
http://bioperl.open-bio.org/news/2005/11/06/getting-blastxml-using-remoteblast/
http://bioperl.open-bio.org/wiki/Module:Bio::Tools::Run::RemoteBlast
http://bioperl.open-bio.org/wiki/NCBI_Blast_email
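
As a rough illustration of what requesting XML might look like, here is a
minimal sketch. It assumes that your BioPerl version honors a FORMAT_TYPE
entry in the RETRIEVALHEADER package hash and accepts 'xml' as a
-readmethod value; neither is confirmed in this thread, so check the pages
above against the version you have installed. make_xml_factory is a
hypothetical helper name, not part of BioPerl.

```perl
use strict;
use warnings;

# Sketch only: build a RemoteBlast factory that asks NCBI for XML reports.
# ASSUMPTIONS (not confirmed in this thread): RETRIEVALHEADER accepts
# FORMAT_TYPE => 'XML', and '-readmethod' => 'xml' selects the XML parser.
sub make_xml_factory {
    my (%opt) = @_;

    # Loaded at call time so this file compiles without BioPerl installed.
    require Bio::Tools::Run::RemoteBlast;

    # Ask the NCBI server to format the retrieved report as XML.
    # Set after the module is loaded and before the factory is built.
    {
        no warnings 'once';
        $Bio::Tools::Run::RemoteBlast::RETRIEVALHEADER{'FORMAT_TYPE'} = 'XML';
    }

    return Bio::Tools::Run::RemoteBlast->new(
        '-prog'       => $opt{prog}   || 'blastp',
        '-data'       => $opt{db}     || 'swissprot',
        '-expect'     => $opt{expect} || '1e-10',
        '-readmethod' => 'xml',
    );
}
```

The returned factory would then be used with the same submit_blast /
each_rid / retrieve_blast loop as in the scripts quoted below.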
We'll officially announce the new news and wiki sites at the end of
the month, when we switch to the permanent URL, but I suspect this
question needs a pointer now. Feel free to add this question and
answer to the FAQ as well: http://bioperl.open-bio.org/wiki/FAQ
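
On the "(170 letters)" versus "Length=170" difference Chris points out
below: whether that is what trips the text parser is not confirmed here,
but a script that reads the raw report itself could accept both forms.
A minimal sketch follows; query_length is a hypothetical helper, not part
of BioPerl or SearchIO.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the query length out of a report header line,
# accepting both the older "(170 letters)" form and the newer "Length=170".
# Returns undef (in scalar context) when the line carries no length.
sub query_length {
    my ($line) = @_;

    # Older text output, e.g. "         (170 letters)"
    if ( $line =~ /^\s*\(([\d,]+)\s+letters\)/ ) {
        ( my $n = $1 ) =~ tr/,//d;    # drop thousands separators
        return $n;
    }

    # Newer text output, e.g. "Length=170"
    if ( $line =~ /^\s*Length\s*=\s*([\d,]+)/ ) {
        ( my $n = $1 ) =~ tr/,//d;
        return $n;
    }

    return;
}

for my $line ( '         (170 letters)', 'Length=170' ) {
    my $len = query_length($line);
    print defined $len ? "$len\n" : "no length found\n";
}
```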
-jason
On Jan 18, 2006, at 4:17 PM, Chris Fields wrote:
> I have had the same problem using a script I wrote. It worked until
> ~4 days ago. Luckily, I had saved a copy of some of my old searches
> in a temp folder so I can compare them.
>
> I noticed that if I just save the output using:
>
> $factory->save_output('temp.out');
>
> it works (just like Barry's script), but if I have the following in
> a loop (like in RemoteBlast POD), it craps out:
>
> while ( my @rids = $factory->each_rid ) {
>     foreach my $rid ( @rids ) {
>         my $rc = $factory->retrieve_blast($rid);
>         # if RID is not present
>         if( !ref($rc) ) {
>             # remove if RID is bad (error)
>             if( $rc < 0 ) {
>                 $factory->remove_rid($rid);
>             }
>             print STDERR "." if ( $v > 0 );
>             sleep 5;
>         } else { # RID is returned
>             my $result = $rc->next_result();
>             # save the output
>             my $filename = $result->query_name()."\.blastp";
>             $factory->save_output($filename);
>             # remove RID from list
>             $factory->remove_rid($rid);
>             ...
>
>
>
> When I change the following:
>
> my $filename = $result->query_name()."\.blastp";
>
> to
>
> my $filename = "temp.blastp";
>
> and comment out the 'my $result = $rc->next_result()' line, it
> works again, so possibly SearchIO?
>
> The only difference I noticed is that older output has this:
> _______________________________________________________________________
>
> BLASTP 2.2.12 [Aug-07-2005]
> Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer,
> Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman
> (1997), "Gapped BLAST and PSI-BLAST: a new generation of
> protein database search programs", Nucleic Acids Res. 25:3389-3402.
>
> RID: 1131470802-26518-118666159798.BLASTQ3
>
>
> Database: All non-redundant GenBank CDS
> translations+PDB+SwissProt+PIR+PRF excluding environmental samples
> 3,023,944 sequences; 1,040,428,944 total letters
> Query= NP_249094 transcriptional regulator PyrR [Pseudomonas aeruginosa PAO1].
> (170 letters)
> ....
>
> _______________________________________________________________________
>
> And new output has this:
> _______________________________________________________________________
> BLASTP 2.2.13 [Nov-27-2005]
> Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer,
> Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman
> (1997), "Gapped BLAST and PSI-BLAST: a new generation of
> protein database search programs", Nucleic Acids Res. 25:3389-3402.
>
> RID: 1137614458-7828-16730336973.BLASTQ4
>
>
> Database: All non-redundant GenBank CDS
> translations+PDB+SwissProt+PIR+PRF excluding environmental samples
> 3,228,386 sequences; 1,108,137,318 total letters
> Query= NP_249094 pyrimidine regulatory protein PyrR [Pseudomonas aeruginosa PAO1].
> Length=170
> ....
> _______________________________________________________________________
>
>
> There is a change in the line for the length. Is this enough to break
> SearchIO::Blast?
>
> I think Jason is right; maybe NCBI has messed with text output and
> it's now breaking the BLAST parser:
>
> http://portal.open-bio.org/pipermail/bioperl-l/2005-November/020067.html
>
> I may try switching over to XML output to see what happens.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>> -----Original Message-----
>> From: bioperl-l-bounces at portal.open-bio.org [mailto:bioperl-l-bounces at portal.open-bio.org] On Behalf Of Keith Boroevich
>> Sent: Wednesday, January 18, 2006 11:55 AM
>> To: kaboroev at sfu.ca
>> Cc: bioperl-l at portal.open-bio.org
>> Subject: RE: [Bioperl-l] Trouble using RemoteBlast.pm
>>
>> I'm not sure if this is related, but in the last 3 days my remote BLAST
>> scripts have stopped working. I have not modified the code in any way.
>> retrieve_blast() returns successfully, and next_result() does return a
>> "Bio::Search::Result::BlastResult=HASH(0x15ad8d0)" object, but takes a
>> long time to do so. However, next_hit returns undef. I'm not really
>> sure how to approach this problem. Prior to 3 days ago the scripts
>> worked perfectly, returning a list of hits with their accessions and
>> significance.
>>
>> Keith
>>
>>
>> On Tue, 2006-01-17 at 11:34 -0700, Barry Moore wrote:
>>> Nagesh-
>>>
>>> Did you get this figured out? Your script works as is on my system.
>>> You say temp.out is empty? What does your input sequence
>>> (blastInput.txt) look like?
>>>
>>> Barry
>>>
>>>> -----Original Message-----
>>>> From: bioperl-l-bounces at portal.open-bio.org [mailto:bioperl-l-bounces at portal.open-bio.org] On Behalf Of Hubert Prielinger
>>>> Sent: Monday, January 16, 2006 2:54 PM
>>>> To: Nagesh Chakka; bioperl-l at portal.open-bio.org
>>>> Subject: Re: [Bioperl-l] Trouble using RemoteBlast.pm
>>>>
>>>> Nagesh Chakka wrote:
>>>>
>>>>> Hi All,
>>>>> I was trying to set up a system to perform a remote BLAST on a regular
>>>>> basis. I thought this could best be achieved using a BioPerl module, and
>>>>> came across RemoteBlast.pm.
>>>>> I modified the sample script "bp_remote_blast.pl", which takes a file
>>>>> containing a single FASTA sequence as input. I also wanted the BLAST
>>>>> report to be saved in a file for later use, and modified the code as
>>>>> follows.
>>>>> I am using the latest version of BioPerl (1.5) on a Fedora platform.
>>>>
>>>>> #######################################################################
>>>>> print "$Bio::Root::Version::VERSION\n";
>>>>> use Bio::Tools::Run::RemoteBlast;
>>>>> use strict;
>>>>> my $prog = 'blastp';
>>>>> my $db = 'swissprot';
>>>>> my $e_val= '1e-10';
>>>>>
>>>>> my @params = ( '-prog' => $prog,
>>>>>                '-data' => $db,
>>>>>                '-expect' => $e_val,
>>>>>                '-readmethod' => 'SearchIO' );
>>>>>
>>>>> my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
>>>>>
>>>>> #change a parameter
>>>>> $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
>>>>>
>>>>> #remove a parameter
>>>>> delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
>>>>>
>>>>> my $v = 1;
>>>>> #$v is just to turn on and off the messages
>>>>>
>>>>> my $r = $factory->submit_blast('blastInput.txt');
>>>>>
>>>>> print STDERR "waiting..." if( $v > 0 );
>>>>> while ( my @rids = $factory->each_rid )
>>>>> {
>>>>>     foreach my $rid ( @rids )
>>>>>     {
>>>>>         my $rc = $factory->retrieve_blast($rid);
>>>>>         if( !ref($rc) )
>>>>>         {
>>>>>             if( $rc < 0 )
>>>>>             {
>>>>>                 $factory->remove_rid($rid);
>>>>>             }
>>>>>             print STDERR "." if ( $v > 0 );
>>>>>             sleep 5;
>>>>>         }
>>>>>         else
>>>>>         {
>>>>>             print "RID $rid\n";
>>>>>             $factory->save_output('temp.out');
>>>>>             $factory->remove_rid($rid);
>>>>>         }
>>>>>     }
>>>>> }
>>>>>
>>>>
>>>>> #######################################################################
>>>>>
>>>>> This script prints the RID and terminates immediately. Obviously the
>>>>> output file created is empty, as the program did not wait to get the
>>>>> BLAST results for the RID.
>>>>> Is there something I am doing wrong, and what can I do to make the
>>>>> program wait until the results are ready to be written to the output
>>>>> file? I could not get much information from the documentation and have
>>>>> no prior experience with BioPerl.
>>>>> Thanks very much for your attention.
>>>>> Regards
>>>>> Nageshbi
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at portal.open-bio.org
>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>>
>>>>>
>>>>>
>>>> hi nagesh,
>>>> try this, should work, I had the same problem:
>>>>
>>>> .......................
>>>> .......................
>>>>
>>>>         else
>>>>         {
>>>>             print "RID $rid\n";
>>>>             $factory->save_output('temp.out');
>>>>
>>>>             # print the saved report back out as a check
>>>>             my $checkinput = $factory->file;
>>>>             open(my $fh, '<', $checkinput) or die $!;
>>>>             while(<$fh>){
>>>>                 print;
>>>>             }
>>>>             close $fh;
>>>>
>>>>             $factory->remove_rid($rid);
>>>>         }
>>>>     }
>>>> }
>>>>
>>>> regards
>>>> Hubert
>>>>
>>>> PS: are you using the composition based statistics parameter with
>>>> your blast search? if yes, is it working?
>>>>
>>>
>>>
>>
>
>
--
Jason Stajich
Duke University
http://www.duke.edu/~jes12