[Bioperl-l] Trouble using RemoteBlast.pm
Barry Moore
barry.moore at genetics.utah.edu
Thu Jan 19 00:15:06 EST 2006
Nagesh,
That does sound odd. What version of bioperl are you using? I'm
guessing 1.4? If the answer is anything but 1.5 something, then I
suggest you upgrade before going any further. You will also
want to follow the current thread about parsing XML-formatted
BLAST reports. I don't think this is your problem right now, but
eventually you'll have a problem if you aren't parsing XML format as
discussed in that thread. I've added some more detail below; if you are
having the problem with 1.5, try some debugging.
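If you're not sure which version is installed, a small check like the one below should print it. This is only a sketch: bioperl_version is a helper name I made up for illustration, and it relies on the Bio::Root::Version module that your own script already prints $VERSION from. The load is wrapped in eval so a missing install is reported rather than killing the script.

```perl
use strict;
use warnings;

# Hypothetical helper: return the installed BioPerl version string, or
# undef if Bio::Root::Version cannot be loaded or defines no version.
sub bioperl_version {
    my $loaded = eval { require Bio::Root::Version; 1 };
    return undef unless $loaded;
    return Bio::Root::Version->VERSION;
}

my $version = bioperl_version();
if (defined $version) {
    print "BioPerl version: $version\n";
}
else {
    print "Could not determine the BioPerl version (is BioPerl in \@INC?)\n";
}
```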
Here's what's going on (or should be going on) in your script, and
some suggestions for using the debugger.
# This next line hits the NCBI server. If it gets a BLAST report back, it
# parses it and returns an object (a Bio::SearchIO object, since you set
# -readmethod => 'SearchIO'). If there was no report you get 0, and if
# there was an error you get -1.
my $rc = $factory->retrieve_blast($rid);
print "RC $rc\n";
# This if statement checks whether the server has NOT returned a report
# yet. If it has, then $rc is an object and ref($rc) returns
# 'Bio::SearchIO::blast'. If $rc is not an object (i.e. you got no
# report), then ref($rc) returns an empty string, which is false.
if( !ref($rc) )
{
    # If you got here then you got no report from the NCBI server yet, so
    # the next check is whether you got -1, meaning there was an error.
    # On error, delete this RID because it's no good.
    if( $rc < 0 )
    {
        $factory->remove_rid($rid);
    }
    # Print a dot on the screen in lieu of music to keep the user
    # entertained while they wait.
    print STDERR "." if ( $v > 0 );
    # Take a nap so you don't piss off the NCBI sys admins!
    sleep 5;
}
# Getting here means that $rc was an object, so we've got a report.
# Go ahead and save it.
else
{
    sleep 600;
    # Write your output file.
    $factory->save_output('temp.out');
    my $checkinput = $factory->file;
    open(my $fh,"<$checkinput") or die $!;
    while(<$fh>)
    {
        print;
    }
    close $fh;
    $factory->remove_rid($rid);
}
Run your script in the debugger like this:
perl -d your_script.pl
Step forward one line at a time by typing 'n'.
When you get just past my $rc = $factory->retrieve_blast($rid); type
'x $rc'
You should get 0, -1 or a 'Bio::SearchIO::blast' object.
Keep stepping forward with 'n'.
If you get 0, you should loop back to retrieve_blast after a sleep.
If you get -1, you should end your script: you got an error (what was
it?).
If you get a Bio::SearchIO::blast object, then you should be writing
a temp.out.
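To make those three outcomes explicit, here is a minimal, self-contained sketch of the polling logic. It does not talk to NCBI at all: classify_rc and poll_until_done are hypothetical helper names invented for illustration, and the $fetch code reference stands in for a call such as $factory->retrieve_blast($rid). The attempt cap and the delay argument are my own assumptions, not RemoteBlast options.

```perl
use strict;
use warnings;

# Hypothetical helper: map a retrieve_blast-style return value to a label.
sub classify_rc {
    my ($rc) = @_;
    return 'report' if ref $rc;    # an object: the report is ready
    return 'error'  if $rc < 0;    # -1: the server reported an error
    return 'waiting';              # 0: no report yet
}

# Hypothetical helper: call $fetch until it yields a report or an error,
# sleeping $delay seconds between attempts, for at most $max_tries tries.
sub poll_until_done {
    my ($fetch, $delay, $max_tries) = @_;
    for my $try (1 .. $max_tries) {
        my $rc    = $fetch->();
        my $state = classify_rc($rc);
        return ($state, $rc, $try) if $state ne 'waiting';
        sleep $delay if $delay > 0;
    }
    return ('timeout', undef, $max_tries);
}

# Stand-in for the server: "not ready" twice, then a blessed object.
my @replies = (0, 0, bless({}, 'Fake::Report'));
my ($state, $rc, $tries) = poll_until_done(sub { shift @replies }, 0, 10);
print "state=$state tries=$tries class=", ref($rc), "\n";
```

In a real script, $fetch would be something like sub { $factory->retrieve_blast($rid) }, with a delay of 5 seconds or more so the server is not hammered.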
Barry
On Jan 18, 2006, at 6:37 PM, Nagesh wrote:
> Thanks very much to all, especially to Barry and Hubert, for their time
> in answering my query. Here are some updates on my problem.
>
> I have performed some diagnostic tests and am writing my observations
> below.
>
> First of all, the problem in the code was that it was not waiting for
> the results to be ready before writing them to the output file. So I
> wanted to check whether the condition "if( !ref($rc) )" is ever
> satisfied, and I printed out the $rc value, which was something like
> "Bio::SearchIO::blast=HASH(0x9010370)". When I looked at the Bioperl
> documentation for RemoteBlast.pm, it seemed that
> "$rc = $factory->retrieve_blast($rid);" should return either 0 or 1.
> I am not able to understand whether what I am getting is right.
>
> Secondly, I manually forced the script to wait between submit_blast,
> retrieve_blast and save_output by using sleep with values ranging from
> 30 to 600. None of them were successful in saving the output.
>
> When sleep (600) is between submit_blast and retrieve_blast, the
> following is printed to standard output (shown below is part of the
> output), with the output file still empty.
>
> <P><table>
> <tr><td>Request ID</td><td> <b>1137626804-16566-100302560340.BLASTQ4</b></td></tr>
> <tr><td>Status</td><td>Searching</td></tr>
> <tr><td>Submitted at</td><td>Wed Jan 18 18:26:44 2006</td></tr>
> <tr><td>Current time</td><td>Wed Jan 18 18:36:46 2006</td></tr>
> <tr><td>Time since submission</td>
> <td>00:10:01</td>
> </tr><P></table>
> <p><hr>This page will be automatically updated in <b>10</b> seconds
> until search is done<BR>
>
> When sleep (600) is between retrieve_blast and save_output, the
> following is printed with nothing written to output file.
>
> <P><table>
> <tr><td>Request ID</td><td> <b>1137632221-28820-85178967709.BLASTQ1</b></td></tr>
> <tr><td>Status</td><td>Searching</td></tr>
> <tr><td>Submitted at</td><td>Wed Jan 18 19:57:01 2006</td></tr>
> <tr><td>Current time</td><td>Wed Jan 18 19:57:03 2006</td></tr>
> <tr><td>Time since submission</td>
> <td>00:00:01</td>
> </tr><P></table>
> <p><hr>This page will be automatically updated in <b>10</b> seconds
> until search is done<BR>
>
> Please note the difference in time since submission.
>
> Lastly, I printed out the request ID and manually paused the script
> by using <STDIN> between submit_blast and retrieve_blast. The idea was
> to check the status of the job online through the NCBI website. When
> the results were ready, I let the script proceed and was able to save
> the desired results to the file. I am puzzled by this observation, as
> I do not understand why manually formatting the results online helps
> in getting the results.
> I am basically a molecular biologist trying hard to solve this
> computational stuff, so there might be some issues that are trivial
> to you computer wizards :)
>
> Barry suggested that I use the perl debugger, which I will try.
>
> Thanks for your attention.
>
> Below is the code which was being tested.
>
> ########################################################################
>
> use strict;
> use warnings;
> use Bio::Tools::Run::RemoteBlast;
>
> print "$Bio::Root::Version::VERSION\n";
> my $prog = 'blastp';
> my $db = 'swissprot';
> my $e_val= '1e-10';
>
> my @params = ( '-prog' => $prog,
> '-data' => $db,
> '-expect' => $e_val,
> '-readmethod' => 'SearchIO' );
>
> my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
>
> #change a parameter
> $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
>
> #remove a parameter
> delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
>
> my $v = 1;
> #$v is just to turn on and off the messages
>
> my $r = $factory->submit_blast('blastInput.txt');
>
> print STDERR "waiting..." if( $v > 0 );
> while ( my @rids = $factory->each_rid )
> {
>     foreach my $rid ( @rids )
>     {
>         print "RID $rid\n";
>
>         #<STDIN>;
>         #sleep 600;
>         my $rc = $factory->retrieve_blast($rid);
>
>         print "RC $rc\n";
>         if( !ref($rc) )
>         {
>             if( $rc < 0 )
>             {
>                 $factory->remove_rid($rid);
>             }
>             print STDERR "." if ( $v > 0 );
>             sleep 5;
>         }
>         else
>         {
>             sleep 600;
>             $factory->save_output('temp.out');
>             my $checkinput = $factory->file;
>             open(my $fh,"<$checkinput") or die $!;
>             while(<$fh>)
>             {
>                 print;
>             }
>             close $fh;
>             $factory->remove_rid($rid);
>         }
>     }
> }
>
> ########################################################################
>
>
> On Tue, 2006-01-17 at 16:03 -0700, Barry Moore wrote:
>> Nagesh,
>>
>> Attached is an input file, script and output. These work for me, and I
>> think they are the same as the ones you are using. Have a look and see
>> if you can find any differences that might be causing your problem.
>> Other than that I don't know what to tell you. If you are familiar with
>> the perl debugger (and if you're not, now's probably a good time to
>> become familiar with it), you should step through your script and be
>> sure that all of your objects are getting defined when they are
>> supposed to be. That can often help narrow down the problem.
>>
>> Barry
>>
>>> -----Original Message-----
>>> From: Nagesh Chakka [mailto:nagesh.chakka at anu.edu.au]
>>> Sent: Tuesday, January 17, 2006 1:57 PM
>>> To: Barry Moore
>>> Cc: Hubert Prielinger; bioperl-l at bioperl.org
>>> Subject: Re: [Bioperl-l] Trouble using RemoteBlast.pm
>>>
>>> Hi Barry,
>>> With the help of Hubert, I further modified the script but still have
>>> the same problem. The problem is that after submitting the BLAST
>>> query, the script does not wait until the BLAST results are ready for
>>> retrieval; submission is immediately followed by retrieving and saving
>>> the output. Since the results will not be ready this fast (about a
>>> second), the output created is blank. I am able to retrieve the
>>> results online using the RID, which I am making the script print.
>>> So my main problem is making the program wait after submitting the
>>> query.
>>> My input file has a single FASTA sequence, which I have pasted below.
>>> It's interesting to note that the script works on your system. Is it
>>> creating an output file with the BLAST report?
>>> Thanks very much for your attention.
>>> Regards
>>> Nagesh
>>>
>>> blastInput.txt
>>> >MusDpl
>>> MKNRLGTWWVAILCMLLASHLSTVKARGIKHRFKWNRKVLPSSGGQITEARVAENRPGAFIKQGRKLDIDFGAEGNRYYA
>>> ANYWQFPDGIYYEGCSEANVTKEMLVTSCVNATQAANQAEFSREKQDSKLHQRVLWRLIKEICSAKHCDFWLERGAAL
>>> RVAVDQPAMVCLLGFVWFIVK
>>>
>>> On Wednesday 18 January 2006 05:34, Barry Moore wrote:
>>>> Nagesh-
>>>>
>>>> Did you get this figured out? Your script works as is on my system.
>>>> You say temp.out is empty? What does your input sequence
>>>> (blastInput.txt) look like?
>>>>
>>>> Barry
>>>>
>>>>> -----Original Message-----
>>>>> From: bioperl-l-bounces at portal.open-bio.org
>>>>> [mailto:bioperl-l-bounces at portal.open-bio.org] On Behalf Of Hubert Prielinger
>>>>> Sent: Monday, January 16, 2006 2:54 PM
>>>>> To: Nagesh Chakka; bioperl-l at portal.open-bio.org
>>>>> Subject: Re: [Bioperl-l] Trouble using RemoteBlast.pm
>>>>>
>>>>> Nagesh Chakka wrote:
>>>>>> Hi All,
>>>>>> I was trying to set up a system to perform a remote BLAST on a regular
>>>>>> basis. I thought this could be best achieved by using a BioPerl module
>>>>>> and came across RemoteBlast.pm.
>>>>>> I modified the sample script "bp_remote_blast.pl", which takes a file
>>>>>> containing a single FASTA sequence as input. I also wanted the BLAST
>>>>>> report to be saved in a file for later use, and modified the code as
>>>>>> follows.
>>>>>> I am using the latest version of Bioperl (1.5) on a Fedora platform.
>>>>>>
>>>>>> ########################################################################
>>>>>> print "$Bio::Root::Version::VERSION\n";
>>>>>> use Bio::Tools::Run::RemoteBlast;
>>>>>> use strict;
>>>>>> my $prog = 'blastp';
>>>>>> my $db = 'swissprot';
>>>>>> my $e_val= '1e-10';
>>>>>>
>>>>>> my @params = ( '-prog' => $prog,
>>>>>> '-data' => $db,
>>>>>> '-expect' => $e_val,
>>>>>> '-readmethod' => 'SearchIO' );
>>>>>>
>>>>>> my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
>>>>>>
>>>>>> #change a parameter
>>>>>> $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
>>>>>>
>>>>>> #remove a parameter
>>>>>> delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
>>>>>>
>>>>>> my $v = 1;
>>>>>> #$v is just to turn on and off the messages
>>>>>>
>>>>>> my $r = $factory->submit_blast('blastInput.txt');
>>>>>>
>>>>>> print STDERR "waiting..." if( $v > 0 );
>>>>>> while ( my @rids = $factory->each_rid )
>>>>>> {
>>>>>>     foreach my $rid ( @rids )
>>>>>>     {
>>>>>>         my $rc = $factory->retrieve_blast($rid);
>>>>>>         if( !ref($rc) )
>>>>>>         {
>>>>>>             if( $rc < 0 )
>>>>>>             {
>>>>>>                 $factory->remove_rid($rid);
>>>>>>             }
>>>>>>             print STDERR "." if ( $v > 0 );
>>>>>>             sleep 5;
>>>>>>         }
>>>>>>         else
>>>>>>         {
>>>>>>             print "RID $rid\n";
>>>>>>             $factory->save_output('temp.out');
>>>>>>             $factory->remove_rid($rid);
>>>>>>         }
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> ########################################################################
>>>>>>
>>>>>> This script prints the RID and terminates immediately. Obviously the
>>>>>> output file created is empty, as the program did not wait to get the
>>>>>> BLAST results for the RID.
>>>>>> Is there something I am doing wrong, and what can I do to make the
>>>>>> program wait until the results are ready to be printed to the output
>>>>>> file? I could not get much information from the documentation and have
>>>>>> no prior experience with Bioperl.
>>>>>> Thanks very much for your attention.
>>>>>> Regards
>>>>>> Nagesh
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l at portal.open-bio.org
>>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>> hi nagesh,
>>>>> try this, should work, I had the same problem:
>>>>>
>>>>> .......................
>>>>> .......................
>>>>>
>>>>> else
>>>>> {
>>>>> print "RID $rid\n";
>>>>> $factory->save_output('temp.out');
>>>>>
>>>>> my $checkinput = $factory->file;
>>>>> open(my $fh,"<$checkinput") or die $!;
>>>>> while(<$fh>){
>>>>> print;
>>>>> }
>>>>> close $fh;
>>>>>
>>>>>
>>>>> $factory->remove_rid($rid);
>>>>> }
>>>>> }
>>>>> }
>>>>>
>>>>> regards
>>>>> Hubert
>>>>>
>>>>> PS: are you using the composition-based statistics parameter with
>>>>> your BLAST search? If yes, is it working?