[Bioperl-l] Not catching an error in EUtilities
Chris Fields
cjfields at uiuc.edu
Mon Oct 29 15:42:43 UTC 2007
On Oct 28, 2007, at 11:03 PM, Warren Gallin wrote:
> I've been having an intermittent problem, when the NCBI service that
> is accessed by EUtilities (in particular efetch) is not available.
>
> I have the calls enclosed in eval statements, but an exception is not
> thrown. Instead I find that the file that should contain the
> results of the efetch only has the following text:
>
> Error: The resource is temporarily unavailable
>
> So it appears that this is not generating an exception (maybe that is
> not desirable in general, but it would be useful in my case).
>
> The result is that my script tries to access the file using
> Bio::SeqIO, and the expected text is not there.
>
> The relevant snippet of code and the output are copied below.
>
> The only way that I can think of to catch this outcome is to open the
> file and check the first three lines for this text, and then go back
> and redo the efetch if this particular text is found.
>
> Is this behaviour considered a bug, or just an outcome that needs to
> be checked for when code is written using efetch?
...
> If so, is there a standard way of checking for this?
I may have already mentioned this (it is mentioned in the POD), but
it's worth repeating: if you are running a script that posts more than
100 requests, you should run it between 9pm and 5am ET per NCBI's rules:
http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
I have built in a way to pass in an LWP::UserAgent-related callback
to get_Response() using the -cb parameter, primarily to allow piping
data to a child process for instance. This could also be used for
checking the initial data retrieved and throwing if an error is
returned. I haven't extensively tested this yet; I can try running a
stress test to see if I can trigger an efetch error and catch it in
the callback. If I can get something working I'll post it later
today and may incorporate it into EUtilities in CVS.
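As a rough illustration of the callback idea (not the code that ended up
in EUtilities), the sketch below builds a closure using the argument
order LWP::UserAgent uses for content callbacks ($chunk, $response,
$protocol). Whether -cb hands the callback exactly these arguments is an
assumption, and make_checked_writer() is a hypothetical helper. The
error is recorded in a state hash instead of being thrown, because LWP
traps a die inside a content callback and reports it in the X-Died
response header. The chunks are fed in by hand, so nothing here contacts
NCBI.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a content callback plus a state hash.
# The callback writes chunks to $fh and notes whether the first chunk
# looked like an efetch error message.
sub make_checked_writer {
    my ($fh) = @_;
    my %state = (seen_first => 0, error => undef);
    my $cb = sub {
        # Assumed argument order, as for LWP content callbacks.
        my ($chunk, $response, $protocol) = @_;
        if (!$state{seen_first}) {
            $state{seen_first} = 1;
            # Only the start of the stream is inspected.
            if ($chunk =~ /\A\s*Error:\s*(.+)/) {
                $state{error} = $1;
            }
        }
        # Stop writing once an error has been seen.
        print {$fh} $chunk unless defined $state{error};
        return;
    };
    return ($cb, \%state);
}

# Simulated run: write to an in-memory handle and feed one chunk by hand.
my $buffer = '';
open my $out, '>', \$buffer or die "Cannot open in-memory handle: $!";
my ($cb, $state) = make_checked_writer($out);
$cb->("Error: The resource is temporarily unavailable\n", undef, undef);
close $out;

print "efetch error: $state->{error}\n" if defined $state->{error};
```

After the request returns, the caller would check $state->{error} and
throw or retry from there. A first chunk shorter than the error text
would slip past this check.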
A bit of background: efetch is the only eutil that doesn't have a
specific parser (Bio::Tools::EUtilities) attached to it, primarily
because the data retrieved is very diverse (seq/pubmed/snp/etc. data in
XML, ASN.1, text, HTML). All other eutils besides efetch generate error
codes via the HTTP::Response header (the norm) or in the XML
returned; both of the previous types are errors that EUtilities
catches and throws, so an eval{} works. efetch errors seem to be
atypical and may be related to the server load or specific database
availability.
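Until efetch errors are caught inside EUtilities, the workaround
described in the question (look at the first few lines of the output
file, then redo the fetch) can be sketched as below. This is a minimal
sketch using core Perl only: efetch_file_error() and fetch_with_retry()
are hypothetical helpers, the "Error:" prefix is taken from the single
message quoted in this thread, and the fetch is faked so the example
runs without contacting NCBI.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return the message if one of the first $max_lines lines of $path
# starts with "Error:", otherwise return undef.
sub efetch_file_error {
    my ($path, $max_lines) = @_;
    $max_lines = 3 unless defined $max_lines;
    open my $in, '<', $path or die "Cannot read $path: $!";
    my $found;
    while (my $line = <$in>) {
        last if $. > $max_lines;
        if ($line =~ /^\s*Error:\s*(.*\S)/) {
            $found = $1;
            last;
        }
    }
    close $in;
    return $found;
}

# Call $fetch (any code ref that writes $path) up to $max_tries times.
# Returns the attempt number that succeeded, or dies when all fail.
sub fetch_with_retry {
    my ($fetch, $path, $max_tries, $delay) = @_;
    for my $try (1 .. $max_tries) {
        my $ok  = eval { $fetch->($path); 1 };
        my $err = $ok ? efetch_file_error($path) : $@;
        return $try unless defined $err;
        warn "efetch attempt $try failed: $err\n";
        sleep $delay if $delay && $try < $max_tries;
    }
    die "efetch still failing after $max_tries attempts\n";
}

# Fake fetch: the first call writes the error text, later calls write
# a placeholder record.
my (undef, $path) = tempfile(UNLINK => 1);
my $calls = 0;
my $fake_fetch = sub {
    my ($file) = @_;
    $calls++;
    my $text = $calls == 1
        ? "Error: The resource is temporarily unavailable\n"
        : "LOCUS       placeholder record\n";
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} $text;
    close $fh;
};

my $tries = fetch_with_retry($fake_fetch, $path, 5, 0);
print "succeeded on attempt $tries\n";
```

In a real script the code ref would wrap the get_Response( -file => ... )
call from the snippet below, with a non-zero delay between attempts. A
bounded loop also avoids resetting the retry counter on every pass.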
chris
> Thanks,
>
> Warren Gallin
>
> [Code that successfully executes an epost and retrieves the history]
>
> RETRIEVE_LIST: eval {$prot_eutil->reset_parameters(
> -eutil => 'efetch',
> -rettype => 'genbank',
> -db => 'protein',
> -history => $history
> );};
> if ($@){
> print "efetch error trapped\n$@\n";
> goto RETRIEVE_LIST;
>
> }
>
> $file1 = ">" . $file1;
> $retry = 0;
> eval { $prot_eutil->get_Response( -file => $file1 ); };
> if ($@) {
> die "Server error: $@. Try again later" if $retry == 5;
> print STDERR "$@\n";
> print STDERR "Server error, redo #$retry\n";
> $retry++;
> sleep(5);
> goto RETRIEVE_LIST;
> }
> else {
> print "efetch ran on $loop_bottom through $loop_top.\n";
> }
>
> The output to the terminal from this part of the code is:
>
> efetch ran on 0 through 300.