[Bioperl-l] Bio::DB::EUtilities question

Chris Fields cjfields at illinois.edu
Sat Nov 21 19:19:24 UTC 2009


NCBI has specific rules about the repeated queries to its servers:

http://eutils.ncbi.nlm.nih.gov/#UserSystemRequirements

According to that, if you make more than 100 requests at peak times you will run into problems (they'll probably temporarily block your IP), even though the rate limit is much more lenient now (3 requests per second, whereas a year or two ago it was one request every 3 seconds). In general it's best to run something like this during off-hours.

Enforcing the limit on the number of server requests is one part of Bio::DB::EUtilities that hasn't been implemented yet, but it is tentatively planned.
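
In the meantime you can throttle on the client side. Here is a rough sketch, not anything that ships with BioPerl: make_throttle and throttled_map are made-up helper names, and the code ref you pass in stands for whatever get_Response call your own script makes. It only assumes Time::HiRes, which is a core module.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper: returns a closure that blocks until at least
# $min_interval seconds have passed since the previous call.
sub make_throttle {
    my ($min_interval) = @_;
    my $last_call;    # undef until the first call, so that one never waits
    return sub {
        if (defined $last_call) {
            my $wait = $min_interval - (time() - $last_call);
            sleep($wait) if $wait > 0;
        }
        $last_call = time();
        return;
    };
}

# Hypothetical helper: run $fetch on each item, starting no more than
# $max_per_second calls per second, and collect the results in order.
sub throttled_map {
    my ($max_per_second, $fetch, @items) = @_;
    my $throttle = make_throttle(1 / $max_per_second);
    my @results;
    for my $item (@items) {
        $throttle->();
        push @results, $fetch->($item);
    }
    return @results;
}
```

Something like throttled_map(3, sub { ... }, @ids) would then keep a loop at or under 3 requests per second; pick a lower rate if you want a safety margin, and this does nothing about the 100-requests-at-peak-times rule.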

chris

On Nov 21, 2009, at 12:26 PM, Robert Bradbury wrote:

> It sounds like NCBI may be counting frequency of requests, how much data
> they send or something similar.  Are you delaying the time between fetches?
> The code I've seen typically sleeps for a few seconds each time around a
> loop.  You might try longer delays between fetches and see if that gets you
> any more data.
> 
> Alternatively perhaps the libraries aren't reusing the TCP/IP connection
> properly.  Is there a difference between the amount of memory on the
> machines?  Have you watched the size of the process to see if it grows over
> time?  I think the bug which prevented me from fetching a not-so-large
> genome from a few months ago (eating up 3GB of memory in the process) has
> not been resolved.  If so that could be your problem.
> 
> Robert
> 
> On Fri, Nov 20, 2009 at 12:44 PM, Alessandra
> <alessandra.bilardi at gmail.com> wrote:
>> 
>> 
>> I'm testing Bio::DB::EUtilities, a web agent that interacts with and
>> retrieves data from NCBI's eUtils. My Perl script works, but only if I
>> call the get_Response function fewer than ~450 times; otherwise I get
>> this error message:
>> 
>> ------------- EXCEPTION -------------
>> MSG: Response Error
>> Can't connect to eutils.ncbi.nlm.nih.gov:80 (connect: No route to host)
>> STACK Bio::DB::GenericWebAgent::get_Response
>> /usr/local/share/perl/5.10.0/Bio/DB/GenericWebAgent.pm:215
>> STACK toplevel ./wget4gbk.pl:77
>> 




