[Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy

Phillip San Miguel pmiguel at purdue.edu
Thu Mar 25 11:44:10 UTC 2010


Chris Fields wrote:
> On Mar 24, 2010, at 9:51 AM, Peter wrote:
>
>   
>> On Wed, Mar 24, 2010 at 2:37 PM, Chris Fields <cjfields at illinois.edu> wrote:
>>     
>>>
>> Please give the NCBI an email - you can CC me too if you like.
>>     
>
> Sent, have cc'd the open-bio list.  Don't want to cross-post this too much, so I think we should move the discussion there.
>
>   
>>> Re: the tool parameter, we currently set the tool itself to 'BioPerl' as a
>>> default, but always leave the email blank and issue a warning if it isn't
>>> set.  We could just as easily leave both blank and issue warnings for both.
>>>       
>> We currently leave out the email and set the tool parameter to "Biopython"
>> by default but this can be overridden. Currently leaving out the email does
>> cause Biopython to give a warning.
>>
>> Peter
>>     
>
> We follow the same, then (down to the warning).  This is mentioned in my post to them, I'll wait to see what they say.  
>
> My concern is the wording of the new rules.  Each tool and email must be registered with them if an IP is blocked.  Does this mean each tool is assigned one specific email?  And an IP that is blocked can register it to be allowed back into the fold?  With that in mind, should we register each of our toolkits with them?  Probably not a bad thing (it might help us as devs to get an idea of use), but then if one user abuses the rules will their actions affect all toolkit users?  Is this all done on a per-IP basis, per-toolkit basis, etc?  
>
> Unfortunately, at least to me, none of this is made very clear, so I'm hoping there is some clarification from their end.
>
> chris
>   
Maybe GenBank is hoping that developers will build modules that comply 
with GenBank's rules when accessing its resources. That is, by default 
the EUtilities tools would check the local time and cap a series of 
requests at 100 when run outside the hours of 9PM-5AM Eastern Time. The 
request rate could also be limited to 3 per second.
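
To make that concrete, here is a rough sketch of what such a default 
check might look like in a client. It is only an illustration: the names 
PolicyThrottle and is_off_peak are made up, the limits are the ones 
mentioned above, and the caller is assumed to pass in the current time 
already converted to Eastern Time.

```python
from datetime import datetime, time

MAX_PEAK_REQUESTS = 100        # cap on a series of requests outside 9PM-5AM
MIN_INTERVAL = 1.0 / 3.0       # seconds between requests (3 per second)
OFF_PEAK_START = time(21, 0)   # 9 PM Eastern
OFF_PEAK_END = time(5, 0)      # 5 AM Eastern


def is_off_peak(eastern_now):
    """True if the given Eastern Time datetime falls between 9 PM and 5 AM."""
    t = eastern_now.time()
    # The window wraps past midnight, so it is an "or" of two ranges.
    return t >= OFF_PEAK_START or t < OFF_PEAK_END


class PolicyThrottle:
    """Hypothetical helper that tracks one series of requests."""

    def __init__(self):
        self.peak_requests = 0
        self.last_sent = None  # seconds, from any monotonic clock

    def wait_needed(self, clock_now):
        """Seconds to sleep so that at most 3 requests go out per second."""
        if self.last_sent is None:
            return 0.0
        return max(0.0, MIN_INTERVAL - (clock_now - self.last_sent))

    def allow(self, eastern_now):
        """False once the cap for requests outside 9PM-5AM is used up."""
        return is_off_peak(eastern_now) or self.peak_requests < MAX_PEAK_REQUESTS

    def record(self, eastern_now, clock_now):
        """Note that a request was just sent."""
        if not is_off_peak(eastern_now):
            self.peak_requests += 1
        self.last_sent = clock_now
```

A toolkit would call allow() before each request, sleep for 
wait_needed() seconds, send the request, and then call record().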

But it seems like it would be better if GenBank would return some sort 
of "load" field with the response to each request. That would allow 
feedback control of a series of requests. It could be tuned however 
GenBank likes, but past a certain threshold the client program would 
know that another request within a certain amount of time will result in 
the IP being banned.
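
As a sketch of the client side of that idea: the "load" field does not 
exist in any response today, so everything here is an assumption, 
including that load would be a number between 0 and 1, the threshold, 
the delay bounds, and the names next_delay and run_series.

```python
import time


def next_delay(load, min_delay=1.0 / 3.0, max_delay=30.0, threshold=0.5):
    """Map a hypothetical server-reported load in [0, 1] to a pause in seconds."""
    load = min(max(load, 0.0), 1.0)  # clamp out-of-range values
    if load <= threshold:
        return min_delay
    # Scale linearly from min_delay at the threshold to max_delay at full load.
    fraction = (load - threshold) / (1.0 - threshold)
    return min_delay + fraction * (max_delay - min_delay)


def run_series(requests, send, sleep=time.sleep):
    """Send each request, pausing according to the load reported with each reply.

    send is any callable returning a (result, load) pair; it stands in for
    whatever would actually talk to the server.
    """
    results = []
    for request in requests:
        result, load = send(request)
        results.append(result)
        sleep(next_delay(load))
    return results
```

The server could tune the meaning of the load value however it likes; 
the client only needs to agree on the threshold past which it backs off.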

-- 
Phillip


