[Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy

Chris Fields cjfields at illinois.edu
Thu Mar 25 15:43:24 UTC 2010


On Mar 25, 2010, at 8:18 AM, Peter wrote:

> On Thu, Mar 25, 2010 at 1:10 PM, Chris Fields <cjfields at illinois.edu> wrote:
>> Andy, Ewan,
>> 
>> Yes, that's what I meant; I do not think a set of defaults is a good idea.
> 
> Why? I agree that putting a default project email address in is a bad
> idea, but having a default tool seems fine. Perhaps I have misunderstood
> you.
> 
> If any Biopython/BioPerl user has written a dozen scripts using
> Entrez should they really be expected to give them all a (unique) tool
> name in the Entrez requests? Having it default to Biopython/BioPerl
> seems reasonable to me (in combination with the script writer's email
> address).
> 
> The whole hassle about registering a tool+email is only if you need your
> IP address unblocked, typically if you or someone at your institute or
> ISP has previously abused the servers.


Let's play devil's advocate.  The best way I can think of to describe the problem is to lay out a possible scenario.  Suppose an end user (one of possibly thousands, scattered across many IPs) uses one of the eutils modules/classes where 'tool' is preset but 'email' isn't (our current status).  They set 'email' to their own address, then somehow violate NCBI's usage rules and are blocked.  To be reinstated, they will have to register both the tool and the email with NCBI.  Until then, does the block affect everyone else using the same tool name, or just those on that IP?  That is not clear at the moment.

Going further, the user now registers the tool and email (both must be registered to lift the block).  If I understand the eutils documents correctly, as written, only one email (supposedly the software developer's) is registered per tool (also supposedly the developer's).  The 'Bio*' tool name could therefore end up being registered by anyone, wittingly or unwittingly, under their own personal email.  If another user then uses the same tool name with a different email, would they be blocked?  If not, and that user tries to register as above (after abuse of their own), would a conflict occur, and would they be notified of the prior registration?  Again, it's not clear what happens.

We probably have until sometime in May to decide on a course of action (June 1 is the enforcement date, I believe), but that depends on NCBI clarifying a few things first.  The current documentation does seem (at least to me) to indicate that each registered tool must have a single corresponding email.  Unless that is clarified, from my perspective the only safe course that addresses all concerns is to leave both 'tool' and 'email' unset by default, and then have each toolkit register its own tool name and email so the registration stays within that dev group as a safeguard.  I've already outlined many reasons for that last bit; an additional one is that we already ship a default for 'tool' (and have for a while), so anyone on an older version will still have 'Bio*' preset when June 1 hits.
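
For what it's worth, here is a minimal sketch of what "no defaults" could look like on the client side.  It is in Python only for brevity, and build_esearch_url is a hypothetical helper, not an existing BioPerl or Biopython call.  The 'tool' and 'email' query parameters and the esearch.fcgi endpoint are the ones NCBI documents; the rest is an assumption for illustration.

```python
from urllib.parse import urlencode

# Public E-utilities base URL; esearch is used purely as an example endpoint.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def build_esearch_url(db, term, tool=None, email=None):
    """Build an esearch URL without falling back to a toolkit-wide default.

    Hypothetical helper for illustration only. Both 'tool' and 'email'
    must be supplied by the script author; nothing is preset.
    """
    if not tool or not email:
        raise ValueError("set both 'tool' and 'email' explicitly before querying NCBI")
    params = {"db": db, "term": term, "tool": tool, "email": email}
    # urlencode percent-escapes the values (e.g. '@' in the email address).
    return EUTILS_BASE + "esearch.fcgi?" + urlencode(params)
```

A script that forgets either value fails before any request is sent, so no request ever goes out under a shared 'Bio*' identity.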

BTW, I don't consider the above scenario out of the realm of possibility, particularly if NCBI truly intends to enforce the rules this time around.  Many users have asked us some version of 'how can I download my batch of 1,00,000 records via eutils'.  A lack of common sense doesn't stop the persistent or the desperate.
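
For the large-batch question, the usual answer is to page through the result set rather than ask for everything at once.  A rough sketch of the paging arithmetic follows, again in Python for brevity.  batch_windows is a hypothetical helper and the batch size of 500 is an arbitrary assumption, not an NCBI-mandated value; 'retstart' and 'retmax' are the documented E-utilities paging parameters.

```python
def batch_windows(total_records, batch_size=500):
    """Yield (retstart, retmax) pairs that together cover total_records.

    Hypothetical helper for illustration; batch_size=500 is an assumed
    value chosen for the example, not a limit stated by NCBI.
    """
    if total_records < 0 or batch_size <= 0:
        raise ValueError("total_records must be >= 0 and batch_size > 0")
    for retstart in range(0, total_records, batch_size):
        # The final window may be shorter than batch_size.
        yield retstart, min(batch_size, total_records - retstart)


# Example: 1200 records in windows of 500.
windows = list(batch_windows(1200, 500))
# windows == [(0, 500), (500, 500), (1000, 200)]
```

Each pair would go into one request, with a pause between requests to stay within whatever rate NCBI's policy allows.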


> Again we come back to the fact the new NCBI guidelines are still
> unclear.
> 
>>  The other advantage to registering them is the list would get immediate
>> updates from NCBI when changes occur (instead of finding out about
>> them second-hand from other subscribers).  The list is very low traffic.
> 
> Well that is an advantage, but in practice having a few people from
> each project on the NCBI mailing list isn't a big hassle.
> 
> Peter

Right, but my point is that with the list itself subscribed there are no intermediaries; the news goes straight to the list.  We're not reliant on (possibly busy, possibly absent) developers for second-hand news.  We've been bitten by this many times before with NCBI, with both eutils and BLAST changes, a good many of which NCBI announced but which were never passed on to our mailing list.

Having said that, I wonder whether we should have a master list of some sort to gather announcements that may impact developers (eutils, BLAST, GenBank/EMBL/UniProt releases, etc.).

chris





