[Biopython-announce] is this supposed to be really slow?

Titus Brown titus at caltech.edu
Fri May 25 23:31:51 UTC 2007


-> so this seems to be working, but it seems to be very slow.  well, either
-> it's slow, or i don't understand the complexity of what it is doing.  i have
-> attempted to time this process, and it is taking about 7 seconds per record
-> to retrieve the date and drop it into my numpy array.  is this because this
-> code is fetching something from the internet and that is what is taking such
-> a long time?  or is there some other explanation for why this is slow (i.e.
-> my terrible, non-pythonic code writing, what it is doing is actually very
-> complex and i just don't get it, etc)?!?
-> 
-> any insight into this would be much appreciated.

Hi, Bryan,

I'm not too familiar with the underlying code, but I believe that
Biopython enforces a three-second wait between record retrieval attempts
from NCBI.  This is by request of NCBI; see

	http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html

Since you're using the one-record-at-a-time retrieval interface, you
have a 3-second delay between retrievals, which would account for a good
part of the 7 seconds per record you measured.
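
To make the effect of such a wait concrete, here is a minimal sketch of a
throttle that enforces a minimum gap between calls.  This is not Biopython's
actual code; the Throttle class and its parameters are hypothetical names
used only for illustration, and the 3.0 second default simply mirrors the
delay mentioned above.

```python
import time


class Throttle:
    """Hypothetical helper: enforce a minimum gap between successive calls."""

    def __init__(self, min_interval=3.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds required between calls
        self._clock = clock               # injectable so it can be tested
        self._sleep = sleep
        self._last = None                 # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling wait() before each one-record fetch means N records cost roughly
3 * (N - 1) seconds of waiting on top of the network time, however fast the
surrounding code is.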

I personally tend to just use the NCBI retrieval URLs directly, but
that's kind of ugly.  There may be a higher-volume retrieval system
built directly into Biopython, too.
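
As a rough sketch of what using the retrieval URLs directly might look like,
the code below builds NCBI EFetch URLs that request several IDs per call, so
any per-request wait is paid once per batch rather than once per record.  The
helper names (chunked, efetch_url), the batch size, and the example database
and IDs are assumptions for illustration; the base URL is the current
E-utilities EFetch address, which may differ from the one in use when this
message was written.

```python
from urllib.parse import urlencode

# Current NCBI E-utilities EFetch endpoint (assumed; check the NCBI docs).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def chunked(ids, size):
    """Yield successive slices of `ids`, each at most `size` long."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def efetch_url(db, ids, rettype, retmode="text"):
    """Build one EFetch URL that requests all of `ids` in a single call."""
    params = {
        "db": db,
        "id": ",".join(str(i) for i in ids),  # comma-separated ID list
        "rettype": rettype,
        "retmode": retmode,
    }
    # safe="," keeps the commas readable instead of encoding them as %2C
    return EFETCH_URL + "?" + urlencode(params, safe=",")


# Hypothetical example: five made-up IDs, fetched in batches of two.
example_ids = [101, 102, 103, 104, 105]
urls = [efetch_url("nucleotide", batch, "gb") for batch in chunked(example_ids, 2)]
```

Each URL could then be fetched with urllib.request.urlopen, still pausing
between requests as NCBI asks; with batching, five records take three
requests instead of five.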

cheers,
--titus


