[Biopython-announce] is this supposed to be really slow?
Titus Brown
titus at caltech.edu
Fri May 25 23:31:51 UTC 2007
-> so this seems to be working, but it seems to be very slow. well, either it's
-> slow, or i don't understand the complexity of what it is doing. i have
-> attempted to time this process, and it is taking about 7 seconds per record
-> to retrieve the date and drop it into my numpy array. is this because this
-> code is fetching something from the internet and that is what is taking such
-> a long time? or is there some other explanation for why this is slow (i.e. my
-> terrible, non-pythonic code writing, what it is doing is actually very complex
-> and i just don't get it, etc)?!?
->
-> any insight into this would be much appreciated.
Hi, Bryan,
I'm not too familiar with the underlying code, but I believe that
Biopython enforces a three-second wait between record retrieval attempts
from NCBI. This is by request of NCBI; see
http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
Since you're using the one-record-at-a-time retrieval interface, you
pay that three-second delay on every retrieval, on top of the time the
request itself takes.
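
To make the cost concrete, here is a rough sketch of how a fixed wait between
requests behaves. This is not Biopython's actual code; the Throttle class and
its clock/sleep parameters are made up for illustration. The point is only
that fetching N records one at a time costs at least (N - 1) * 3 seconds of
pure waiting, before any network time is counted.

```python
import time


class Throttle:
    """Hypothetical helper: keep at least `delay` seconds between calls."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock   # injectable so the behaviour can be tested
        self._sleep = sleep
        self._last = None     # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.delay - (now - self._last)
            if remaining > 0:
                # Too soon since the last call: sleep off the difference.
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def total_wait(n_records, delay=3.0):
    """Minimum time spent waiting when fetching n_records one by one."""
    return max(n_records - 1, 0) * delay
```

Calling wait() before each single-record request is what turns a loop over
many records into a loop that takes several seconds per record.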
I personally tend to just use the NCBI retrieval URLs directly, but
that's kind of ugly. There may be a higher-volume retrieval system
built directly into Biopython, too.
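
For what it's worth, going at the URLs directly could look something like the
sketch below. It assumes the E-utilities efetch endpoint at its current
address and assumes that several IDs can be passed in one request as a
comma-separated list; the efetch_url helper and the example IDs are made up
for illustration, so check the NCBI help page for the parameters your
database actually needs. The sketch only builds the URL and does not touch
the network.

```python
from urllib.parse import urlencode

# Assumed base address of the NCBI efetch service.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(db, ids, retmode="xml"):
    """Hypothetical helper: build one efetch URL covering many record IDs."""
    params = {
        "db": db,
        "id": ",".join(str(i) for i in ids),  # one request, many records
        "retmode": retmode,
    }
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return EFETCH_BASE + "?" + urlencode(params, safe=",")


# Example with placeholder IDs (not real records of interest):
url = efetch_url("pubmed", [111, 222, 333])
```

The resulting URL can be opened with urllib.request.urlopen(url), and one
such request replaces many single-record retrievals, so the enforced wait is
paid once per batch rather than once per record.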
cheers,
--titus