[Biopython] how Entrez.parse() internally work

c.buhtz at posteo.jp c.buhtz at posteo.jp
Thu Dec 10 11:10:00 UTC 2015


On 2015-12-09 21:25 Peter Cock <p.j.a.cock at googlemail.com> wrote:
> Almost certainly asking for 5 GB like that will fail. You should
> request much smaller batches of data, by making multiple
> calls to efetch with an increasing start value.
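
For illustration, the batching suggested above could look roughly like the sketch below. The helper names batch_starts and fetch_in_batches are hypothetical and not part of Biopython. The fetch argument is assumed to be a callable you write yourself, for example one that wraps Entrez.efetch(..., retstart=retstart, retmax=retmax) and passes the handle to Entrez.parse().

```python
def batch_starts(total, batch_size):
    """Yield (retstart, retmax) pairs that together cover `total` records."""
    for start in range(0, total, batch_size):
        # The last batch may be smaller than batch_size.
        yield start, min(batch_size, total - start)


def fetch_in_batches(fetch, total, batch_size=500):
    """Call fetch(retstart, retmax) once per batch and yield each record.

    `fetch` is a hypothetical callable supplied by the caller; it is
    assumed to return an iterable of records for the requested window.
    """
    for retstart, retmax in batch_starts(total, batch_size):
        yield from fetch(retstart, retmax)


# Stand-in for a real request, so the sketch runs without network access.
def fake_fetch(retstart, retmax):
    return range(retstart, retstart + retmax)


records = list(fetch_in_batches(fake_fetch, total=10, batch_size=4))
```

Because fetch_in_batches is a generator, the next request is only issued once the previous batch has been consumed.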

Exactly that is what I want to prevent by using Entrez.parse(), if
that is possible. That is what my question is about. ;)

Using parse() is not only about my RAM usage - it is also about the
workload on the NCBI servers.

> > When I call Entrez.eFetch(retmax=999999)?
> > Or is physically/really only one record (some KBytes, not much)
> > transfered from NCBI to me while each iteration (or next())?
> 
> It should be a few Kbytes at a time as each record is parsed.
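
To make "parsed one record at a time" concrete, here is a minimal sketch of incremental XML parsing using only the Python standard library. It is an analogy, not Biopython's actual implementation: the function iter_records, the <Record> and <Id> tags and the sample data are all made up for this example. Note that lazy parsing on the client side does not by itself change what a single request asks the server to produce.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(handle, tag="Record"):
    """Yield the <Id> text of each <Record>, reading the handle lazily."""
    # iterparse reads the handle in chunks and reports elements as they close.
    for _event, elem in ET.iterparse(handle, events=("end",)):
        if elem.tag == tag:
            yield elem.findtext("Id")
            elem.clear()  # release the children of the finished record


# Stand-in for a network response; a real handle would come from a request.
xml = b"<Set><Record><Id>1</Id></Record><Record><Id>2</Id></Record></Set>"
records = iter_records(io.BytesIO(xml))
first = next(records)  # nothing is parsed before this call
```

Each call to next() parses only as far as the end of the next record, so memory use stays small even for a long stream.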

Nice, then I see no need to split my requests to NCBI myself, because
parse() does this for me when I iterate with it.

Maybe I misunderstand it? ;)
-- 
GnuPGP-Key ID 0751A8EC

