[Biopython] query upper limit for NCBIWWW.qblast?
Matthias Schade
matthiasschade.de at googlemail.com
Thu Apr 11 09:20:31 UTC 2013
Hello everyone,
Is there an upper limit to how many sequences I can query via
NCBIWWW.qblast at once?
Sending up to 150 sequences, each a 24-mer, in a single string works
fine. But now I have tried the same with a string containing about 900
sequences. At good times, it takes the NCBI server about 5 min to send
an answer. I save the answer and later open and parse the file with
other functions in my code. However, even though I query the same 900
sequences each time, the resulting output file varies in size (between
10 MB and 20 MB) and is always missing at least the closing
"</BlastOutput>" tag, and sometimes more (this does not happen when
querying 150 sequences or fewer).
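A minimal sketch of how a saved result could be checked for this kind of truncation before parsing, assuming a complete response ends with the closing BlastOutput tag. The helper name is_complete_blast_xml is hypothetical and not part of Biopython:

```python
def is_complete_blast_xml(text):
    """Return True if the text ends with the closing BlastOutput tag.

    Assumption: a complete BLAST XML response ends with </BlastOutput>,
    possibly followed by trailing whitespace.
    """
    return text.rstrip().endswith("</BlastOutput>")


if __name__ == "__main__":
    # 'myblast_result.xml' is the file name used in this post.
    with open("myblast_result.xml") as handle:
        if not is_complete_blast_xml(handle.read()):
            print("BLAST output looks truncated")
```

This only detects a missing end tag; it does not validate the rest of the XML.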
My guess is that once the server has started sending its answer,
NCBIWWW.qblast waits only a limited time for follow-up packets, so
depending on the current server load it simply stops waiting for
incoming data after some time, which would explain why my BLAST output
files vary in length. Could anyone correct or verify this far-fetched
hypothesis?
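My core lines are:
from Bio.Blast import NCBIWWW
orgn = 'Mus Musculus'  # or anything else
# fasta_seq_string holds all query sequences in FASTA format
result = NCBIWWW.qblast("blastn", "nt", fasta_seq_string, expect=100,
                        entrez_query=orgn + "[orgn]")
# write the XML response to disk; the with-block closes the file
with open('myblast_result.xml', "w") as save_file:
    save_file.write(result.read())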
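Since queries of up to 150 sequences work, one possible workaround is to split the FASTA string into smaller batches and send each batch separately. The following is only a sketch under that assumption: the helper names split_fasta_batches and blast_in_batches, the batch size of 150, and the numbered output file names are all hypothetical, and whether this avoids the truncation is untested.

```python
def split_fasta_batches(fasta_text, batch_size=150):
    """Yield FASTA strings containing at most batch_size records each."""
    records = []
    current = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # A header line starts a new record.
            if current:
                records.append("\n".join(current))
            current = [line.strip()]
        elif line.strip() and current:
            # Sequence line belonging to the current record.
            current.append(line.strip())
    if current:
        records.append("\n".join(current))
    for start in range(0, len(records), batch_size):
        yield "\n".join(records[start:start + batch_size]) + "\n"


def blast_in_batches(fasta_text, organism, batch_size=150):
    """Send each batch to NCBI and save one XML file per batch."""
    # Imported here so the splitting helper works without Biopython.
    from Bio.Blast import NCBIWWW

    batches = split_fasta_batches(fasta_text, batch_size)
    for number, batch in enumerate(batches):
        result = NCBIWWW.qblast("blastn", "nt", batch, expect=100,
                                entrez_query=organism + "[orgn]")
        # Hypothetical naming scheme: one numbered file per batch.
        with open("myblast_result_%03d.xml" % number, "w") as save_file:
            save_file.write(result.read())
```

With 900 sequences and a batch size of 150, this would send six separate requests, for example via blast_in_batches(fasta_seq_string, "Mus Musculus").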
Best regards,
Matthias