[Biopython-dev] [Bug 2738] Speed up GenBank parsing, in particular location parsing

Thu Jan 22 18:58:18 UTC 2009

http://bugzilla.open-bio.org/show_bug.cgi?id=2738

------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk  2009-01-22 13:58 EST -------
Created an attachment (id=1208)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=1208&action=view)
Simple test script for timing GenBank parsing

I've attached a trivial script that times parsing of all the GenBank files in
a directory, to help anyone wanting to benchmark this change.
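
For readers without the attachment to hand, a timing script of this kind
might look roughly like the sketch below. This is not the attached script:
the function names, the "*.gbk" filename pattern and the use of
time.perf_counter (which needs a much newer Python than 2.5) are all
assumptions made for illustration. The only Biopython call used is
SeqIO.parse with the "genbank" format name.

```python
import glob
import os
import time


def time_parsing(directory, parse_file, pattern="*.gbk"):
    """Time parse_file on each matching file; return (filename, seconds) pairs.

    The "*.gbk" default pattern is an assumption; adjust it to match
    however your GenBank files are named.
    """
    timings = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        start = time.perf_counter()
        parse_file(path)
        timings.append((os.path.basename(path), time.perf_counter() - start))
    return timings


def parse_with_biopython(path):
    """Parse every record in one GenBank file and return the record count."""
    # Imported here so that time_parsing itself works without Biopython.
    from Bio import SeqIO

    with open(path) as handle:
        # Iterating over the parser forces each record, including its
        # feature locations, to be parsed in full.
        return sum(1 for _record in SeqIO.parse(handle, "genbank"))
```

Calling time_parsing(some_directory, parse_with_biopython) before and after
applying the patch, on the same set of files, would give a per-file
comparison. Passing the parser in as an argument keeps the timing loop
independent of which parser is being measured.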

(In reply to comment #1)
> However, from my limited testing using Python 2.5 on the Mac with GenBank
> files for large bacterial genomes, this may be a price worth paying.  I'll
> like independent measurements (and to check this on other platforms), but
> this does seem to more than halve the time taken to parse GenBank files!

Further testing with Python 2.5 on Linux, this time also with some large
eukaryotic files, appears to confirm a very large speed up (most obvious on
feature-rich GenBank files, of course).

I still want to check this on other versions of Python...
