[Biopython-dev] [Bug 1747] GenBank parser is very slow and memory hungry for large input files

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Fri Nov 4 14:00:16 EST 2005


http://bugzilla.open-bio.org/show_bug.cgi?id=1747





------- Comment #7 from mdehoon at ims.u-tokyo.ac.jp  2005-11-04 14:00 -------
This patch causes an error when running the example in section 3.4.1 in the
tutorial/cookbook:

Python 2.4.1 (#1, Aug 25 2005, 12:45:44)
[GCC 3.4.4 (cygming special) (gdc 0.12, using dmd 0.125)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio import GenBank
>>>
>>> gi_list = GenBank.search_for("Opuntia AND rpl16")
>>> ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'genbank')
>>> gb_record = ncbi_dict[gi_list[0]]
>>> record_parser = GenBank.FeatureParser()
>>> ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'genbank', parser = record_parser)
>>> gb_seqrecord = ncbi_dict[gi_list[0]]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 1736, in __getitem__
    return self.parser.parse(handle)
  File "/usr/local/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 219, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/local/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 1261, in feed
    line = handle.readline()
AttributeError: ReseekFile instance has no attribute 'readline'
>>>

Can this be fixed? I'm pretty much in favor of a hand-written parser instead of
Martel, because it's easier to understand and maintain, and there are several
other GenBank bugs waiting to be fixed.
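
For illustration only, here is a minimal sketch of one way a line-oriented
scanner could be fed from a handle that has read() but no readline(). The
traceback shows only that ReseekFile lacks readline(); everything else here is
an assumption. ReadOnlyHandle, ReadlineAdapter and read_lines are hypothetical
names, not part of Biopython, and the sketch is written for current Python
rather than 2.4.

```python
import io


class ReadOnlyHandle:
    """Hypothetical stand-in for a handle offering read() but no readline()."""

    def __init__(self, text):
        self._buffer = io.StringIO(text)

    def read(self, size=-1):
        return self._buffer.read(size)


class ReadlineAdapter:
    """Hypothetical wrapper that adds readline() on top of read()."""

    def __init__(self, handle, chunk_size=4096):
        self._handle = handle
        self._chunk_size = chunk_size
        self._pending = ""  # text already read from the handle but not yet returned

    def readline(self):
        # Keep reading chunks until a newline is buffered or input runs out.
        while "\n" not in self._pending:
            chunk = self._handle.read(self._chunk_size)
            if not chunk:
                # End of input: return whatever is left (possibly "").
                line, self._pending = self._pending, ""
                return line
            self._pending += chunk
        end = self._pending.index("\n") + 1
        line, self._pending = self._pending[:end], self._pending[end:]
        return line


def read_lines(handle):
    """Collect lines the way a scanner's readline() loop might (an assumption)."""
    lines = []
    while True:
        line = handle.readline()
        if not line:
            break
        lines.append(line)
    return lines


if __name__ == "__main__":
    raw = ReadOnlyHandle("LOCUS X\nDEFINITION y\n//\n")
    for line in read_lines(ReadlineAdapter(raw)):
        print(line, end="")
```

Whether the better fix is a wrapper like this, or giving ReseekFile its own
readline(), depends on how the real class buffers its data.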





