[BioPython] GenBank records again
JINLING HUANG
jinling at cs.uga.edu
Wed Feb 26 22:21:48 EST 2003
Jeff and all,
Thank you very much for the help with GenBank records. I am now trying to
retrieve protein sequences using a file of GenBank ids. My script is the following:
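import sys
from Bio import GenBank

# Read the whitespace-separated GenBank ids from the file given on the command line.
file = sys.argv[1]
fp1 = open(file, 'r')  # file of gi
ids = fp1.read()
fp1.close()
lids = ids.split()
recNum = len(lids)

# Dictionary-style access to NCBI protein records, parsed with FeatureParser.
protein_ncbi_dict = GenBank.NCBIDictionary(database='protein',
    format='gp', parser=GenBank.FeatureParser())

# Fetch each record and print its id and sequence on one line.
for i in range(0, recNum):
    gb_record = protein_ncbi_dict[lids[i]]
    # id[0:-2] drops the last two characters of the id
    print '>' + gb_record.id[0:-2] + ' ' + gb_record.seq.data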
The script works well most of the time, but sometimes it gives an error
message:
Traceback (most recent call last):
  File "getGBRecords.py", line 25, in ?
    gb_record = protein_ncbi_dict[lids[i]]
  File "/bio/python2.2/lib/python2.2/site-packages/Bio/GenBank/__init__.py", line 1563, in __getitem__
    return self.parser.parse(handle)
  File "/bio/python2.2/lib/python2.2/site-packages/Bio/GenBank/__init__.py", line 268, in parse
    self._scanner.feed(handle, self._consumer)
  File "/bio/python2.2/lib/python2.2/site-packages/Bio/GenBank/__init__.py", line 1255, in feed
    self._parser.parseFile(handle)
  File "/bio/python2.2/lib/python2.2/site-packages/Martel/Parser.py", line 338, in parseFile
    self.parseString(fileobj.read())
  File "/bio/python2.2/lib/python2.2/site-packages/Martel/Parser.py", line 366, in parseString
    self._err_handler.fatalError(result)
  File "/bio/python2.2/lib/python2.2/xml/sax/handler.py", line 38, in fatalError
    raise exception
Martel.Parser.ParserPositionException: error parsing at or beyond character 14
What causes this error? The problem seems to be in the parser, but I
don't know why. Can anybody help?
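
Since the error only appears for some ids, a first step toward narrowing it down could be to catch the exception per id and record which ids fail, rather than stopping at the first failure. The following is a minimal sketch in current Python syntax, not a fix for the parser itself. The names fetch_records and FakeLookup are hypothetical and not part of Biopython; FakeLookup merely stands in for protein_ncbi_dict so the example runs without network access.

```python
def fetch_records(ids, lookup):
    """Look up each id and return (records, failures).

    `lookup` is any object supporting lookup[id]; in the script above
    it would be protein_ncbi_dict. Hypothetical helper for illustration.
    """
    records = {}
    failures = {}
    for gi in ids:
        try:
            records[gi] = lookup[gi]
        except Exception as exc:
            # Remember which id failed and why, then continue with the rest.
            failures[gi] = "%s: %s" % (type(exc).__name__, exc)
    return records, failures


class FakeLookup(object):
    """Stand-in for the NCBI dictionary: raises for the ids listed in `bad`."""

    def __init__(self, bad):
        self.bad = set(bad)

    def __getitem__(self, gi):
        if gi in self.bad:
            raise ValueError("error parsing at or beyond character 14")
        return "record-" + gi


if __name__ == "__main__":
    records, failures = fetch_records(["111", "222", "333"], FakeLookup(["222"]))
    print(sorted(records))
    print(failures)
```

With the list of failing ids in hand, the raw text returned for one of them can be inspected by eye. Because the error is reported near the very start of the input, one possibility (an assumption, not confirmed here) is that the text returned for that id is not a well-formed GenBank record.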
Best wishes,
Jinling