[BioPython] import error
James Swetnam
jswetnam at gmail.com
Tue Mar 11 22:21:05 UTC 2008
Hello.
First off, apologies if my problem has been resolved in a previous
mailing; the archive search on the OBF wiki is disabled. Also, it's
quite possible I'm doing something boneheaded, as I still consider
myself a fairly novice Python programmer. So apologies if I make you
read through this just to correct an indentation error or something
similar!
I'm trying to use the Biopython BioSQL bindings to populate a locally
served MySQL database with what I like to call 'chimeric' SeqRecord
objects. I take as a starting point a large, FASTA-formatted file of
short, translated (~35 aa) protein sequences from the LANL HIV Sequence
Database. Every one of these LANL protein sequences is a subset of a
longer sequence available in GenBank. Each of the sequences I
download thus has an associated GenBank accession number.
I'd like to combine the specificity afforded by the LANL sequences
with the 'meta' information given by the GenBank files into one
record for each translated protein sequence. Thus, in very broad
pseudocode, my procedure is as follows:
for every sequence in FASTA-formatted LANL file
    get the GenBank accession number
    grab the GenBank file and parse into a SeqRecord
    replace the Seq object in the GenBank SeqRecord with the LANL protein sequence
let Biopython do its magic and populate my BioSQL database with my chimeric SeqRecords
...
Profit!
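The loop above can be sketched in plain Python as follows. This is only an illustration of the procedure, not the poster's actual script and not Biopython's API: `Record` is a stand-in for a SeqRecord, `fetch_genbank` is a hypothetical callable that returns the full record for an accession, and the assumption that the accession is the last dot-separated field of the FASTA header is a guess about the header layout.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, Dict, Iterable, Iterator, Tuple


@dataclass
class Record:
    """Stand-in for a SeqRecord: a sequence plus its metadata (hypothetical)."""
    accession: str
    seq: str
    annotations: Dict[str, str] = field(default_factory=dict)


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def accession_from_header(header: str) -> str:
    """Assumption: the accession is the last dot-separated header field."""
    return header.split(".")[-1]


def make_chimeras(
    fasta_lines: Iterable[str],
    fetch_genbank: Callable[[str], Record],
) -> Iterator[Record]:
    """Pair each short protein sequence with the metadata of its full record."""
    for header, protein in parse_fasta(fasta_lines):
        full_record = fetch_genbank(accession_from_header(header))
        # Copy the full record, swapping in the short protein sequence.
        yield replace(full_record, seq=protein)
```

Passing the fetch step in as a callable keeps the sketch independent of where the full records come from (a local file, a cache, or a network lookup); the resulting iterator is the kind of object that would then be handed to the database loader.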
The entire procedure is rather short, thanks to the developers' hard
work and the magic of abstraction. Here's the actual code:
http://pastebin.com/m118199fe
OK. Fine. But I'm getting an error when I do this, which originates
deep in the bowels of the MySQLdb library, which I'd rather not touch
without a lot more coffee than I have available.
-----------------------------
degas:v3_sequence_browser james$ ipython populate_database.py
/sw/lib/python2.5/site-packages/Bio/config/DBRegistry.py:149:
DeprecationWarning: Concurrent behavior has been deprecated, as this
functionality needs Bio.MultiProc, which itself has been deprecated.
If you need the concurrent behavior, please let the Biopython
developers know by sending an email to biopython-dev at biopython.org to
avoid permanent removal of this feature.
DeprecationWarning)
---------------------------------------------------------------------------
<type 'exceptions.TypeError'> Traceback (most recent call last)
/Users/james/src/v3_sequence_browser/populate_database.py in <module>()
35
36 db = server.new_database("v3")
---> 37 db.load(v3prod)
38 server.adaptor.commit()
39
/sw/lib/python2.5/site-packages/BioSQL/BioSeqDatabase.py in load(self,
record_iterator)
412 break
413 num_records += 1
--> 414 db_loader.load_seqrecord(cur_record)
415
416 return num_records
/sw/lib/python2.5/site-packages/BioSQL/Loader.py in
load_seqrecord(self, record)
28 """Load a Biopython SeqRecord into the database.
29 """
---> 30 bioentry_id = self._load_bioentry_table(record)
31 self._load_bioentry_date(record, bioentry_id)
32 self._load_biosequence(record, bioentry_id)
/sw/lib/python2.5/site-packages/BioSQL/Loader.py in
_load_bioentry_table(self, record)
248 division,
249 description,
--> 250 version))
251 # now retrieve the id for the bioentry
252 bioentry_id = self.adaptor.last_id('bioentry')
/sw/lib/python2.5/site-packages/BioSQL/BioSeqDatabase.py in
execute(self, sql, args)
275 """Just execute an sql command.
276 """
--> 277 self.cursor.execute(sql, args or ())
278
279 def get_subseq_as_string(self, seqid, start, end):
/sw/lib/python2.5/site-packages/MySQLdb/cursors.py in execute(self,
query, args)
149 query = query.encode(charset)
150 if args is not None:
--> 151 query = query % db.literal(args)
152 try:
153 r = self._query(query)
<type 'exceptions.TypeError'>: not all arguments converted during string formatting
WARNING: Failure executing file: <populate_database.py>
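For context, the failing line in the traceback is an ordinary Python `%` formatting operation (`query % db.literal(args)`). The minimal sketch below shows, outside of MySQLdb and BioSQL, one situation in which `%` raises this message: the argument tuple holds more values than the template has `%s` placeholders. The table and column names are made up for illustration, and the sketch does not establish which argument is at fault in the run above.

```python
# Hypothetical SQL template with two %s placeholders.
SQL = "INSERT INTO example_table (a, b) VALUES (%s, %s)"


def try_format(template, args):
    """Apply %-formatting and report a TypeError instead of raising it."""
    try:
        return template % args
    except TypeError as exc:
        return "TypeError: %s" % exc


# Two placeholders, two values: formatting succeeds.
matched = try_format(SQL, ("'x'", "'y'"))

# Two placeholders, three values: the extra value cannot be consumed.
mismatched = try_format(SQL, ("'x'", "'y'", "'z'"))

print(matched)
print(mismatched)
```

Comparing the number of placeholders in the SQL statement with the number of values passed alongside it is therefore a reasonable first check when this message appears.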
Any direct help or references are much appreciated.
James Swetnam
Research Technician
Department of Pharmacology
NYU School of Medicine