[Biopython] SQL Alchemy based BioSQL

Kyle Ellrott kellrott at gmail.com
Wed Aug 26 01:01:30 UTC 2009


I've added a new database function lookupFeature to quickly search for
sequence features without having to load all of them for a particular
sequence.
Because it's a non-standard function, I've taken the opportunity to
play around with some more dynamic search features.
Once we get the interface for these types of searches locked down on
lookupFeature, a similar system could be implemented in the standard
'lookup' call.
The work is posted at http://github.com/kellrott/biopython

The following is an example of a working search that pulls all of the
protein_ids from NC_004663.1 for features that overlap the region
between 60,000 and 70,000 on the positive strand.


from BioSQL import BioSQLAlchemy as BioSeqDataBase

# Connect to the BioSQL server and pick the 'bacteria' sub-database.
server = BioSeqDataBase.open_database(driver="mysql", user="test",
                                      host="localhost", db="testdb")
db = server["bacteria"]

# Fetch the sequence record by its versioned accession.
seq = db.lookup(version="NC_004663.1")

# Positive-strand features overlapping 60,000-70,000 that carry a protein_id.
features = db.lookupFeatures(BioSeqDataBase.Column("strand") == 1,
                             BioSeqDataBase.Column("start_pos") < 70000,
                             BioSeqDataBase.Column("end_pos") > 60000,
                             bioentry_id=seq._primary_id, name="protein_id")

#print len(features)
for feature in features:
    print(feature)
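
For anyone curious how an expression like Column('strand') == 1 can be
handed to a search call, here is a rough, self-contained sketch of the
general idea (operator overloading that returns a condition object
instead of a boolean). The Column, Condition and build_where names below
are hypothetical stand-ins written for illustration; they are not the
classes in the fork, and the real code may work quite differently.

```python
# Hypothetical sketch only: these classes are illustrative stand-ins,
# not the implementation in the BioSQLAlchemy fork.
class Condition(object):
    """One comparison, e.g. start_pos < 70000."""

    def __init__(self, name, op, value):
        self.name = name
        self.op = op
        self.value = value

    def to_sql(self):
        # Return a SQL fragment with a placeholder, plus the bound value.
        return "%s %s ?" % (self.name, self.op), self.value


class Column(object):
    """Names a column; comparisons build Condition objects."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return Condition(self.name, "=", other)

    def __lt__(self, other):
        return Condition(self.name, "<", other)

    def __gt__(self, other):
        return Condition(self.name, ">", other)


def build_where(*conditions, **equals):
    """AND together positional conditions and keyword equality filters."""
    parts = [c.to_sql() for c in conditions]
    # Keyword filters are sorted by name so the output order is stable.
    parts += [Condition(k, "=", v).to_sql() for k, v in sorted(equals.items())]
    clause = " AND ".join(sql for sql, _ in parts)
    params = [value for _, value in parts]
    return clause, params


clause, params = build_where(
    Column("strand") == 1,
    Column("start_pos") < 70000,
    Column("end_pos") > 60000,
    name="protein_id",
)
```

The point of returning a clause plus a separate parameter list is that
the values are never pasted into the SQL text, so the same approach
could carry over to the standard 'lookup' call without quoting problems.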


> Kyle:
>> I've posted a git fork of biopython with a BioSQL system based on SQL
>> Alchemy.  It can be found at git://github.com/kellrott/biopython.git
>> It successfully completes unit tests copied from test_BioSQL and
>> test_BioSQL_SeqIO.



