[Biopython] SQL Alchemy based BioSQL

Kyle Ellrott kellrott at gmail.com
Thu Aug 20 20:57:29 UTC 2009


> Sounds interesting - but can you explain your motivation?

The primary motivation is Jython compatibility (which is the main
purpose of the branch).  MySQLdb depends on some C extensions which
make it hard to port to Jython.  I don't keep track of IronPython, but
I would imagine the situation is similar on the .NET platform.
The SQLAlchemy 0.6 beta (available from SVN right now, but soon to be
released) supports the MySQL Connector/Java interface, so it works
with Jython.  Using this combination was the only way I could get
Biopython under Jython to connect to a database.
As a technical note, now that this works, you can use Biopython and
BioJava in the same memory space.  I used Biopython's SQL code to get
the data, and then passed it to BioJava's Smith-Waterman alignment
code to calculate alignments, all in one script.

> But what I think I said then was that while I like SQLAlchemy,
> and have used it with BioSQL as part of a web application, I
> don't see that we need it for Biopython's BioSQL support. We
> essentially have a niche ORM for going between the BioSQL
> tables and the Biopython SeqRecord object.

Yes, but it's an ORM that only supports one Python implementation.
By building on SQLAlchemy, we let somebody else worry about the
details of supporting other systems like Jython.

> [That wasn't meant to come across as negative, I'm just
> wary of adding a heavyweight dependency without a good
> reason]

It doesn't have to replace the existing system.  It can sit alongside
it, and not get installed if SQLAlchemy isn't available.
If we leave the naming as is, it won't affect anybody's code.  But if
they do want to use it, it can replace the original system in a script
with:
from BioSQL import BioSQLAlchemy as BioSeqDatabase
from BioSQL import BioSeqAlchemy as BioSeq

And it should work exactly the same.
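
For a script that should run whether or not the SQLAlchemy-based
modules are installed, one option is to try them first and fall back to
the standard ones.  This is only a sketch: first_importable is a
hypothetical helper, and BioSQL.BioSQLAlchemy is the module name used
in the branch, not something in a released Biopython.

```python
import importlib


def first_importable(*module_names):
    """Return the first module in module_names that imports cleanly."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not installed (or a dependency is missing): try the next one.
            continue
    raise ImportError("none of these modules could be imported: %r"
                      % (module_names,))


# Intended use (assumes the branch's module names; left commented out so
# the sketch also loads on machines without BioSQL installed):
# BioSeqDatabase = first_importable("BioSQL.BioSQLAlchemy",
#                                   "BioSQL.BioSeqDatabase")
```

The rest of the script then uses BioSeqDatabase without caring which
backend it got.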

> Something I would be interested in is a set of SQLAlchemy
> model definitions for the BioSQL tables (ideally database
> neutral). I've got a very preliminary, partial and minimal
> set done - and I think Brad has some too. This would be
> useful for anyone wanting to go beyond the Biopython
> SeqRecord based BioSQL support.

Yes, the way SQLAlchemy sets up Python data structures based on the
structure of the database opens up a lot of cool ways to dynamically
create queries.
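
As a rough illustration of that idea, the sketch below reflects a table
from a live database and builds a query from keyword arguments.  The
assumptions are mine: it is written against a modern SQLAlchemy (1.4 or
later, not the 0.6 beta mentioned above), it uses an in-memory SQLite
database, and its bioentry table is a cut-down stand-in with made-up
rows rather than the full BioSQL schema.  find_entries is a
hypothetical helper.

```python
from sqlalchemy import MetaData, create_engine, select, text

# In-memory SQLite stand-in for a real BioSQL database (assumption).
engine = create_engine("sqlite://")

with engine.begin() as conn:
    # Simplified bioentry table; the real BioSQL table has more columns.
    conn.execute(text(
        "CREATE TABLE bioentry ("
        " bioentry_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " accession TEXT NOT NULL,"
        " description TEXT)"
    ))
    conn.execute(text(
        "INSERT INTO bioentry (name, accession, description)"
        " VALUES ('demo_a', 'ACC0001', 'made-up record A')"
    ))
    conn.execute(text(
        "INSERT INTO bioentry (name, accession, description)"
        " VALUES ('demo_b', 'ACC0002', 'made-up record B')"
    ))

# Reflection: SQLAlchemy reads the table layout from the database itself,
# so no column definitions are written out by hand here.
metadata = MetaData()
metadata.reflect(bind=engine)
bioentry = metadata.tables["bioentry"]


def find_entries(**filters):
    """Return accessions of rows matching every column=value filter."""
    stmt = select(bioentry.c.accession).order_by(bioentry.c.bioentry_id)
    for column_name, value in filters.items():
        # Columns are looked up by name at run time.
        stmt = stmt.where(bioentry.c[column_name] == value)
    with engine.connect() as conn:
        return [row[0] for row in conn.execute(stmt)]
```

Because the columns come from reflection, the same function keeps
working if the table gains columns; an unknown column name raises a
KeyError instead of producing bad SQL.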


Kyle


