[BioPython] Biopython object serialization

Estienne Swart estienne@sanbi.ac.za
Tue, 12 Nov 2002 12:29:33 +0200



Hi All,

I've been wondering about a decent way of storing biopython objects for 
some time now. It looks like there has been some progress (CVS) on 
interfacing Biopython with a relational DB system, but is this the best 
approach? For instance, say you'd actually like to store sequences 
within the database (which require one of the large text field types), 
you then find yourself having to deal with relatively long data 
retrieval times (if memory serves me right, it takes on the order of a 
couple of seconds to retrieve a single sequence from a database containing 
a few thousand entries, with sequences stored in the medium text field).

Have any of the biopython developers attempted/considered using an 
object database, such as ZODB, or at least assessed the relative merits 
of some different approaches to data storage/object persistence?

I recently came across an article about object persistence 
(http://www-106.ibm.com/developerworks/linux/library/l-pypers.html), by 
Patrick O'Brien (the name should ring a bell to those of you who read 
his O'Reilly article on Bioinformatics). He advocates the use of his own 
solution to persistence, PyPerSyst 
<http://sourceforge.net/projects/pypersyst/>, which is supposedly faster 
than ZODB, and simpler to implement too.

Do you think that some benchmarking would be in order (not that I'm 
volunteering)?
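
For what it's worth, a rough sketch of what such a benchmark might look like, 
using only the Python standard library. It is an illustration under stated 
assumptions, not a measurement of Biopython, ZODB or PyPerSyst: 
SeqRecordStandIn is a hypothetical stand-in class (not a Biopython class), 
the data is synthetic, and shelve and sqlite3 merely stand in for "pickled 
object store" and "relational DB with a text column" respectively.

```python
import os
import shelve
import sqlite3
import tempfile
import time
from dataclasses import asdict, dataclass


@dataclass
class SeqRecordStandIn:
    """Hypothetical stand-in for a sequence record (not a Biopython class)."""
    id: str
    seq: str
    description: str = ""


def make_records(n=500, length=2000):
    # Synthetic data; the counts and lengths are arbitrary assumptions.
    return [SeqRecordStandIn(f"seq{i}", "ACGT" * (length // 4), f"record {i}")
            for i in range(n)]


def store_shelve(records, path):
    # shelve pickles each value; field dicts are stored so that loading
    # does not depend on where the class can be imported from.
    with shelve.open(path, flag="n") as db:
        for rec in records:
            db[rec.id] = asdict(rec)


def fetch_shelve(path, key):
    with shelve.open(path, flag="r") as db:
        return SeqRecordStandIn(**db[key])


def store_sqlite(records, path):
    con = sqlite3.connect(path)
    try:
        with con:  # commits the transaction on success
            con.execute(
                "CREATE TABLE seqs (id TEXT PRIMARY KEY, seq TEXT, description TEXT)")
            con.executemany(
                "INSERT INTO seqs VALUES (?, ?, ?)",
                [(r.id, r.seq, r.description) for r in records])
    finally:
        con.close()


def fetch_sqlite(path, key):
    con = sqlite3.connect(path)
    try:
        row = con.execute(
            "SELECT id, seq, description FROM seqs WHERE id = ?", (key,)).fetchone()
    finally:
        con.close()
    return SeqRecordStandIn(*row)


def mean_fetch_seconds(fetch, path, key, repeats=20):
    # Average wall-clock time of one fetch, including open/close overhead.
    start = time.perf_counter()
    for _ in range(repeats):
        fetch(path, key)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    records = make_records()
    with tempfile.TemporaryDirectory() as tmp:
        shelf_path = os.path.join(tmp, "records.shelf")
        sqlite_path = os.path.join(tmp, "records.sqlite")
        store_shelve(records, shelf_path)
        store_sqlite(records, sqlite_path)
        for name, fetch, path in [("shelve", fetch_shelve, shelf_path),
                                  ("sqlite3", fetch_sqlite, sqlite_path)]:
            ms = mean_fetch_seconds(fetch, path, "seq250") * 1e3
            print(f"{name}: {ms:.3f} ms per fetch")
```

Each fetch here reopens the store, so the numbers include connection 
overhead; a fairer comparison would probably also time fetches over an 
already open handle and against a real database server.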

What course will Biopython be pursuing in the near future (as far as 
object serialization is concerned)? Is there room for alternatives 
besides those that use relational databases, i.e. will they be 
competitive as far as performance is concerned?

Cheers

Estienne

--
Estienne Swart
SANBI, UWC Private Bag X17, Bellville 7535
estienne@sanbi.ac.za
tel work: +27 21 959 3908
tel home: +27 21 448 8118
fax work: +27 21 959 2512


