[Biopython-dev] [Biopython - Bug #3395] Biopython trie implementation can't load large data sets

redmine at redmine.open-bio.org redmine at redmine.open-bio.org
Thu Nov 29 17:12:31 UTC 2012


Issue #3395 has been updated by Peter Cock.

File trie_debug.patch added

I can reproduce the problem with your saved file under Mac OS X, using the latest Biopython from github, e.g.

$ python
Python 2.7.2 (default, Jun 20 2012, 16:23:33) 
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio import trie
>>> import gzip
>>> with gzip.open("trie.4.dat.gz") as handle:
...     t = trie.load(handle)
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
RuntimeError: loading failed for some reason

Adding a little debugging to the C code tells us where this fails (see attachment), line 669:

668    if(has_value) {
669        if(!(trie->value = (*read_value)(data)))
670            goto _deserialize_trie_error;
671    }

What kind of CPU does your machine have? i.e. is it a normal Intel or AMD CPU, or something unusual like a PowerPC where we have to worry about the byte order (endianness)?
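A quick way to answer the CPU question from within Python is sketched below. It uses only the standard library; the function name describe_platform is just an illustrative choice and not part of Biopython.

```python
import platform
import struct
import sys


def describe_platform():
    """Collect basic facts that matter when reading binary data."""
    return {
        # CPU architecture string as reported by the OS, e.g. "x86_64"
        "machine": platform.machine(),
        # "little" on Intel/AMD, "big" on most PowerPC systems
        "byteorder": sys.byteorder,
        # Pointer size in bits, which separates 32-bit from 64-bit builds
        "pointer_bits": struct.calcsize("P") * 8,
    }


if __name__ == "__main__":
    for key, value in sorted(describe_platform().items()):
        print("%s: %s" % (key, value))
```

Running this on both the machine that wrote the file and the machine that reads it would show whether the two differ in byte order or word size.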

We may need a complete example creating the trie as well - the problem could be in the trie itself, the serialisation (writing to disk), or de-serialisation (loading from disk).
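
One way to narrow this down is a small in-memory round trip, run once with and once without gzip, so the compression layer can be ruled in or out. The sketch below is only an assumption about how such a test might look: the helper name roundtrip is hypothetical, and pickle with a plain dict stands in for the trie so the script runs anywhere. To test the real thing, pass trie.save and trie.load instead, which use the same (handle, object) and (handle) calling convention as in the report.

```python
import gzip
import io
import pickle


def roundtrip(obj, save, load, compress=False):
    """Save obj to an in-memory buffer, load it back, and return the copy.

    save(handle, obj) and load(handle) are supplied by the caller.
    """
    buffer = io.BytesIO()
    if compress:
        # GzipFile.close() leaves the underlying BytesIO open for re-reading.
        with gzip.GzipFile(fileobj=buffer, mode="wb") as handle:
            save(handle, obj)
        buffer.seek(0)
        with gzip.GzipFile(fileobj=buffer, mode="rb") as handle:
            return load(handle)
    save(buffer, obj)
    buffer.seek(0)
    return load(buffer)


# Stand-in serialiser (an assumption for illustration, not the trie format).
def pickle_save(handle, obj):
    pickle.dump(obj, handle)


def pickle_load(handle):
    return pickle.load(handle)


if __name__ == "__main__":
    data = dict(("key%d" % i, i) for i in range(1000))
    assert roundtrip(data, pickle_save, pickle_load) == data
    assert roundtrip(data, pickle_save, pickle_load, compress=True) == data
    print("round trip OK")
```

If the uncompressed round trip works with the real trie but the gzip one fails, that would point at the file handle rather than the trie itself; if both fail only above a certain size, the serialisation code is the more likely suspect.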
----------------------------------------
Bug #3395: Biopython trie implementation can't load large data sets
https://redmine.open-bio.org/issues/3395

Author: Michał Nowotka
Status: New
Priority: Normal
Assignee: Biopython Dev Mailing List
Category: Main Distribution
Target version: 
URL: 


Imagine I have a Biopython trie:

from Bio import trie
import gzip

f = gzip.open('/tmp/trie.dat.gz', 'w')
tr = trie.trie()
# fill in the trie
trie.save(f, tr)  # pass the trie object, not the module
f.close()  # flush the gzip stream so the file is complete

Now /tmp/trie.dat.gz is about 50MB. Let's try to read it:

from Bio import trie
import gzip

f = gzip.open('/tmp/trie.dat.gz', 'r')
tr = trie.load(f)

Unfortunately I'm getting an uninformative error saying:
"loading failed for some reason"

Any hints?


