[Bioperl-l] storing/retrieving a large hash on file system?

Sean Davis sdavis2 at mail.nih.gov
Tue May 18 16:39:44 UTC 2010


On Tue, May 18, 2010 at 12:09 PM, Sean Davis <sdavis2 at mail.nih.gov> wrote:

>
>
> On Tue, May 18, 2010 at 11:28 AM, Ben Bimber <bimber at wisc.edu> wrote:
>
>> This question is more of a general Perl one than a BioPerl-specific one,
>> so I hope it is appropriate for this list:
>>
>> I am writing code that has two steps.  The first generates a large,
>> complex hash describing mutations, and it takes a fair amount of time
>> to run.  The second step uses this data to perform downstream
>> calculations.  For the purposes of writing/debugging this downstream
>> code, it would save me a lot of time if I could run the first step
>> once and then store the hash on something like the file system.  That
>> way I could quickly load it while debugging the downstream code,
>> without waiting for the hash to be recreated.
>>
>> Is there a 'best practice' way to do something like this?  I could
>> save a tab-delimited file, which is human-readable but does not
>> represent the structure of the hash, so I would need code to re-parse
>> it.  I assume I could probably do something along the lines of dumping
>> a JSON string and then reading/decoding it.  That is easy, but not so
>> human-readable.  Is there another option I'm not thinking of?  What do
>> others do in this sort of situation?
>>
>> Thanks in advance.
>>
>>
> There are probably a number of solutions on CPAN.  This one may be off
> the beaten path, but it is getting a lot of press in the NoSQL database
> realm:
>
> http://1978th.net/tokyocabinet/
>
>
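For the record, the JSON route mentioned in the original question really is
only a few lines with the core JSON::PP module.  A sketch (the hash contents
and file name below are made-up stand-ins for the real mutation data):

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use JSON::PP;   # in core since Perl 5.14

# A small stand-in for the large mutation hash.
my %mutations = (
    chr1 => { 12345 => { ref => 'A', alt => 'G' } },
    chr2 => { 67890 => { ref => 'C', alt => 'T' } },
);

# Step 1: serialize once and save to disk.
# pretty => human-readable; canonical => stable key order for diffing.
my $json = JSON::PP->new->pretty->canonical;
open my $out, '>', 'mutations.json' or die "write: $!";
print {$out} $json->encode(\%mutations);
close $out;

# Step 2 (later runs): reload instead of recomputing.
open my $in, '<', 'mutations.json' or die "read: $!";
my $restored = $json->decode(do { local $/; <$in> });
close $in;

print $restored->{chr1}{12345}{alt}, "\n";   # prints "G"
```

With ->pretty the output is reasonably human-readable, which addresses part
of the original concern.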
Just to be clear, I am assuming that the problem at hand is storing
key/value pairs and then retrieving them later.  If what you are talking
about is a multi-level hash data structure, then Data::Dumper might be the
easiest way to go.
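For a nested structure, the Data::Dumper round trip looks roughly like this
(the hash contents, file name, and Dumper settings are just illustrative);
Storable's store/retrieve is the binary alternative if speed matters more
than readability:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;

my %big_hash = (
    gene_a => { mutations => [ 'A123T', 'G456C' ], score => 0.87 },
    gene_b => { mutations => [ 'C789G' ],          score => 0.42 },
);

# Dump the structure as valid Perl source; Terse drops the "$VAR1 ="
# prefix and Sortkeys keeps the output stable and diff-friendly.
local $Data::Dumper::Terse    = 1;
local $Data::Dumper::Sortkeys = 1;
open my $out, '>', 'big_hash.dump' or die "write: $!";
print {$out} Dumper(\%big_hash);
close $out;

# Later: re-evaluate the file to get the structure back.
# (Explicit "./" because '.' is no longer in @INC on modern perls.)
my $restored = do './big_hash.dump';
die "reload failed: $@" if $@ || !$restored;

print $restored->{gene_a}{score}, "\n";   # prints "0.87"
```

The dump file is plain Perl, so it stays human-readable and even editable
by hand, at the cost of eval-style loading.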

Sorry for the confusion....

Sean


