[Biopython-dev] Lazy-loading parsers, was: Biopython GSoC 2013 applications via NESCent

Zhigang Wu zhigang.wu at email.ucr.edu
Thu May 2 08:14:04 UTC 2013


Hi Alex,

The idea of taking advantage of multiprocessing is great. I haven't touched
this kind of thing before and I think it's going to be cool to integrate
into the project.

Best,

Zhigang


On Wed, May 1, 2013 at 3:56 PM, Alex Leach <albl500 at york.ac.uk> wrote:

> Dear all,
>
> I also left some minor comments on the proposal; I hope they're helpful
> and I wish you every success!
>
> You should focus on the proposal for now, but I thought I'd share a more
> presentable version of the fasta lazy-loader I wrote a couple of years ago.
> The focus at the time was to minimise memory usage and increase the speed
> of random access to fasta-formatted sequences, stored on disk. Only
> sequence accessions and file locations are stored in-memory (in a dict).
> Once the index has been populated, it can 'pickle' the dictionary to a file
> on disk, for later re-use.
>
> It doesn't exactly fulfil all of your needs, but I hope it might point you
> in the right direction.
>
> Also, were there plans for making the lazy loader thread-safe? I've done
> it in the past by passing a `multiprocessing.Pipe` instance to a method
> (`pipe_sequences`) of the lazy loader. If redesigning the code, I'd try to
> implement a callback scheme, but passing a Pipe did the job. Maybe it's
> outside the current scope of the project, but anyway, I put the module up
> on GitHub if you want to check it out [1].
>
>
> Cheers,
> Alex
>
>
> [1] - https://github.com/alexleach/fasta_lazy_loader/blob/master/fasta_lazy_loader.py
>
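
For readers following the thread, the indexing scheme Alex describes (only
accessions and file locations held in a dict, with the dict pickled for later
re-use) can be sketched roughly as below. This is a minimal illustration and
not the code from fasta_lazy_loader.py: the function names build_index, fetch,
save_index and load_index are hypothetical, and it assumes the accession is
the first whitespace-delimited token of each '>' header line.

```python
import pickle


def build_index(fasta_path):
    """Map each accession to the byte offset of its '>' header line."""
    index = {}
    with open(fasta_path, "rb") as handle:
        while True:
            offset = handle.tell()  # position before reading the line
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                fields = line[1:].split()
                # Assumption: the accession is the first token of the header.
                accession = fields[0].decode() if fields else ""
                index[accession] = offset
    return index


def fetch(fasta_path, index, accession):
    """Seek straight to one record and return its sequence as a string."""
    parts = []
    with open(fasta_path, "rb") as handle:
        handle.seek(index[accession])
        handle.readline()  # skip the header line
        for line in handle:
            if line.startswith(b">"):
                break  # reached the next record
            parts.append(line.strip())
    return b"".join(parts).decode()


def save_index(index, index_path):
    """Pickle the accession -> offset dict so it need not be rebuilt."""
    with open(index_path, "wb") as out:
        pickle.dump(index, out)


def load_index(index_path):
    """Load a previously pickled index."""
    with open(index_path, "rb") as inp:
        return pickle.load(inp)
```

Only the dict lives in memory; each call to fetch reads a single record from
disk. Duplicate accessions would silently overwrite each other in this
sketch, and a pickled index should only be loaded from a trusted source.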

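The Pipe-based approach Alex mentions could look something like the sketch
below. The pipe_sequences name comes from his message, but the signature and
body here are assumptions made for illustration, as is the read_in_worker
helper; `offsets` is assumed to be a dict of accession -> byte offset of the
record's header line. The worker opens its own file handle, so no handle is
shared between processes.

```python
from multiprocessing import Pipe, Process


def pipe_sequences(fasta_path, offsets, conn):
    """Send (accession, sequence) pairs down conn, then None as a sentinel."""
    try:
        with open(fasta_path, "rb") as handle:
            for accession, offset in offsets.items():
                handle.seek(offset)
                handle.readline()  # skip the '>' header line
                parts = []
                for line in iter(handle.readline, b""):
                    if line.startswith(b">"):
                        break  # reached the next record
                    parts.append(line.strip())
                conn.send((accession, b"".join(parts).decode()))
    finally:
        conn.send(None)  # tell the receiver that nothing more is coming
        conn.close()


def read_in_worker(fasta_path, offsets):
    """Run pipe_sequences in a child process and collect what it sends."""
    # duplex=False gives (receive-only end, send-only end).
    parent_conn, child_conn = Pipe(duplex=False)
    worker = Process(target=pipe_sequences,
                     args=(fasta_path, offsets, child_conn))
    worker.start()
    child_conn.close()  # the parent does not need its copy of the send end
    records = {}
    for accession, sequence in iter(parent_conn.recv, None):
        records[accession] = sequence
    worker.join()
    return records
```

On platforms that start child processes with "spawn" (Windows, recent macOS),
the call to read_in_worker should sit under an `if __name__ == "__main__":`
guard. A callback scheme, as Alex suggests, would replace the conn argument
with a function called once per record.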

