[Biopython-dev] sff reader

Blanca Postigo Jose Miguel jblanca at btc.upv.es
Fri Aug 14 06:01:42 UTC 2009


Quoting Peter <biopython at maubp.freeserve.co.uk>:

> On Thu, Aug 13, 2009 at 8:40 PM, Blanca Postigo Jose
> Miguel<jblanca at btc.upv.es> wrote:
> >
> >> This will dovetail nicely with the indexing support in Bio.SeqIO
> >> which I am working on for Biopython 1.52, branch on github.
> >> I expect to have fast random access to reads in an SFF file
> >> very soon. See http://github.com/peterjc/biopython/tree/convert
> >
> > I've written some code to solve a similar problem. Maybe you
> > could take a look at it. It's in the classes FileIndex and
> > FileSequenceIndex at:
> >
> >
> > http://bioinf.comav.upv.es/svn/biolib/biolib/src/biolib/biolib_seqio_utils.py
> >
>
> Did you see this thread?
> http://lists.open-bio.org/pipermail/biopython/2009-June/005281.html
>
> The coding style is quite different, but it looks like the essential
> idea is the same - we both scan the file to find each record, and use
> a dictionary to record the offset. Interestingly you and Peio also
> keep the record's length in the dictionary, which will double the
> memory requirements - for something you don't actually need.
>
> Peter
>
> P.S. You can forward or CC this back to the list if you like.

We keep the record length to be able to return the record without having to scan
the file again.
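
To make the trade-off concrete, here is a minimal sketch of the idea, not
the actual FileIndex code from biolib nor the Bio.SeqIO indexing code. It
assumes a FASTA-like file opened in binary mode where every record starts
with a ">" header line that has an identifier; build_index and get_raw are
hypothetical names used only for illustration.

```python
import io


def build_index(handle):
    """Scan the file once and map record id -> (offset, length) in bytes.

    Assumption: FASTA-like input, each record starts with b">" plus an id.
    """
    index = {}
    current_id = None
    start = 0
    while True:
        offset = handle.tell()
        line = handle.readline()
        # A new header or the end of the file closes the previous record.
        if not line or line.startswith(b">"):
            if current_id is not None:
                index[current_id] = (start, offset - start)
            if not line:
                break
            current_id = line[1:].split()[0].decode()
            start = offset
    return index


def get_raw(handle, index, record_id):
    """Return the raw bytes of one record with a single seek and read."""
    offset, length = index[record_id]
    handle.seek(offset)
    return handle.read(length)


# Small in-memory example standing in for a real file.
data = b">r1 desc\nACGT\nTT\n>r2\nGG\n"
handle = io.BytesIO(data)
index = build_index(handle)
raw_r1 = get_raw(handle, index, "r1")
```

With only the offset stored, get_raw would have to read forward line by
line until it met the next header; storing the length avoids that at the
cost of one extra integer per record.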

Jose Blanca


