[Bioperl-l] Announcing Bio::SFF

Leon Timmermans l.m.timmermans at students.uu.nl
Tue Dec 20 15:26:11 UTC 2011


On Mon, Dec 19, 2011 at 8:44 PM, Fields, Christopher J <
cjfields at illinois.edu> wrote:

> Kinda joining this a little late, but I think if there is a way to have a
> low-level parser/writer that generically parses the data into simple
> (possibly hash-tagged) data structures, that would be best.  Barring that,
> a very simple class for storing data.  We've found BioPerl objects/classes
> pretty heavy.
>
> (for an example of this, see Heng Li's readfq parser on github, which has
> some stats for Fastq/fasta parsing).
>
> Any way we can separate the parser from object instantiation would enable
> us to optimize the object/class layer and parser/writer layers separately,
> with the possible nice side effect of making the parser more broadly used.
>
> For instance, if someone wanted a faster parser, use the low level,
> otherwise use the higher level (possibly BioPerl-specific) API. Lincoln
> does this to a certain degree with Bio-samtools; I would go further and
> make the bp- and non-bp code in separate dists.
>

A good OO system can actually help make things faster. For example, I'm
unpacking the flowspace and quality data lazily, which made scanning
through an SFF file 2.5 to 3 times as fast, at only marginal extra cost
when you do need that data.
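
To illustrate the general idea of lazy unpacking, here is a minimal sketch.
It is written in Python for brevity and is not the Bio::SFF code; the class
name, field names, and sample values are hypothetical. It assumes flowgram
values are packed as big-endian 16-bit unsigned integers and quality scores
as single bytes. The raw bytes are stored untouched when the record is
created and decoded only on first access, so a scan that never looks at
them never pays for the decoding.

```python
import struct
from functools import cached_property


class LazyRead:
    """Hypothetical read record that decodes packed fields on demand."""

    def __init__(self, name, raw_flowgram, raw_quality):
        self.name = name
        # Keep the packed bytes as-is; nothing is decoded here.
        self._raw_flowgram = raw_flowgram  # assumed big-endian uint16 values
        self._raw_quality = raw_quality    # assumed one byte per score

    @cached_property
    def flowgram(self):
        # Decoded on first access, then cached on the instance.
        count = len(self._raw_flowgram) // 2
        return struct.unpack(f">{count}H", self._raw_flowgram)

    @cached_property
    def quality(self):
        # Iterating over bytes yields integers, one per quality score.
        return tuple(self._raw_quality)


# Made-up sample data, for illustration only.
read = LazyRead("read1", struct.pack(">3H", 101, 5, 198), bytes([40, 38, 35]))
print(read.name)      # no unpacking has happened yet
print(read.flowgram)  # unpacked here, on first use
```

Scanning code that only needs read names touches neither property, which
is where the saving comes from; code that does need the values pays the
decoding cost once per read.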

Leon


