[Biopython-dev] GSoC 2014: another aspirant here

Peter Cock p.j.a.cock at googlemail.com
Fri Mar 14 13:34:40 UTC 2014


On Fri, Mar 14, 2014 at 5:30 AM, Wibowo Arindrarto
<w.arindrarto at gmail.com> wrote:
> Hi Evan,
>
> Focusing on the SeqIO parsers is ok. That's where having lazy parsers
> would help most (and you've got a handful of formats there already).
> Remember that you'll also need to account for time to write tests,
> possibly benchmark or profile the code (lazy parsers should improve
> performance after all), and write documentation, outside of writing
> the code itself. You'll also want to be clear about this in your
> proposed timeline, since that will be your main guide during the
> coding period.
>
> Looking forward to reading your proposal :),
> Bow

Yes, profiling will be important here. If your script accesses all
of a record's annotation, sequence, etc., then the lazy parser
will probably be slower (all the same work, plus an overhead).
It should win when only a subset of the data is needed, both
in terms of speed and memory usage.
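
A minimal sketch of that trade-off, counting work rather than timing it.
This is not Biopython code: RAW, parse_field, EagerRecord and LazyRecord
are hypothetical names made up for illustration, and parse_field stands
in for whatever real parsing a format needs.

```python
# Toy "file record": field name -> unparsed text (hypothetical data).
RAW = {"id": " seq1 ", "seq": " ACGT ", "annotations": " source=demo "}


def parse_field(text):
    """Stand-in for real parsing work (assumption: parsing is the costly part)."""
    return text.strip()


class EagerRecord:
    """Parses every field as soon as the record is created."""

    def __init__(self, raw):
        self.parse_calls = 0
        self.fields = {}
        for name, text in raw.items():
            self.fields[name] = parse_field(text)
            self.parse_calls += 1

    def get(self, name):
        return self.fields[name]


class LazyRecord:
    """Parses a field only on first access, then caches the result."""

    def __init__(self, raw):
        self._raw = raw
        self._cache = {}
        self.parse_calls = 0
        self.cache_checks = 0  # the bookkeeping overhead paid on every access

    def get(self, name):
        self.cache_checks += 1
        if name not in self._cache:
            self._cache[name] = parse_field(self._raw[name])
            self.parse_calls += 1
        return self._cache[name]


eager = EagerRecord(RAW)
lazy = LazyRecord(RAW)

# Case 1: only a subset (the id) is needed.
eager.get("id")
lazy.get("id")
subset = (eager.parse_calls, lazy.parse_calls)

# Case 2: the script goes on to touch every field.
for name in RAW:
    lazy.get(name)
full = (eager.parse_calls, lazy.parse_calls, lazy.cache_checks)

print("subset access (eager, lazy) parse calls:", subset)
print("full access (eager, lazy, lazy cache checks):", full)
```

With only the id requested, the lazy record has parsed one field where
the eager one parsed three. Once every field has been touched, both have
done the same parsing and the lazy record has also paid for a cache
check on each access, which is the overhead that profiling would need
to measure on real files.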

Peter


