[Bioperl-l] SearchIO speed up
Chris Fields
cjfields at uiuc.edu
Fri Aug 11 13:48:59 UTC 2006
> Anyway, in your spare time, maybe you do similar speedups for other
> pieces of Bioperl? My personal favorite would be the GenBank/EMBL
> parsers. The fungal genome ORF files I'm working with are only 20M or
> so, but using Bioperl to work with them takes so much longer than with
> non-Bioperl on the 6M FASTA files for other genomes. I have to imagine
> it's mostly creating objects for the gazillion tags, 90% of which I
> never peek at.
I agree completely. Swissknife (lazy parsing of Swiss-Prot) was mentioned
here yesterday. We could use something similar for GenBank/EMBL. The code
for Swissknife was quite extensive but, really, so is SeqIO::genbank!
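
To make the lazy-parsing idea concrete, here is a minimal sketch in Python. It is not Swissknife or SeqIO::genbank code; the names LazyRecord and read_records are hypothetical, and it assumes GenBank-style records that end with a "//" line and keep feature keys at column 6. Each record holds its raw text and only builds the feature list the first time someone asks for it.

```python
class LazyRecord:
    """Holds the raw text of one flat-file record; parses sections on demand."""

    def __init__(self, raw_text):
        self.raw_text = raw_text
        self._features = None  # not parsed until first requested

    @property
    def accession(self):
        # Cheap scan for a single line; no feature objects are created.
        for line in self.raw_text.splitlines():
            if line.startswith("ACCESSION"):
                return line.split()[1]
        return None

    @property
    def features(self):
        # Parse the feature table once, on first access, then cache it.
        if self._features is None:
            self._features = self._parse_features()
        return self._features

    def _parse_features(self):
        features = []
        in_table = False
        for line in self.raw_text.splitlines():
            if line.startswith("FEATURES"):
                in_table = True
                continue
            if in_table:
                if line and not line[0].isspace():
                    break  # next top-level section (e.g. ORIGIN)
                parts = line.split()
                # Feature key lines have text at column 6; qualifier
                # lines are indented further and are skipped here.
                if len(line) > 5 and line[5] != " " and len(parts) == 2:
                    features.append((parts[0], parts[1]))
        return features


def read_records(handle):
    """Yield one LazyRecord per '//'-terminated record."""
    chunk = []
    for line in handle:
        chunk.append(line)
        if line.startswith("//"):
            yield LazyRecord("".join(chunk))
            chunk = []
```

A caller that only wants accessions never pays for the feature table, which is the case described above where most tags are never looked at.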
I also wanted to see how much using Bioperl's _readline() method slows
things down (my guess is not too dramatically, but for 20 MB files it may be
a problem).
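
One rough way to check that kind of guess is to time a plain line loop against the same loop routed through a wrapper method. The sketch below is in Python and does not call Bioperl at all; BufferedReader is a hypothetical stand-in that assumes the extra cost is one method call plus a pushback-buffer check per line. Real numbers for _readline() would have to come from profiling the Perl code itself.

```python
import io
import time


class BufferedReader:
    """Hypothetical stand-in for a reader with a pushback buffer."""

    def __init__(self, handle):
        self.handle = handle
        self.pushback = []

    def readline(self):
        # Serve pushed-back lines first, then fall through to the handle.
        if self.pushback:
            return self.pushback.pop()
        return self.handle.readline()


def count_direct(handle):
    # Baseline: iterate over the handle with no wrapper.
    count = 0
    for _ in handle:
        count += 1
    return count


def count_wrapped(handle):
    # Same work, but every line goes through the wrapper method.
    reader = BufferedReader(handle)
    count = 0
    while reader.readline():
        count += 1
    return count


def time_it(func, text):
    # Returns (line count, elapsed seconds) for one pass over the text.
    start = time.perf_counter()
    result = func(io.StringIO(text))
    return result, time.perf_counter() - start
```

Calling time_it(count_direct, text) and time_it(count_wrapped, text) on the same input gives two elapsed times to compare; the line counts should match, which confirms both loops did the same work.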
> I know, you folks are busy, and I should be volunteering to do it
> myself. But you can at least consider it a user request.
We can't promise anything! If you want, add a bit to the Bioperl release
page:
http://www.bioperl.org/wiki/Bioperl_Release
I would hold off on that request until post-1.6. Lots of other priorities
popping up.
Chris
> - Amir Karger
> Research Computing
> Bauer Center for Genomics Research
> Harvard University
...