[Biojava-dev] parsing bottleneck

Matthew Pocock matthew_pocock at yahoo.co.uk
Wed Mar 5 11:23:41 EST 2003


Hi,

I've just run RefSeq through our parsers. It takes me
8m30s to process the 2.2Gb GenBank-formatted file
rscu.gbff, and the process uses between 180 and 200Mb
due to some whole Arabidopsis genomes being in there.

Top thinks the process is running at pretty much 100%
cpu for all that time. -Xprof reckons 16% of this is
in StringBuffer.charAt(), which I presume is being
called in the FeatureTableParser class.
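
To illustrate the kind of change that might help, here is a rough
sketch only. It is not the FeatureTableParser code: the class name
CharScanSketch and both methods are made up for illustration. It
assumes the hot loop reads a StringBuffer one character at a time.
StringBuffer.charAt() is a synchronized method with a bounds check on
every call, so one option is to copy the contents into a char[] once
with getChars() and scan the array instead.

```java
// Hypothetical sketch, not BioJava code: two ways to scan a StringBuffer.
class CharScanSketch {

    // Baseline: one synchronized StringBuffer.charAt() call per character.
    static int countViaCharAt(StringBuffer sb, char target) {
        int count = 0;
        for (int i = 0; i < sb.length(); i++) {
            if (sb.charAt(i) == target) {
                count++;
            }
        }
        return count;
    }

    // Alternative: copy the contents out once, then scan a plain array.
    static int countViaArray(StringBuffer sb, char target) {
        int len = sb.length();
        char[] chars = new char[len];
        sb.getChars(0, len, chars, 0); // single bulk copy
        int count = 0;
        for (int i = 0; i < len; i++) {
            if (chars[i] == target) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up feature-table-like text, used only to exercise both methods.
        StringBuffer sb = new StringBuffer("/gene=\"abc\" /note=\"x y\"");
        System.out.println(countViaCharAt(sb, '"')); // prints 4
        System.out.println(countViaArray(sb, '"'));  // prints 4
    }
}
```

Whether this actually helps would need to be checked with the
profiler on the real parser, since the extra copy costs time and
memory too.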

I'm going to have a tinker. If anybody has ideas,
please tell me.

Matthew

 


