[Bioperl-l] bioperl-db

Aaron J. Mackey <amackey@virginia.edu>
Wed, 13 Mar 2002 08:57:56 -0500 (EST)


On Wed, 13 Mar 2002, Ewan Birney wrote:

> Heikki is, I believe, going to look into profiling Rec::Descent,
> which may be faster (we effectively have a hand-coded LLR parser in
> embl/genbank and I can't believe we've done a good job).

I went down this path about a year ago when Jason asked me to think about
rewriting the FTHelper code under Parse::RecDescent.  The "naive" parser
one first writes is full of logical specification that is very easy for a
human to read and change, but the state machine that gets built from it is
a bit redundant.  In other words, it's slow as all hell.  "Tuning" the
grammar by optimizing away lots of unnecessary states helps speed
considerably, but it all ends up turning into something a lot less
readable (and looking more and more like the original mass of regexps).
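
To make the trade-off concrete, here is a minimal sketch in Python (not
Perl, and not the actual FTHelper or Parse::RecDescent code).  It assumes
a simplified feature-location syntax with only "join(...)",
"complement(...)" and "start..end" ranges; every function name is
hypothetical.  The first parser spells out one function per grammar rule;
the second collapses everything into one regexp and loses generality.

```python
import re

# Readable version: one small function per grammar rule.
#   location := "complement(" location ")"
#             | "join(" location ("," location)* ")"
#             | range
#   range    := INT ".." INT

def expect(text, pos, token):
    """Check that `token` occurs at `pos`; return the position after it."""
    if not text.startswith(token, pos):
        raise ValueError(f"expected {token!r} at offset {pos}")
    return pos + len(token)

def parse_int(text, pos):
    """Read a run of digits starting at `pos`."""
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    if end == pos:
        raise ValueError(f"expected a digit at offset {pos}")
    return int(text[pos:end]), end

def parse_range(text, pos):
    start, pos = parse_int(text, pos)
    pos = expect(text, pos, "..")
    end, pos = parse_int(text, pos)
    return [(start, end, 1)], pos

def parse_location(text, pos=0):
    """Return ([(start, end, strand), ...], next_position)."""
    if text.startswith("complement(", pos):
        spans, pos = parse_location(text, pos + len("complement("))
        pos = expect(text, pos, ")")
        # Flip the strand of everything inside the complement.
        return [(s, e, -strand) for s, e, strand in spans], pos
    if text.startswith("join(", pos):
        spans, pos = parse_location(text, pos + len("join("))
        while text.startswith(",", pos):
            more, pos = parse_location(text, pos + 1)
            spans += more
        pos = expect(text, pos, ")")
        return spans, pos
    return parse_range(text, pos)

# "Tuned" version: the grammar is gone, replaced by a single regexp pass.
# It only notices a complement() wrapped around the whole location, so it
# is less general than the rule-per-function parser above.
RANGE_RE = re.compile(r"(\d+)\.\.(\d+)")

def parse_location_fast(text):
    strand = -1 if text.startswith("complement(") else 1
    return [(int(a), int(b), strand) for a, b in RANGE_RE.findall(text)]
```

On an input such as "complement(join(12..78,134..202))" the two functions
agree, but for a nested case like "join(1..5,complement(7..9))" only the
rule-per-function parser tracks the strand correctly, which is the kind of
readability and generality one gives up when tuning.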

I briefly looked into writing a traditional C-based, compiled yacc grammar
(which could then be bootstrapped in via Inline::C), but as always, my
thesis committee somehow thinks that real experiments are more important
than open source software :(

-Aaron

-- 
 Aaron J Mackey
 Pearson Laboratory
 University of Virginia
 (434) 924-2821
 amackey@virginia.edu