[Biojava-l] regex performance in Java

Hilmar Lapp hlapp at drycafe.net
Mon Oct 22 14:52:24 UTC 2012

I know that this is really a Java language topic, but since parsing biological data formats is so rife with regular expression applications, I'm curious what experience the Biojava people have had with the use of regular expressions in Java.

They (at least the java.util.regex implementation) have been reported to me as performing several orders of magnitude slower than the regex implementation in Perl, and some simple benchmarking tests seem to bear that out. Even after scrutinizing the benchmark and finding nothing obviously wrong, I'm still skeptical that this should be the case: naively, I would have assumed that the underlying runtime library is implemented in C in both cases. But perhaps this is not true?

Any experience people have had here speed-wise (or tricks, or things not to do with Java regexes) would be appreciated.
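
To make the "things not to do" part concrete, here is a minimal sketch of one commonly cited pitfall: String.matches() compiles its regex on every call, whereas a Pattern compiled once can be reused for every line. This is only an illustration, not the benchmark mentioned above; the class name, method names, and the FASTA-style header pattern are all made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexReuseSketch {
    // Hypothetical pattern for a FASTA-style header line: '>' followed by an identifier.
    private static final String HEADER_REGEX = "^>(\\S+).*$";

    // Compiled once and shared; Pattern instances are immutable and thread-safe.
    private static final Pattern HEADER = Pattern.compile(HEADER_REGEX);

    // Variant to avoid in hot loops: String.matches() recompiles the regex on each call.
    static int countHeadersRecompiling(List<String> lines) {
        int n = 0;
        for (String line : lines) {
            if (line.matches(HEADER_REGEX)) {
                n++;
            }
        }
        return n;
    }

    // Variant that reuses the precompiled Pattern and a single Matcher via reset().
    static int countHeadersPrecompiled(List<String> lines) {
        int n = 0;
        Matcher m = HEADER.matcher("");
        for (String line : lines) {
            if (m.reset(line).matches()) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // Made-up sample input, for illustration only.
        List<String> lines = Arrays.asList(">seq1 description", "ACGT", ">seq2", "TTGA");
        System.out.println(countHeadersRecompiling(lines));
        System.out.println(countHeadersPrecompiled(lines));
    }
}
```

Both methods return the same count; they differ only in how often the regex is compiled. Whether this accounts for any of the gap relative to Perl is an open question here, not a claim.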

: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
