[Biojava-l] regex performance in Java

Andreas Prlic andreas at sdsc.edu
Mon Oct 22 20:42:00 UTC 2012

Hi Hilmar,

I can't say much about the performance of Java regular expressions, but
isn't it hard to write efficient regular expressions in any language?
There are great tools in the Java world for parsing XML, JSON, and
other standard file types that help avoid them. I am not sure if
this is a general rule for the wider Java community, but from my
perspective, regular expressions see only limited use in Java, mostly
when nothing else works... Not sure if anybody else has a
different experience?
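
To make the "avoid them where you can" point concrete, here is a minimal
sketch, not taken from BioJava. The class name HeaderParsing, both method
names, and the ">db|ACC0001|NAME" header layout are made up for
illustration. It pulls the second '|'-delimited field out of a header
line in two ways: with a java.util.regex.Pattern that is compiled once
and reused, and with plain String.indexOf calls that need no regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a BioJava class): extracts the second
// '|'-delimited field from a header such as ">db|ACC0001|NAME".
class HeaderParsing {

    // Compile once and reuse. Pattern instances are immutable and safe
    // to share; convenience methods such as String.matches() and
    // String.replaceAll() compile their regex again on every call.
    private static final Pattern ACCESSION =
            Pattern.compile("^>[^|]*\\|([^|]*)\\|");

    static String accessionByRegex(String header) {
        Matcher m = ACCESSION.matcher(header);
        return m.find() ? m.group(1) : null;
    }

    // Same result without a regex: locate the first two '|' characters
    // and take the text between them.
    static String accessionByIndexOf(String header) {
        if (!header.startsWith(">")) return null;
        int first = header.indexOf('|');
        if (first < 0) return null;
        int second = header.indexOf('|', first + 1);
        if (second < 0) return null;
        return header.substring(first + 1, second);
    }

    public static void main(String[] args) {
        String header = ">db|ACC0001|NAME made-up description";
        System.out.println(accessionByRegex(header));
        System.out.println(accessionByIndexOf(header));
    }
}
```

Which variant is faster on real data is something to measure rather than
assume; the sketch only shows that a fixed-delimiter format like this one
does not strictly need a regex, and that a pattern which is needed can be
compiled once outside the parsing loop.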


On Mon, Oct 22, 2012 at 7:52 AM, Hilmar Lapp <hlapp at drycafe.net> wrote:
> I know that this is really a Java language topic, but since parsing biological data formats is so rife with regular expression applications, I'm curious what the experience is among the Biojava people with the use of regular expressions in Java.
> They (at least as in java.util.regex) have been reported to me as performing much slower (by several orders of magnitude) than the regex implementation in Perl, and some simple benchmarking tests seem to bear that out. Even after scrutinizing the benchmark and finding nothing obvious, I'm still skeptical as to why this would be the case - naively I would have assumed that the underlying runtime library is implemented in C in both cases. But perhaps this is not true?
> Any experience people have had here speed-wise (or tricks or things not to do with Java regexes) would be appreciated.
>         -hilmar
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
> ===========================================================
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
