[BioRuby] Parsing line-based formats with Ragel

Pjotr Prins pjotr.public14 at thebird.nl
Mon Jun 4 05:17:45 UTC 2012


On Mon, Jun 04, 2012 at 12:56:18AM +0000, Fields, Christopher J wrote:
> Have to agree, and in cases where a Bio* might run into problems
> with Ragel (Perl or Python) we can at least look at the grammar and
> use something for those languages that is similar in concept (e.g.
> Marpa for Perl), or go a little more roundabout and bind to
> C-generated ones from Ragel.

Also agree. Parsing is a common theme in Bio*. A state engine would
be a great abstraction, targeting C or D, and even the interpreted
languages. The SAM parser would be a great proof-of-concept. I am
also very interested to see how it will perform against samtools.

The spanner in the works may be that we tend to be very sloppy about
standards, so files in the wild often deviate from the spec. Relaxed
parsers may therefore also be needed.
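
To make the strict versus relaxed idea concrete, here is a minimal
hand-written sketch of a two-state line parser for the 11 mandatory SAM
alignment fields. It is not Ragel output and not a proposed BioRuby API:
the function name parse_sam_line, the relaxed flag, and the reading of
"relaxed" as "also accept spaces and repeated separators" are all
assumptions made for illustration.

```python
# Illustrative sketch only: a hand-written state machine, not generated by
# Ragel. The names parse_sam_line and relaxed are hypothetical.

# The 11 mandatory columns of a SAM alignment line, in order.
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
INT_FIELDS = {"FLAG", "POS", "MAPQ", "PNEXT", "TLEN"}

# Two states: inside a field, or just after a separator (also the start state).
IN_FIELD, IN_SEPARATOR = range(2)


def parse_sam_line(line, relaxed=False):
    """Parse one alignment line into a dict of the mandatory fields.

    Strict mode accepts only single tabs between fields. Relaxed mode
    (an assumption about what "relaxed" could mean) also accepts spaces
    and runs of separators.
    """
    separators = "\t " if relaxed else "\t"
    state = IN_SEPARATOR
    fields, current = [], []

    for ch in line.rstrip("\r\n"):
        if state == IN_FIELD:
            if ch in separators:
                # End of a field: emit it and wait for the next one.
                fields.append("".join(current))
                current = []
                state = IN_SEPARATOR
            else:
                current.append(ch)
        else:  # IN_SEPARATOR
            if ch in separators:
                if not relaxed:
                    raise ValueError("empty field")
                # Relaxed: silently swallow the extra separator.
            else:
                current.append(ch)
                state = IN_FIELD

    if state == IN_FIELD:
        fields.append("".join(current))
    elif not relaxed:
        raise ValueError("empty field at end of line")

    if len(fields) < len(SAM_FIELDS):
        raise ValueError("expected at least %d fields, got %d"
                         % (len(SAM_FIELDS), len(fields)))

    record = dict(zip(SAM_FIELDS, fields))
    for name in INT_FIELDS:
        record[name] = int(record[name])  # ValueError on a non-integer
    # Anything past column 11 is kept as raw optional TAG:TYPE:VALUE text.
    record["OPTIONAL"] = fields[len(SAM_FIELDS):]
    return record
```

Called as parse_sam_line(line), a space-separated line is rejected;
with relaxed=True the same line is accepted. One caveat of this
particular relaxation is that optional fields whose values contain
spaces would be split, so a real relaxed parser would need a more
careful rule. A Ragel grammar could express the same two machines
declaratively and generate the C or D code from it.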

Pj.


