[BioRuby] Parsing line-based formats with Ragel

Fields, Christopher J cjfields at illinois.edu
Mon Jun 4 00:56:18 UTC 2012


Have to agree. In cases where a Bio* project might run into problems with Ragel (Perl or Python), we can at least look at the grammar and use something similar in concept for those languages (e.g. Marpa for Perl), or take a more roundabout route and bind to C parsers generated by Ragel.
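
To make the "similar in concept" idea concrete, here is a minimal hand-written sketch in Ruby of what Artem describes in the quoted message: a state machine where small pieces of code run on state transitions. It is not Ragel output and is not taken from the bioragel repository. The names `parse_attributes` and `AttributeParseError` are hypothetical, and the input shape (`key=value;key=value`, roughly like a GFF3 attributes column) is a simplifying assumption that ignores escaping.

```ruby
# Hand-written sketch of a two-state machine with transition actions.
# NOT generated by Ragel; all names below are hypothetical.
class AttributeParseError < StandardError; end

# Scan a "key=value;key=value" string one character at a time and
# return a Hash of the pairs found.
def parse_attributes(text)
  attributes = {}
  state = :key          # current state: reading a key or reading a value
  key = +""
  value = +""

  text.each_char do |ch|
    case state
    when :key
      if ch == "="
        raise AttributeParseError, "empty key" if key.empty?
        state = :value                      # transition: key -> value
      elsif ch == ";"
        # An empty segment (";;") is tolerated; a key without "=" is not.
        raise AttributeParseError, "missing '=' after #{key.inspect}" unless key.empty?
      else
        key << ch
      end
    when :value
      if ch == ";"
        attributes[key] = value             # action on transition: store pair
        key = +""
        value = +""
        state = :key                        # transition: value -> key
      else
        value << ch
      end
    end
  end

  # End of input acts as a final transition.
  if state == :value
    attributes[key] = value
  elsif !key.empty?
    raise AttributeParseError, "missing '=' after #{key.inspect}"
  end
  attributes
end
```

Calling `parse_attributes("ID=gene1;Name=abc")` returns a Hash mapping "ID" to "gene1" and "Name" to "abc". With Ragel the states and transitions would be derived from the grammar instead of being written by hand; only the action bodies (here, storing the pair) would be language-specific.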

chris

On Jun 3, 2012, at 5:10 PM, Ben Woodcroft wrote:

> Just wanted to say thanks for pointing this out Artem - can definitely see
> myself using it in the future. If only you'd been a few days earlier!
> 
> Perhaps idealistically, the state machine might be written once, and then
> the last mile be implemented in multiple different Bio* projects.
> 
> On 4 June 2012 00:20, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> 
>> Trust a CS student to start on finite state machines. For us mere
>> mortals, here is a good write-up on Ragel principles for Rubyists
>> 
>> http://zedshaw.com/essays/ragel_state_charts.html
>> 
>> by the much loved Zed :)
>> 
>> Pj.
>> 
>> On Sat, Jun 02, 2012 at 05:06:12PM +0400, Artem Tarasov wrote:
>>> Hi guys,
>>> 
>>> I've recently discovered an absolutely cool thing called Ragel (
>>> http://www.complang.org/ragel/). It is a finite state machine compiler;
>>> its applications include parsing Cucumber features in Gherkin, parsing
>>> HTTP requests in Mongrel, and implementing pack/unpack functions in
>>> Rubinius.
>>> 
>>> It can be used to create a parser for any regular language, which
>>> includes nearly every line-based format. It generates code for C, C++,
>>> Objective-C, D(!), Java, and Go. The speed of the generated code is
>>> incredible.
>>> 
>>> I wrote a few words more about it in my blog:
>>> http://lomereiter.wordpress.com/2012/06/02/ragel-and-bioinformatics/
>>> 
>>> Basically, you write a formal grammar, define which snippets of code to
>>> execute on state transitions, and everything just works. As for me, I'm
>>> going to implement a SAM parser with this tool.
>>> 
>>> It can also be useful for Marjan. I wrote a GFF3 grammar, but it might be
>>> incorrect in some places. Here's a basic example of usage:
>>> https://github.com/lomereiter/bioragel/blob/master/examples/d/gff3.rl
>>> 
>>> 
>>> 
>>> --
>>> Artem
>>> _______________________________________________
>>> BioRuby Project - http://www.bioruby.org/
>>> BioRuby mailing list
>>> BioRuby at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioruby
>>> 
> 
> -- 
> Ben Woodcroft
> http://ecogenomic.org/users/ben-woodcroft <http://www.ecogenomic.org/>




