[BioRuby] BlastXML parser outputs RDF/JSON etc.

Pjotr Prins pjotr.public14 at thebird.nl
Sat Sep 6 12:08:36 UTC 2014


Hi all,

One of my oldest gems gets a new life :)

I revamped the bioblastxml parser to produce any type of RDF, JSON,
CSV etc. All you need to do is write an ERB template (or use an
existing one). JSON example:

  https://github.com/pjotrp/blastxmlparser/blob/master/template/blast2json.erb
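
To give a feel for how little an ERB output template needs, here is a minimal sketch using only Ruby's standard library. It is not the linked blast2json.erb: the Hit struct, its field names and the render_hit helper are assumptions made up for illustration, and result_with_hash requires Ruby 2.5 or later.

```ruby
require 'erb'
require 'json'

# Hypothetical hit record; the field names are illustrative and are not
# taken from the blastxmlparser API.
Hit = Struct.new(:hit_id, :evalue, :bit_score)

# One JSON object per hit. Calling to_json on each value takes care of
# quoting and escaping, so the template only supplies the structure.
JSON_TEMPLATE = ERB.new(<<~TEMPLATE)
  { "id": <%= hit.hit_id.to_json %>, "evalue": <%= hit.evalue.to_json %>, "bit_score": <%= hit.bit_score.to_json %> }
TEMPLATE

# Render a single hit through the template (hypothetical helper).
def render_hit(hit)
  JSON_TEMPLATE.result_with_hash(hit: hit).strip
end

puts render_hit(Hit.new('gi|12345', 1.0e-30, 250.5))
```

Switching to CSV or RDF would then only mean swapping the template text; the parsing code stays the same.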

The bioblastxml parser also makes use of multicore parallelism.
It is probably one of the fastest BLAST XML parsers around.

I consider a command line interface, lazy parsing, parallelism
through the Parallel gem and flexible output with ERB to be core
strategies for bioinformatics gems. The good news is that it is
surprisingly easy to do!
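
One way to read "lazy parsing" is that a record keeps its raw XML and only extracts a field when it is first asked for. The sketch below illustrates that idea under stated assumptions: the LazyHit class and its field method are hypothetical, and the regular expression is a stand-in for a real XML parser, so this is not how the gem itself is implemented.

```ruby
# Hypothetical record that holds on to the raw XML fragment and extracts
# a field only on first access, caching the result afterwards.
class LazyHit
  def initialize(xml)
    @xml = xml
    @cache = {}
  end

  # Return the text inside <tag>...</tag>, or nil when the tag is absent.
  # A regex keeps the sketch dependency-free; real code would use an XML
  # parser instead.
  def field(tag)
    @cache.fetch(tag) do
      name = Regexp.escape(tag)
      match = @xml.match(%r{<#{name}>(.*?)</#{name}>}m)
      @cache[tag] = match && match[1]
    end
  end
end

hit = LazyHit.new('<Hit><Hit_id>gi|1</Hit_id><Hit_len>42</Hit_len></Hit>')
puts hit.field('Hit_id')
```

Fields that a filter or template never touches are never extracted, which is where the saving would come from on large BLAST outputs.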

Have a look at the source code:

  https://github.com/pjotrp/blastxmlparser/blob/master/bin/blastxmlparser

and the README

  https://github.com/pjotrp/blastxmlparser
  
When the Parallel gem is not found, the parser falls back to a single thread.
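
A fallback like this can be sketched in a few lines. The helper names parallel_available? and map_maybe_parallel are assumptions for illustration and are not taken from the blastxmlparser source; Parallel.map is the Parallel gem's own call.

```ruby
# Report whether the Parallel gem can be loaded (hypothetical helper).
def parallel_available?
  require 'parallel'
  true
rescue LoadError
  false
end

# Map a block over items, using the Parallel gem when it is installed
# and a plain serial map otherwise. Result order is the same either way.
def map_maybe_parallel(items, &block)
  if parallel_available?
    Parallel.map(items, &block)
  else
    items.map(&block)
  end
end

lengths = map_maybe_parallel(%w[ACGT ACG AC]) { |seq| seq.length }
puts lengths.inspect
```

Because the gem is optional, the same script runs unchanged on machines where it is not installed, just more slowly.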

I'll add these features to the bio-vcf and bio-table gems too in the
near future, with releases to match.

Pj.


