[BioRuby] Parsing large Blast xml files - a new bioruby plugin

Wed Jun 1 10:26:19 UTC 2011

What about automating this process on our wiki? :-)

$# gem search -r bio-

bio-assembly (0.1.0)
bio-blastxmlparser (0.6.1)
bio-bwa (0.2.2)
bio-cnls_screenscraper (0.1.0)
bio-emboss_six_frame_nucleotide_sequences (0.1.0)
bio-gem (0.2.2)
bio-genomic-interval (0.1.2)
bio-gex (0.0.0)
bio-gff3 (0.8.6)
bio-graphics (1.4)
bio-hello (0.0.0)
bio-isoelectric_point (0.1.1)
bio-kb-illumina (0.1.0)
bio-lazyblastxml (0.4.0)
bio-logger (0.9.0)
bio-nexml (0.0.1)
bio-octopus (0.1.1)
bio-samtools (0.2.1)
bio-sge (0.0.0)
bio-tm_hmm (0.2.0)
bio-ucsc-api (0.0.4)

Wow, quite a long list of plugins :-) I'm happy to see this soup boiling.
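
A rough sketch of how that automation might start, assuming the wiki accepts MediaWiki-style bullet lists (an assumption, as are the helper names parse_gem_list and to_wiki_list, which are made up for illustration). It only turns the text printed by `gem search -r bio-` into list lines; uploading to the wiki is left out.

```ruby
# Hypothetical helpers, not part of BioRuby or RubyGems.
# Matches lines such as "bio-gff3 (0.8.6)" in the output of `gem search`.
GEM_LINE = /\A(bio-[\w-]+) \(([^)]+)\)\z/

# Turn the raw text into [name, version] pairs.
# Lines that do not look like "name (version)" are skipped.
def parse_gem_list(text)
  text.each_line.map do |line|
    m = GEM_LINE.match(line.strip)
    [m[1], m[2]] if m
  end.compact
end

# Format the pairs as bullet lines (assumed wiki markup).
def to_wiki_list(gems)
  gems.map { |name, version| "* #{name} (#{version})" }.join("\n")
end

# Sample input copied from the list above; in real use the text
# could come from: output = `gem search -r bio-`
sample = <<~OUT
  bio-assembly (0.1.0)
  bio-blastxmlparser (0.6.1)
  bio-gff3 (0.8.6)
OUT

puts to_wiki_list(parse_gem_list(sample))
```

Keeping the parsing separate from the `gem` call makes it easy to check against saved output before pointing it at the live index.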

On 01/Jun/2011, at 10:49, Pjotr Prins wrote:

> The general idea is to have a number of 'blessed' plugins tied to
> BioRuby releases. A blessed plugin is supposed to be rather solid,
> and have a level of documentation and testing.
> 
> In addition there are 'development' plugins. Both should be listed on
> the plugin page. We are introducing that plumbing shortly. The
> duplication of work merely points out we need to get this done ;)
> 
> It is interesting to note both XML parsers use lazy iterators. I also
> do lazy conversions. Same for my GFF3 plugin. Rob, it would be good to
> compare performance on some real-life data.
> 
> Pj.
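
For readers new to the idea, here is a minimal sketch of a lazy iterator with lazy conversion. It is not how either plugin is implemented. It uses only the Ruby standard library and assumes, for simplicity, that every tag sits on its own line; a real parser would use a streaming XML library. The names each_hit and LazyHit are hypothetical.

```ruby
require "stringio"

# Hypothetical sketch: yield one <Hit>...</Hit> chunk at a time,
# so the whole file is never held in memory.
# Assumes each tag is on its own line (not true of XML in general).
def each_hit(io)
  return enum_for(:each_hit, io) unless block_given?

  buffer = nil
  io.each_line do |line|
    buffer = +"" if line.include?("<Hit>")
    buffer << line if buffer
    if buffer && line.include?("</Hit>")
      yield buffer
      buffer = nil
    end
  end
end

# Lazy conversion: fields are extracted from the raw text, and
# converted to Ruby values, only when first requested.
class LazyHit
  def initialize(raw)
    @raw = raw
  end

  def id
    @id ||= @raw[%r{<Hit_id>(.*?)</Hit_id>}, 1]
  end

  def length
    @length ||= @raw[%r{<Hit_len>(\d+)</Hit_len>}, 1].to_i
  end
end

xml = StringIO.new(<<~XML)
  <Iteration_hits>
  <Hit>
  <Hit_id>gi|1</Hit_id>
  <Hit_len>120</Hit_len>
  </Hit>
  <Hit>
  <Hit_id>gi|2</Hit_id>
  <Hit_len>300</Hit_len>
  </Hit>
  </Iteration_hits>
XML

# Only the first hit is read; the rest of the input is left untouched.
first = each_hit(xml).lazy.map { |raw| LazyHit.new(raw) }.first
puts "#{first.id} #{first.length}"
```

Because the enumerator stops as soon as the caller has what it needs, the cost scales with the number of hits consumed rather than with the size of the file.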
> 
> On Wed, Jun 01, 2011 at 04:33:36PM +0800, Rob Syme wrote:
>> I think that the list at
>> http://bioruby.open-bio.org/wiki/BioRuby_Plugins is pretty
>> comprehensive, my mistake was simply not looking.
>> -r
>> 
>> 
>> On Wed, Jun 1, 2011 at 4:25 PM, Philipp Comans
>> <philipp.comans at googlemail.com> wrote:
>>> Hi,
>>> 
>>> I had a similar problem recently. I needed an efficient parser for Blast XML results and I discovered that the default parser in BioRuby was not suitable. So I wrote my own using Nokogiri.
>>> In my opinion it is way too hard at the moment to discover BioPlugins. When people use the default XML or GFF parser that comes with BioRuby, they do not expect that there is another, more efficient version. There should be a section on the front page or even in the corresponding parts of the API documentation that makes people aware of the existence of these efficient parsers.
>>> 
>>> BTW thank you all for BioRuby, I used it in a project recently and it made my life tremendously easier.
>>> 
>>> Cheers,
>>> 
>>> Philipp
>>> 
>> 
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

--
The only change to succeed is starting from a simple thing.