[GSoC] [BioRuby] GSoC week 2 status report
Raoul Bonnal
bonnal at ingm.org
Tue May 22 09:21:42 UTC 2012
Hi Clayton,
Well done, and thanks for your contributions to the BioRuby and JRuby communities.
For your computing issue, I have two possible solutions:
1) I can create a VM and give you access, but I need to contact my IT department first.
2) Could Amazon provide some VMs for our students?
On 21/05/12 17.50, "Clayton Wheeler" <cswh at umich.edu> wrote:
> Hi all,
>
> Here's my report on last week's work:
>
> http://csw.github.com/bioruby-maf/blog/2012/05/21/week_2_progress/
>
> This was my second week of work on my GSoC project, and the last week of the
> community bonding period before the official start of coding. A major focus
> of mine was BioRuby's phyloXML support; it uses libxml, which has been causing
> unit test failures under JRuby. In the end, the best course of action seemed
> to be to split the phyloXML support out into a separate plugin, which I have
> done as the bio-phyloxml gem. This will remove BioRuby's dependency on XML libraries
> entirely and that JRuby issue along with it. At the same time, users of the
> phyloXML code should be able to continue using it with no substantive changes.
>
> Separately, I began porting this phyloXML code to use Nokogiri instead of
> libxml-ruby, but ran into difficulties with this effort. While it is possible,
> and the library APIs are very similar, the code uses relatively low-level XML
> processing APIs in ways that seem to be sensitive to subtle differences in
> text node and namespace semantics between the two libraries. Substantial
> restructuring of the code and the addition of quite a few unit tests might be
> necessary to carry out such a port with confidence that the resulting code
> would work well.
>
> Also, someone else submitted a JRuby patch for JRUBY-6658, one of the major
> causes of BioRuby's unit test failures with JRuby; once a fix is integrated,
> we'll be close to having all the tests passing under JRuby.
>
> I identified another JRuby bug, JRUBY-6666, causing several unit test
> failures. This one affects BioRuby's code for running external commands, so it
> is likely to be encountered in production use. For this one, I also
> worked up a patch.
>
> I also spent some time preparing a performance testing environment, for
> evaluating existing MAF implementations as well as my own. This will be
> important, since I will be considering the use of an existing C parser. I will
> also want to ensure that the performance of my code is competitive with the
> alternatives. Lacking any hardware more powerful than a MacBook Air, I am
> setting this up with Amazon EC2. To simplify environment setup, I'll be using
> Chef. I've already set up a Chef repository with configuration logic, and some
> rudimentary code to streamline launching Ubuntu machines on EC2 and
> bootstrapping a Chef environment. To save money, I plan to make use of EC2
> Spot Instances, which are perfect for instances that only need to run for a
> few hours for batch tasks.
>
> Clayton Wheeler
> cswh at umich.edu
>
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby