[BioRuby] GSoC weekly status report No.1.1

Marjan Povolni marian.povolny at gmail.com
Sat May 12 19:46:46 UTC 2012


Hi all,

Here is my status report for this week:

This year we, the GSoC students, sure are a very creative group. Just look
at our numbering schemes for our status reports for the pre-coding period -
everyone has their own thing going :)

And now back to the GFF3 project. I found a few more sites with big GFF3
files, which will be great for performance testing. Robert Buels also
suggested that I reuse the test suite from Perl’s
Bio::GFF3::LowLevel::Parser, and I think that’s a great idea. I will
definitely use it for completeness testing, and I will also check the test
suites of other GFF3 parsers.

I have also finished the work for the first week. That means basically I’m
already more than two weeks ahead of schedule. The parser is now reading
data on the D side and forwarding it to Ruby line by line. That won’t be
faster than reading the file from Ruby, but it’s a nice basic case to get
data flowing from D to Ruby.
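
For comparison, a pure-Ruby baseline for the "reading the file from Ruby"
case might look like the sketch below. It is not the project's code: the
helper name each_feature_line and the sample data are made up for
illustration, and it assumes that blank lines and lines starting with "#"
can simply be skipped.

```ruby
require "tempfile"

# Hypothetical helper: yields each feature line of a GFF3 file to the block,
# skipping blank lines and lines starting with "#" (directives and comments).
# Returns the number of lines yielded.
def each_feature_line(path)
  count = 0
  File.foreach(path) do |line|
    line = line.chomp
    next if line.empty? || line.start_with?("#")
    yield line
    count += 1
  end
  count
end

# Usage with a tiny sample file: print the feature type (third column).
Tempfile.create("sample.gff3") do |file|
  file.write("##gff-version 3\nctg123\t.\tgene\t1000\t9000\t.\t+\t.\tID=gene00001\n")
  file.flush
  each_feature_line(file.path) { |line| puts line.split("\t")[2] }
end
```

A D-backed reader that forwards lines one at a time could expose the same
block-based interface, which would make timing the two against the same big
file straightforward.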

The rake tasks have been improved too. There are now two tasks for building
the D library, “compile” and “compiledebug”, plus the “spec” task for
running rspec tests and the “features” task for running cucumber tests. The
“clean” task now deletes object and library files.
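
A Rakefile with these five tasks could be laid out roughly as follows. This
is only a sketch under assumptions: the task names come from the report, but
the source glob, the library name libgff3.so, the dmd_command helper and the
choice of dmd flags are all hypothetical and may not match the real build.

```ruby
require "rake"
extend Rake::DSL # only needed when run as a plain script instead of via rake

# Hypothetical helper: builds the dmd command line for a shared library.
# Debug builds keep symbols and debug code; normal builds are optimized.
def dmd_command(sources, output, debug: false)
  mode = debug ? "-g -debug" : "-O -release"
  "dmd #{mode} -shared -fPIC -of#{output} #{sources.join(' ')}"
end

# Assumed layout: D sources under lib/, one shared library as the result.
d_sources = FileList["lib/**/*.d"]
library   = "libgff3.so"

desc "Build the D library (optimized)"
task :compile do
  sh dmd_command(d_sources, library)
end

desc "Build the D library with debug information"
task :compiledebug do
  sh dmd_command(d_sources, library, debug: true)
end

desc "Run rspec tests"
task :spec do
  sh "rspec"
end

desc "Run cucumber tests"
task :features do
  sh "cucumber"
end

desc "Delete object and library files"
task :clean do
  rm_f FileList["**/*.o", "**/*.a", "**/*.so"].to_a
end
```

Keeping the command construction in one helper means “compile” and
“compiledebug” differ only in a single flag group.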

There is also a problem with the D library and the garbage collector. It
seems to be the problem Iain Buclaw (one of the GDC developers) warned us
about. With a D shared library, the first time the GC kicks in it appears
to collect all the static data, for example the per-module variables, and
pretty much everything else: even a chunk of memory allocated with malloc
and registered with the GC still gets collected. Or at least that’s what it
looks like. However, Iain also assured us that this will be solved by the
end of this month or the beginning of the next. My cucumber and rspec tests
still work because they don’t use enough memory for the GC to run, but to
make sure this issue doesn’t interfere with development at this point, I
manually disabled the GC on library initialization. I haven’t tried it
myself yet, but from what has been discussed in the forums, both 32-bit and
64-bit DLLs built with DMD on Windows work fine.

I also helped Pjotr with getting our blog posts included in the RSS feed on
biogems.info.


That's all for now. You can find this report on my blog too:

http://blog.mpthecoder.com/post/22919943701/gsoc-weekly-status-report-no-1-1

--
Best regards,
Marjan



