[BioRuby] [GSoC][NeXML and RDF API] Update

Anurag Priyam anurag08priyam at gmail.com
Thu Jun 3 09:00:06 UTC 2010


Hello all,

I know this update is coming quite late; sorry for holding it back for so
long. From now on I will post a progress update to this list every week. Just
to keep everyone in the loop, [1] is my project page.

What has been done?
So far I have done a significant amount of work on the NeXML parser. It
currently recognizes the otus, otu and trees elements. The trees
implementation does not yet cover the full NeXML schema: trees with multiple
rootings, coalescent trees and networks remain to be done.

Problems Faced:
Initially it was decided to stream-parse NeXML documents, since DOM parsing
would be slow for larger documents. But with NeXML's non-linear design,
streaming feels unnatural and is proving a little difficult. Currently,
I have written a wrapper over the StAX-style (pull) parsing API of libxml,
but the entire document is still parsed in one go, at the start.
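
To illustrate why streaming is awkward here, below is a minimal sketch, not
the code in my branch. It assumes a simplified, hand-written NeXML-like
fragment (not schema-valid) and uses REXML's SAX-style stream listener from
the Ruby standard distribution instead of libxml. The class name
NexmlSketchListener and the fragment are made up for illustration. The point
is that elements refer to each other by id, so even a streaming parser has to
keep lookup tables around to resolve those references.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Simplified NeXML-like fragment, written by hand for this sketch.
NEXML = <<~XML
  <nexml>
    <otus id="taxa1">
      <otu id="t1" label="Homo sapiens"/>
      <otu id="t2" label="Pan troglodytes"/>
    </otus>
    <trees id="trees1" otus="taxa1">
      <tree id="tree1">
        <node id="n1" root="true"/>
        <node id="n2" otu="t1"/>
        <node id="n3" otu="t2"/>
        <edge id="e1" source="n1" target="n2"/>
        <edge id="e2" source="n1" target="n3"/>
      </tree>
    </trees>
  </nexml>
XML

# Hypothetical listener: indexes otu labels by id as they stream past,
# then resolves each node's otu reference against that index.
class NexmlSketchListener
  include REXML::StreamListener

  attr_reader :otu_labels, :node_labels, :edges

  def initialize
    @otu_labels  = {}  # otu id  => label
    @node_labels = {}  # node id => label of the referenced otu
    @edges       = []  # [source node id, target node id]
  end

  def tag_start(name, attrs)
    case name
    when 'otu'
      @otu_labels[attrs['id']] = attrs['label']
    when 'node'
      # Internal nodes carry no otu attribute and are skipped here.
      @node_labels[attrs['id']] = @otu_labels[attrs['otu']] if attrs['otu']
    when 'edge'
      @edges << [attrs['source'], attrs['target']]
    end
  end
end

listener = NexmlSketchListener.new
REXML::Document.parse_stream(NEXML, listener)
p listener.node_labels
p listener.edges
```

This only works because every referenced otu appears before the node that
uses it. A reference that points forward in the document would need a second
pass or deferred resolution, which is where streaming stops feeling natural.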

The current git head [2] can be built and the code tested. A tutorial (of
sorts) on how to use the NeXML parser can be found at [3].

[1] https://www.nescent.org/wg_phyloinformatics/Category:NeXML_and_RDF_API_for_BioRuby
[2] http://github.com/yeban/bioruby
[3] https://www.nescent.org/wg_phyloinformatics/NeXML_and_RDF_API_for_BioRuby

-- 
Anurag Priyam,
2nd Year Undergraduate,
Department of Mechanical Engineering,
IIT Kharagpur.
+91-9775550642


