[BioRuby] GSOC: phyloXML for BioRuby: Code Review

Christian M Zmasek czmasek at burnham.org
Tue Jun 23 19:29:42 UTC 2009


Hi, Diana:

Diana Jaunzeikare wrote:
> Hi all,
>
> In the Google Summer of Code project I have reached a stage where most 
> of the code has been written for PhyloXML parser and I would like to 
> ask for code review.
>
> I would like to know answers to these questions:
>
> * What parts should have more documentation?
Node might benefit from a more detailed description of all its (sub-) 
elements.
Also, you don't always use the "rdoc" format.
Some documentation sare hard to read (such as the one for Sequence), 
simply because of the way the text is formatted.
In general, I would point out the dependency on libxml2 more prominently.


>
> * Are there any places where code could be made more rubyish?
Maybe core-BioRuby developers can give an answer for this one. Looks 
like Ruby to me, but I started off programing with C++ and Java -- so, I 
might be biased  ;)



>
> * Are the structure of unit tests fine, or there are some conventions 
> which my code doesn't follow?
I think it would be best to add more tests for "marginal"/error cases 
(for parsing).
Listed in increasing severity:
Are empty elements handled properly (e.g. <taxonomy></taxonomy>)?
What about new-lines, tabs, non-printable ascii characters in place 
where text is expected? Trailing and leading whitespaces? Does this get 
trimmed of?
Valid XML documents violating phyloXML specs?
Invalid XML?
All these should be handled gracefully.


>
> * Is code readable?

Yes.


> * Are there any conventions that I don't follow? (like lines should 
> strictly fit into 80 columns)?
>
> Any comments would be appreciated.
>
> Code is available on github 
> http://github.com/latvianlinuxgirl/bioruby/tree/dev  in 
> *lib/bio/db/phyloxml.rb* and *test/unit/bio/db/test_phyloxml.rb* files.
>
>
> Diana
>

Furthermore, it might be a good time to start testing your 
parser/objects against really large files.
This might help to uncover potential hidden problems.
Obviously, you could not add such large files to BioRuby's test files. 
But it would still be nice to know how your parser and objects scale....

Also, I am not sure if it's such a great idea to have all your classes 
in the same file/directory (i.e. both parser _and_ data objects).

Right now, if the libxml2 gem is not install the test for the whole of 
bioruby exits.

Christian




More information about the BioRuby mailing list