[BioRuby] GSOC: Bioruby PhyloXML update 12

Christian M Zmasek czmasek at burnham.org
Fri Aug 14 18:41:13 UTC 2009


Very nice!!

I guess this will be placed/copied to the BioRuby tutorial (at 
http://bioruby.open-bio.org/wiki/Tutorial) at one point, correct?



A very tiny, minuscule even, issue I noticed (and maybe even a problem 
of my web browser):

At the very end, the blue box seems to be broken at the "wrong place", so to 
speak, i.e.

"#Once we know whats there, lets output just sequences
phyloxml.other[0].children.each do |node|
   puts node.value
end"

and

"#=>
#
#acgtcgcggcccgtggaagtcctctcct
#aggtcgcggcctgtggaagtcctctcct
#taaatcgc--cccgtgg-agtccc-cct"

should be in the same box, but they appear to be in different ones.

Christian





Diana Jaunzeikare wrote:
> Hi all,
> 
> I added here a HOWTO for BioRuby PhyloXML implementation
> 
> https://www.nescent.org/wg_phyloinformatics/BioRuby_PhyloXML_HowTo_documentation
> 
> Let me know what you think.
> 
> Diana
> 
> On Mon, Aug 10, 2009 at 3:54 PM, Diana Jaunzeikare <rozziite at gmail.com> wrote:
> 
>     Hi all,
> 
>     What was done last week:
> 
>     * Coding. Added changes so that now it is completely compatible with
>     phyloxml schema 1.10
> 
>     * Testing. Added more unit tests (the writer now has 9 tests and 26
>     assertions; the parser has 40 tests and 134 assertions)
> 
>     * Profiling. I discovered that the writer is really slow. The reason
>     is the implementation of the Tree#children method, which runs the
>     bfs_shortest_path algorithm. I had the idea of tracking each node's
>     children inside the node class as an array, but Naohisa Goto pointed
>     out that I would then also have to deal with node and edge addition,
>     removal, etc. So the better solution seems to be to leave it as it is
>     for now and improve the Bio::Tree class first. I am planning to do
>     that after GSOC, since there is only one week left.
> 
>     * Refactored the parser class and got around a 3-fold speed increase.
>     It can now parse the 33 MB Metazoa taxonomy file in ~14 seconds
>     (Ubuntu 9.04, ruby 1.8.7 [i486-linux], Intel Core 2 Duo P8600 @2.4GHz)
> 
>     Next week:
> 
>     * Create howto wiki page with code examples and usage.
>     * Do more testing (Does anybody have more phyloxml XML files for me
>     to test with, other than those on http://phyloxml.org?)
>     * Any other suggestions from you?
> 
>     Questions/issues:
> 
>     * Where should the HOWTO and code example documentation go? Seems
>     reasonable for it to go here
>      http://bioruby.open-bio.org/wiki/HOWTO:Trees and/or
>     http://bioruby.open-bio.org/wiki/Phyloxml_tree_format (which is
>     linked from previous link).
> 
>     * How does integration into the master branch work? Is a pull
>     request on github all I have to do?
> 
>     * I have implemented PhyloXML::Sequence#to_biosequence, however it
>     returns incomplete data, since info for
>     Bio::Sequence#classification, Bio::Sequence#species,
>     Bio::Sequence#division would come from PhyloXML::Taxonomy class, but
>     it is not accessible from Sequence class. Should there be
>     PhyloXML::Node#to_biosequence method which would gather information
>     from both PhyloXML::Sequence and PhyloXML::Taxonomy? or maybe
>     Bio::Sequence should not hold taxonomic information?
> 
>     You are all welcome to test my code. It is available on
>     http://github.com/latvianlinuxgirl/bioruby/tree/dev
> 
>     Thanks,
> 
>     Diana
> 
> 
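
On the Tree#children profiling point quoted above, here is a minimal sketch
of the trade-off, not the actual Bio::Tree code. It assumes a toy undirected
tree kept as an adjacency list; ToyTree and all of its method names are
hypothetical. The slow variant repeats a breadth-first search from the root
on every call, and the cached variant remembers the answer but has to be
invalidated whenever an edge is added, which is the bookkeeping Naohisa Goto
warned about.

```ruby
# Toy stand-in for an undirected tree stored as an adjacency list.
# ToyTree and its method names are hypothetical, not BioRuby's API.
class ToyTree
  attr_reader :searches

  def initialize(root)
    @root = root
    @adjacent = Hash.new { |hash, key| hash[key] = [] }
    @children_cache = {}
    @searches = 0 # counts how many breadth-first searches were run
  end

  def add_edge(a, b)
    @adjacent[a] << b
    @adjacent[b] << a
    @children_cache.clear # any structural change invalidates the cache
  end

  # Slow variant: locate the node's parent with a breadth-first search
  # from the root on every call, then return all other neighbours.
  def children_by_search(node)
    @searches += 1
    parent = {}
    visited = { @root => true }
    queue = [@root]
    until queue.empty?
      current = queue.shift
      break if current == node
      @adjacent[current].each do |neighbour|
        next if visited[neighbour]
        visited[neighbour] = true
        parent[neighbour] = current
        queue << neighbour
      end
    end
    @adjacent[node].reject { |neighbour| neighbour == parent[node] }
  end

  # Cached variant: search once per node, then reuse the stored array.
  def children_cached(node)
    @children_cache[node] ||= children_by_search(node)
  end
end

tree = ToyTree.new(:root)
tree.add_edge(:root, :a)
tree.add_edge(:root, :b)
tree.add_edge(:a, :a1)
tree.add_edge(:a, :a2)

3.times { tree.children_cached(:a) }
puts tree.children_cached(:a).inspect # the two children of :a
puts tree.searches                    # searches actually performed
```

Clearing the whole cache on every change is the bluntest possible
invalidation; a real implementation would also have to cover node and edge
removal, which this sketch leaves out.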



