[BioRuby] [PhyloSoC] Project plan for phyloXML integration with BioRuby

Christian M Zmasek czmasek at burnham.org
Fri Apr 3 03:15:41 UTC 2009

Hi, Diana:

Looks better.

I think you need to point out how evolutionary trees are used outside of
'tree of life' applications (e.g. phylogenomics, phylogeography, gene
function prediction, ...), as those are important applications for which
phyloXML has been designed.

Also, it is not expected that _you_ benchmark various XML parsers. It's 
good enough to rely on published results. The important point is that 
you, together with the BioRuby community, determine which one integrates 
best with BioRuby (i.e. ideally create no additional dependencies) and 
still provides acceptable performance.
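
To make the "no additional dependencies" point concrete, here is a minimal
sketch, assuming REXML (which ships with standard Ruby installations) as the
parser. The phyloXML fragment is hand-written for illustration only, and the
helper clade_names is a hypothetical name, not part of BioRuby. It only
shows that walking nested clade elements needs nothing beyond what Ruby
already provides; whether REXML is fast enough is a separate question.

```ruby
require 'rexml/document'

# Hand-written phyloXML fragment, for illustration only.
PHYLOXML = <<~XML
  <phyloxml xmlns="http://www.phyloxml.org">
    <phylogeny rooted="true">
      <name>example</name>
      <clade>
        <clade branch_length="0.1"><name>A</name></clade>
        <clade branch_length="0.2"><name>B</name></clade>
      </clade>
    </phylogeny>
  </phyloxml>
XML

# Hypothetical helper: walk the element tree depth first and collect the
# text of the <name> child of every <clade> that has one.
def clade_names(element, names = [])
  element.elements.each do |child|
    if child.name == 'clade'
      label = child.elements.find { |c| c.name == 'name' }
      names << label.text if label
    end
    clade_names(child, names)
  end
  names
end

doc = REXML::Document.new(PHYLOXML)
puts clade_names(doc.root).inspect
```

The helper compares local element names directly instead of using XPath, so
the default phyloXML namespace needs no special handling in this sketch.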


Diana Jaunzeikare wrote:
> Hi,
> I posted Abstract and new project plan.
> http://socghop.appspot.com/student_proposal/show/google/gsoc2009/dianaj/t123872262150
> Diana
> On Thu, Apr 2, 2009 at 7:01 PM, Christian M Zmasek 
> <czmasek at burnham.org> wrote:
>     Diana:
>     Thank you for your interest in this project!
>     Indeed, the hour is late, and your proposal still needs
>     significant work in order to be competitive.
>     I think you already got some comments from Hilmar (I am at work
>     and I cannot use IRC).
>     Besides those, I'd like to suggest:
>     1. please make sure that all the studying is done before the
>     coding begins (May 23) (i.e. your "week 1" should be during the
>     "community bonding period").
>     2. You do not need to develop classes for objects already present
>     in BioRuby (such as phylogenetic trees). Actually, the fewer new
>     classes you have to introduce the better -- reuse!
>     3. I am happy to see that you include unit tests early on, this is
>     good! You have to make extensive use of BioRuby's test suite.
>     4. In general, your weekly goals are not described in enough
>     detail. It might be a good idea to discuss goals, deliverables,
>     anticipated problems/difficulties (and possible solutions) for
>     each week.
>     5. Documentation is very important(!)
>     6. Do you plan to maintain the code after the summer?
>     Since the abstract is due April 3rd and cannot be changed after
>     that, it is best to concentrate on the abstract first, though.
>     (The project plan can still be tweaked after April 3rd, I understand.)
>     In the abstract you have to make clear that you understand the
>     _biology_ behind the project. Why does phyloXML have the elements
>     it has? Why is it useful? You might have a look at:
>     http://www.phyloxml.org and
>     http://www.tdwg.org/proceedings/article/view/437.
>     Can you show that you understand what evolutionary trees are?
>     Where and why are they used?
>     Are they only important in 'tree of life' applications? (E.g. see:
>     http://www.liebertonline.com/doi/pdf/10.1089/omi.2006.10.231)
>     What is "phylogenomics", what is "comparative genomics"?
>     You could also go to
>     [http://monochrome-effect.net/publications.html] and have a quick
>     look at some of the papers there, most are related to the issues
>     at hand, and some show real world applications of phylogenetic trees.
>     How might participating in this project help your career? What do
>     you plan to learn? Why are you a good candidate for this?
>     Hope this helps some,
>     CZ
>     Diana Jaunzeikare wrote:
>         Hi everybody,
>         I know this is kinda late and I should have contacted you
>         earlier, but better late than never. I found out about
>         Phyloinformatics Summer of Code just last night when I was
>         doing homework for Bioinformatics lab on Phylogenetic
>         reconstruction and Parsimony. I was reading various bio-tech
>         related blogs in Google Reader and in the Google Top
>         Recommendations bar I saw the blog of The Tree of Life. There I
>         read the blog post about Phyloinformatics Summer of Code. This
>         was very exciting news for me! I almost jumped off the sofa
>         with excitement :) For two years already I have wanted to
>         participate in Google Summer of Code, but it never really fit
>         with what I was doing at the time. When I saw the project
>         about integrating phyloXML with BioRuby I knew it was for me!
>         I am a big fan of Ruby! Last semester for my Computational
>         Biology seminar I wrote a bunch of scripts to deal with the PDB
>         database for my final project. Also it has been a pleasure to
>         develop in Ruby on Rails. What is even more exciting is that
>         my research interests lie in Bioinformatics. In fact, I had
>         thought before about developing for BioRuby, but I didn't have
>         a good enough reason.
>         Here is my project plan for building support for phyloXML for
>         BioRuby. I think the emphasis should be on the ease of use for
>         biologists and a lot of example code.
>         Week 1:
>          * Get familiar with BioRuby, its structure, classes (like
>         Bio::Tree), coding conventions, documentation conventions. See
>         other implementations of XML parsers in BioRuby (like BLAST XML).
>          * Get familiar with phyloXML, its structure, typical uses.
>         Get data set of many different files in phyloXML format for
>         testing.
>          Week 2:
>          * Try to write a program which would use phyloXML data in
>         order to understand what would be the easiest way to use it.
>         (Later will be used for unit testing).
>          * Design the architecture of phyloXML class, parser and
>         writer, interface with other classes (like alignment class).
>          Week 3: Develop the basic, most essential objects of
>         phyloXML (Phylogeny, Clade, Taxonomy).
>          Weeks 4-5: Develop phyloXML parser.
>          Weeks 6-7: Develop phyloXML writer.
>          Week 8: Develop the rest of the objects of phyloXML.
>          Week 9: Update parser.
>         Week 10: Update writer.
>         Week 11: Finish up documentation.
>         Write extensive examples of how to use the code.
>         Week 12: Do the write-up of the project.
>         What do you think about the project plan? Anything missing?
>         Thanks,
>         Diana
>         Diana Jaunzeikare
>         Smith College
>         Computer Science and Math double major '10
>         CS Department Liaison and Master Tutor
>         email: djaunzei at email.smith.edu
>         cell: 413-387-2083
