[Biojava-l] Questions about Summer of Code Project

Singer Ma sma.hmc at gmail.com
Wed Apr 7 07:52:34 UTC 2010


I had previously sent this, but was not part of the mailing list, so I
can only assume it got lost in a spam loop.

I was interested in applying for the All-Java Multiple Sequence
Alignment Google Summer of Code project. I wanted to create a project
plan but had some questions about the package as it stands now.

1. What exactly has changed with the transition to BioJava 3? From
what I've read on the BioJava 3 proposal page, it seems that the
changes are to the organization of the code. Additionally, there are
some new standards to follow. Java 6 usage is desired, but I am unsure
which of the new features could be used in modifying pairwise
sequence alignments.

2. Is the Neighbor Joining Algorithm really the best for this? Are
other multiple alignment implementations desired? I have implemented
the neighbor joining algorithm very inefficiently in Python, and it was
not particularly difficult, so this step seems like it will not take
very long. As for parallelism, I have no experience with it in Java
and will only have some experience with it in C. Will that be an
issue?

3. Is there a specific paper with the exact algorithm that should be
implemented here?

General: Will use cases be provided? Will test data be provided? These
would both be useful in writing the test cases, which it seems are
meant to be written first.

Additionally, I have access to my current Windows machine as well as
a Linux machine for testing, but no Mac. In theory, with Java, if it
works on one platform it works on another, and if it works on Linux
in particular it should be fine on Mac. Should I still be worried
about strange peculiarities?

Thanks,
Singer Ma
Harvey Mudd College 2011


