[Biojava-dev] accepted GSoC projects

Spencer Bliven quantum7 at gmail.com
Wed Apr 28 19:06:40 UTC 2010


Mark-

Welcome to the BioJava community! Adding multiple sequence alignments will
be a nice feature for the library.

One suggestion I have is to make any data structures for multiple alignments
you create as general as possible, and to think about whether the special
cases can still be represented. For instance, can you store an alignment
where some of the sequence is unknown (e.g. {ABCD, ABXD})? Can you store an
alignment where only a subset of the sequences are defined? I recently had
to represent an alignment like this:
ABCD EFGH
EFGH ABCD
This sort of alignment can't be written using just gaps; I had to make a new
structure to store pairs {(A,A), (B,B), ...} and rewrite much of the
existing alignment functionality based on that.
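
A minimal sketch of what such a pair-based structure could look like in
Java follows. It is only an illustration under stated assumptions: the
class PairAlignment and its methods are hypothetical names, not part of
BioJava and not the code described above. It stores aligned positions as
explicit index pairs and checks whether the same alignment could have been
written with gaps alone.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: a pairwise alignment stored as explicit index pairs. */
final class PairAlignment {

    /** One aligned pair: position i in the first sequence, j in the second. */
    static final class Pair {
        final int i;
        final int j;

        Pair(int i, int j) {
            this.i = i;
            this.j = j;
        }
    }

    private final String first;
    private final String second;
    private final List<Pair> pairs = new ArrayList<>();

    PairAlignment(String first, String second) {
        this.first = first;
        this.second = second;
    }

    /** Records that first[i] is aligned with second[j]. */
    void align(int i, int j) {
        if (i < 0 || i >= first.length() || j < 0 || j >= second.length()) {
            throw new IndexOutOfBoundsException("pair (" + i + "," + j + ") is out of range");
        }
        pairs.add(new Pair(i, j));
    }

    /**
     * True when the pairs, ordered by position in the first sequence, also
     * strictly increase in the second; only then can gaps alone express them.
     */
    boolean isSequential() {
        List<Pair> sorted = new ArrayList<>(pairs);
        sorted.sort(Comparator.comparingInt((Pair p) -> p.i));
        for (int k = 1; k < sorted.size(); k++) {
            Pair prev = sorted.get(k - 1);
            Pair cur = sorted.get(k);
            if (cur.i == prev.i || cur.j <= prev.j) {
                return false;
            }
        }
        return true;
    }

    /** Renders the pairs as residues, e.g. "(A,A) (B,B)". */
    String describe() {
        StringBuilder sb = new StringBuilder();
        for (Pair p : pairs) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append('(').append(first.charAt(p.i)).append(',')
              .append(second.charAt(p.j)).append(')');
        }
        return sb.toString();
    }

    /** Builds the ABCD EFGH / EFGH ABCD example from the text. */
    static PairAlignment circularPermutation() {
        PairAlignment aln = new PairAlignment("ABCDEFGH", "EFGHABCD");
        for (int i = 0; i < 8; i++) {
            aln.align(i, (i + 4) % 8); // each residue maps to its shifted copy
        }
        return aln;
    }
}
```

For the example above, PairAlignment.circularPermutation().isSequential()
returns false, which is exactly the case a gap-only representation cannot
express. An ordinary alignment is just the special case where it returns
true.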

Anyway, I don't mean to get bogged down in specific examples or exceptions.
I just wanted to point out that there are a lot of methods which can be used
to define some sort of alignment between a set of sequences, and it would be
nice if the BioJava alignment package was general enough to accommodate such
methods in the future without reinventing the wheel.

Cheers!
Spencer

P.S. I ran into such weird alignments while working on structural
alignments, which are not well behaved like traditional multiple sequence
alignments. Andreas knows all about both types of alignment, and can
probably judge better than I how much generality is worth spending your time
on.



On Tue, Apr 27, 2010 at 9:18 PM, Mark Chapman <chapman at cs.wisc.edu> wrote:

> Hi all,
>
> Thank you to Google, Open Bioinformatics Foundation, BioJava, and my
> mentors for this opportunity.  As a short introduction, I am Mark Chapman, a
> graduate student in Computer Sciences at the University of Wisconsin -
> Madison.  My focus is in artificial intelligence and bioinformatics.  This
> summer, I will add a Multiple Sequence Alignment module to BioJava.
>
> My first task will be to update the alignment module to BioJava3 and to
> design the interface for MSA.  My second goal is to implement a progressive
> MSA styled after ClustalW.  After that, I will add alternative routines for
> each step.
>
> Any ideas for the MSA project as well as more sources of programming wisdom
> are quite welcome.  For example, Andreas suggested a series about Java
> parallelism and lazy execution (
> http://apocalisp.wordpress.com/2008/06/18/parallel-strategies-and-the-callable-monad/).
>  I also noted a useful tip for iterative development (
> http://en.flossmanuals.net/GSoCMentoring/Workflow).
>
> Thanks again,
> Mark
>
>
>
> On 4/27/2010 12:33 AM, Andreas Prlic wrote:
>
>> Dear all,
>>
>> Google has released the results for GSoC: Congratulations to Mark
>> Chapman and Jianjiong Gao for having been accepted to work on the MSA
>> and PTM projects for BioJava! Let's start the "community bonding"
>> process (http://en.flossmanuals.net/GSoCMentoring/MindtheGap), and we
>> are all looking forward to working with you on this during the summer. The
>> mentors and co-mentors will be Peter Rose for the PTM project and Scooter
>> Willis and Kyle Ellrott for the MSA project (and me).
>>
>> I want to thank all of you who submitted proposals or showed interest
>> in other ways for the Google Summer of Code. We hope you are not too
>> disappointed if your application did not get accepted this time. We had
>> a large number (52) of applications and the overall quality of the
>> submissions was very high. We would like to stay in touch with you and
>> we hope that you are interested in BioJava also beyond the scope of
>> GSoC. There are a number of different ways to contribute: we are
>> always looking for people who provide code and patches to further
>> improve our library, help out with the documentation on the Wiki page,
>> or answer questions on the mailing lists.
>>
>> Let's all give Mark and Jianjiong a warm welcome to the BioJava
>> community.  For those of you who are interested in following the
>> progress of the projects, as usual, the development-related
>> discussions are going to be on the biojava-dev list.
>>
>> Happy coding!
>>
>> Andreas
>>
>>


