[GSoC] GSoC 2013 is ON

Tue Apr 2 08:36:52 UTC 2013

Great idea. Can you format it as a project idea on your Wiki - take
the other ideas as an example. You can leave the two options open and
see which one students react to.

BTW, we only have the 'right' students :). The selection process is
pretty strong.

There is some ongoing discussion about core and non-core OBF projects,
but I think since BioHaskell is going strong there will be little
objection to adding one project idea. If it brings in a really competent
student we all gain.

There is a meeting between GSoC and the OBF board on the 9th, right
after we get or do not get accepted as a mentoring organisation. I'll
make a case for inclusion of your project idea into the program.

Adding it to the wiki will help.

Pj.

On Tue, Apr 02, 2013 at 10:03:58AM +0200, Ketil Malde wrote:
> 
> [CC everybody including the biohaskell list. Let me know if any of you
> want off. :-) ]
> 
> Pjotr Prins <pjotr2010 at thebird.nl> writes:
> 
> >   http://www.open-bio.org/wiki/Google_Summer_of_Code
> 
> > For Biopython (3x), BioRuby (5x) and BioJava (4x) I found project ideas.
> 
> > The others are missing.
> 
> > There is still a (rather small) window of opportunity for adding
> > ideas.
> 
> I have one thing that might work well as a SOC project, if the right
> student could be found.
> 
> Basically, I and a colleague recently developed and published a method
> and implementation for more sensitive pairwise alignments.  The paper is
> here, I think (PLoS ONE seems to be down atm):
>   http://dx.plos.org/10.1371/journal.pone.0054422
> 
> I'm really happy about the results, if nothing else, check the SCOP
> benchmark.  Although it's difficult to construct a good test case using
> more complex methods (training sets for HMMs and whatnot) I don't know
> anything that is as good as this.  We're using it for annotation of
> genes.
> 
> The current implementation is in Haskell, and although it works
> correctly, it is a bit slow, and more problematic, it consumes too much
> memory (so going multi-threaded, although pretty easy, won't be of any
> help).
> 
> I would like to make this into a less resource intensive (and thus more
> practical) tool, and there are two ways I can think of to go about this:
> 
> 1) Optimize the Haskell program
> 
> 2) Reimplement the algorithm (or parts of it) in a different language
> 
> Advantages of 1:
> 
> * Already have a working program, and the type system makes it easy to
>   refactor without introducing errors.
> * Haskell supports lots of good multi-threading programming models (like
>   STM)
> * I know Haskell pretty well, and will hopefully be able to mentor.
> 
> Disadvantages:
> 
> * Haskell has some good debugging tools, but they tend to work really
>   poorly for large memory (i.e. it takes a long time to generate
>   profiles)
> * Needs somebody with a bit (or a lot) of experience optimizing Haskell,
>   and good knowledge of high-perf libraries (like vector)
> 
> Advantages of 2:
> 
> * Easier to get a student with adequate skills.
> * More predictable performance models in other languages.
> * Easier to compile and install for many users.
> 
> Disadvantages:
> 
> * Ideally, should know enough Haskell to read and understand the code.
> * Likely needs a co-mentor with knowledge of the language in question.
> 
> Is this something I could or should submit as a task?
> 
> -k
> -- 
> If I haven't seen further, it is by standing in the footprints of giants