[BioRuby] GSoC question (regarding SDI algorithm)

Thu Apr 8 21:39:44 UTC 2010

Hi, Monika:

Remember, the deadline is tomorrow!

> Hello
> 
> I think I should first explicitly ask this question: I am not 
> experienced in Ruby, but I believe I can improve myself enough during 
> the Community Bonding period. Does it make me ineligible to apply?

No. One of the goals of GSoC is for students to "learn new things".

> If not, please continue with reading: ;)
>  
> 
>     Basically you also need to write a short CV, similar to a job
>     application.
> 
> Where should I later submit this CV? Should it include only education / 
> work experience, or also something like a cover letter?

It will all be part of your application (i.e. one document).
You need to write an "abstract", which can be considered a cover letter.

> 
> And a question about the algorithm itself - would it need to be 
> accommodated in some application, or just be a separate BioRuby library?

It would be part of BioRuby.

> 
>     Develop unit tests
> 
> I hope this is not a stupid question, but where is it possible to get the 
> following information about a gene tree: which nodes should receive 
> which annotations about 'duplication' and 'speciation', for the 
> duplication inference to be considered correct? I mean, as a 
> non-biologist I do not know what should be the correct output of the 
> algorithm...

You should read (or at least have a look at) some of the references 
listed here:
http://evogsoc2010.wordpress.com/2010/03/25/references-for-gene-duplications-proposal/

You can also have a look at my PhD thesis, which explains some of the 
background, especially chapter 1.3.2.1.
See: 
ftp://selab.janelia.org/pub/publications/Zmasek02/Zmasek02-phdthesis.pdf

Furthermore, I can easily provide you with test gene trees which have 
duplications assigned. This is not a big issue.
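
To give a rough idea of what "correct output" means for such a test tree,
here is a minimal sketch in plain Ruby. It assumes the usual LCA mapping:
each gene tree leaf maps to its species, each internal gene tree node maps
to the last common ancestor of its children's species tree nodes, and a node
is a duplication if it maps to the same species tree node as one of its
children (otherwise it is a speciation). The Node struct and the helper
names (node, lca, infer_events) are made up for this illustration; this is
not BioRuby's tree API and not the optimized SDI traversal itself.

```ruby
# Hypothetical minimal tree node, for illustration only (not a BioRuby class).
Node = Struct.new(:name, :children, :parent, :species, :event) do
  def leaf?
    children.empty?
  end
end

# Build a node and wire up the parent links of its children.
def node(name, *children, species: nil)
  n = Node.new(name, children, nil, species, nil)
  children.each { |c| c.parent = n }
  n
end

# Path from a node up to the root, starting with the node itself.
def ancestors(n)
  path = []
  while n
    path << n
    n = n.parent
  end
  path
end

# Naive last common ancestor of two species tree nodes (identity comparison).
def lca(a, b)
  seen = ancestors(a)
  ancestors(b).find { |x| seen.any? { |y| y.equal?(x) } }
end

# Post-order walk of the gene tree: map every node into the species tree
# and label each internal node :duplication or :speciation.
def infer_events(g, species_leaves, mapping = {}.compare_by_identity)
  if g.leaf?
    mapping[g] = species_leaves.fetch(g.species)
  else
    g.children.each { |c| infer_events(c, species_leaves, mapping) }
    mapped = g.children.map { |c| mapping[c] }
    mapping[g] = mapped.reduce { |a, b| lca(a, b) }
    duplicated = mapped.any? { |m| m.equal?(mapping[g]) }
    g.event = duplicated ? :duplication : :speciation
  end
  mapping
end

# Species tree ((A,B),C)
sp_a = node("A")
sp_b = node("B")
sp_c = node("C")
node("root", node("AB", sp_a, sp_b), sp_c)
species_leaves = { "A" => sp_a, "B" => sp_b, "C" => sp_c }

# Gene tree ((a1,b1),(a2,c1)); each leaf names the species it comes from.
x = node("x", node("a1", species: "A"), node("b1", species: "B"))
y = node("y", node("a2", species: "A"), node("c1", species: "C"))
r = node("r", x, y)

infer_events(r, species_leaves)
[x, y, r].each { |n| puts "#{n.name}: #{n.event}" }
```

In this toy example the gene tree root and its child y both map to the
species tree root, so the root is labeled a duplication, while x and y are
speciations. A unit test would compare labels like these against the
annotations in a reference tree.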

> 
> Regards
> Monika Machunik
> 
> 
> 2010/4/6 Christian M Zmasek <czmasek at burnham.org>
> 
>     Hi, Monika:
> 
>     Thank you for your interest in this proposal.
>     Please remember that student applications are due by April 9, 19:00
>     UTC -- so you do not have much time left.
> 
>     I think your lack of experience in Biology is not a problem.
> 
>     The idea is to implement the algorithm with the BioRuby toolkit
>     (http://www.bioruby.org/).
> 
>     Some more advice:
> 
>     If you plan to apply, you need to write a very detailed plan on how
>     you intend to accomplish this project.
> 
>     For each step you should list:
>     1. Goal/deliverable
>     2. Approach
>     3. Time estimation
>     4. Anticipated problems & possible alternative approaches
> 
>     Like so:
> 
>     A. Prior to coding (from ... to .... )
>        1. Familiarize myself with BioRuby, set up GitHub repository
>        2. ...
>        3. 1 week
>        4. Not familiar with git, might need to...
> 
>     B. Week 1 (from ... to .... )
>        1. Develop unit tests
>        2. Using manually created gene and species trees, I plan to...
>        3. 1 week
>        4. No problem anticipated
> 
> 
>     Basically you also need to write a short CV, similar to a job
>     application.
> 
>     Hope this helps,
> 
>     Christian
> 
> 
>     Monika Machunik wrote:
> 
>         Hello
> 
>         My name is Monika Machunik and I am planning to apply to this
>         year's Summer of Code. I have read your idea description,
>         "Implementation of algorithm to infer gene duplications in
>         BioRuby", and, although my background does not include any
>         biology, I got quite interested in this project (I could not
>         find the mentors' email addresses, so I'm posting it here).
> 
>         I would like to briefly introduce myself to get your opinion on
>         whether I would be suitable for this project.
> 
>         I have about a year of work experience in Java programming,
>         including some internships and last year's GSoC. Besides Java I
>         know C++, some C, PHP, HTML, etc. I am not experienced in Ruby
>         programming (though I have at least seen Ruby code ;)), but I
>         learn fast. Currently I am doing my Master's degree in Computer
>         Science, so I have some knowledge of algorithms and data
>         structures. I have never worked at the intersection of biology
>         and CS, but this combination has always intrigued me.
> 
>         And now my thoughts about possible content of the workload.
> 
>         I have read the abstract of the article and, despite my lack of
>         biological knowledge, I managed to comprehend it ;). I think I
>         should also have no problem understanding the algorithm itself.
>         Apart from implementing the algorithm, the project would involve
>         getting familiar with BioRuby and understanding phyloXML well
>         enough to write an algorithm operating on its existing
>         structures.
> 
>         I am not sure whether the algorithm should be implemented inside
>         some existing software, or whether it will be a kind of
>         standalone algorithm. If it should be accommodated inside some
>         application, the project would probably involve doing that
>         too...
> 
>         That is all for now. Let me know if I have any chance with this
>         project :)
> 
>         Best regards
>         Monika Machunik