[GSoC] Interested in working on SearchIO

Peter Cock p.j.a.cock at googlemail.com
Fri Mar 23 09:30:10 UTC 2012


On Thu, Mar 22, 2012 at 6:03 PM, Ayush Goel <ayushgoel111 at gmail.com> wrote:
> Hello,
>
>  I am a student at Delhi College of Engineering. I have prior
> experience with Python from two previous internships. I was hoping to
> find myself a more challenging project this time with Python as the
> default language. The description of the SearchIO project seems to be
> a very good one.
>
>  Still, I am pretty new to Biopython's code. If possible, I would
> like to have some more information about what is expected from the
> deliverables. Also, if some reference material on the background of
> the data formats required (BLAST etc.) could be provided, that would
> be very helpful.

Hello Ayush,

Are you doing any biology or bioinformatics courses? That would
help with background knowledge.

The SearchIO project does require a reasonably broad knowledge
of important tools and concepts in pairwise sequence alignment -
if you are not familiar with BLAST etc. that will be a big handicap. You
don't need to know the algorithm details - just the overall idea,
how to run the tools, and what kind of analysis people might
want to do with them. Some possible background reading (an
introductory Bioinformatics course or book might be good too):

http://www.ncbi.nlm.nih.gov/BLAST/
http://en.wikipedia.org/wiki/BLAST

http://emboss.open-bio.org/wiki/Appdoc:Needle
http://en.wikipedia.org/wiki/Needleman-Wunsch_algorithm

http://emboss.open-bio.org/wiki/Appdoc:Water
http://en.wikipedia.org/wiki/Smith-Waterman_algorithm

In terms of possible deliverables, I went into more detail here:
http://lists.open-bio.org/pipermail/biopython-dev/2012-March/009468.html

However, if you have a lot of experience with Python and parsing
text and XML files, that would be a big plus. Perhaps there is
another topic that might suit you better. Is there a particular
reason why you are interested in Biopython?

Regards,

Peter



