[Biojava-dev] [GSoC] Project Proposal

Nirmal Fernando nirmal070125 at gmail.com
Tue Apr 5 13:40:07 UTC 2011


Hi All,

I have prepared my GSoC proposal for BioJava [1] and would greatly
appreciate your feedback.

Thanks.

[1]
Google Summer of Code 2011 - Project Proposal



Organization

*Open Bioinformatics Foundation - BioJava*

Project

*Calculation of Physicochemical Properties of Amino Acids*

Student Name

C. S. Nirmal J. Fernando.

E-mail

nirmal070125 at gmail.com

IM

nirmal070125 (Google Talk)

nirmalfdo (IRC – freenode.net)

Address

47, Keels Housing Scheme, Pinwatte, Panadura, Sri Lanka.

Mobile No.

+94715779733



*Why am I interested?*

I recently completed a course module on bioinformatics and have a basic
understanding of bioinformatics algorithms, which sparked my interest in
this area of computer science.

*Why am I well-suited?*


I participated in GSoC 2010 with the Apache Derby project (an RDBMS written
in Java) and completed my project successfully. My sound knowledge of Java,
my grounding in bioinformatics algorithms and my experience with concurrent
programming make me a good match for this project.


“Nirmal joined the Apache Derby community as a *Google Summer of Code* student
for the summer of 2010. In this role, Nirmal wrote a very useful tool called
PlanExporter. This tool will help users of the *Apache Derby* database
understand and fix performance issues in their data-rich applications.
Nirmal fit well into our open-source community, collaborating with other
engineers, proceeding incrementally, and seeking and taking advice
cheerfully. Nirmal's contributions to Apache Derby are highly respected.” -
*Richard Hillegas* <http://www.linkedin.com/profile/view?id=31609548&noCreateProposal=true&goback=%2Enpe_*1_*1_*1_*1>,
*Senior Software Engineer, Sun Microsystems.*


“Nirmal's work on the Derby PlanExporter tool as part of the Google 2010
Summer of Code was clear, well-executed and successful. Furthermore, every
member of the Derby team that I've spoken to has been pleased with Nirmal's
contributions to the community and we look forward to having Nirmal
continuing to work with Derby in the future.” - *Bryan Pendleton*
<http://www.linkedin.com/profile/view?id=492758&noCreateProposal=true&goback=%2Enpe_*1_*1_*1_*1>,
*Committer, Apache Derby.*

*Programming Experience and Skills*

·         Completed the “short coding exercise” (all three goals) given by
the mentor.

·         Final Year Project: SeMap is a four-member final year project that
I lead. Its objective is to develop a framework for mapping English-language
semantic dependency relationships to sets of semantic frames, with
reasonable accuracy for complex sentences, and with an integrated
statistical-linguistics-based artificial intelligence component that allows
automatic extensibility. We are working under OpenCog.org, a FOSS
foundation, under the supervision of Dr. Ben Goertzel. Technologies: [Java,
Drools]



*Contributions to the Open Source World*

·         Implemented the PlanExporter tool, which allows Apache Derby users
to view and understand the query plan chosen by the optimizer. Technologies:
[Java, XML, XSLT, HTML and CSS] (Google Summer of Code – 2010 project)

·         Solved many issues in Apache Derby
https://issues.apache.org/jira/secure/IssueNavigator.jspa?

·         Continuing to contribute to Apache Derby after the Summer of Code.

*Project Rationale*

The calculation of simple physicochemical properties for biopolymers is an
important tool in the arsenal of the molecular biologist. Theoretically
calculated quantities like extinction coefficients, isoelectric points,
hydrophobicities and instability indices are useful guides as to how a
molecule behaves in an experiment. Many tools for calculating these
properties exist, including widely used open-source implementations in
EMBOSS and BioPerl, but only some are currently available in BioJava3. The
aim of this project is to port or produce new implementations of standard
algorithms for a range of calculations within BioJava3.



*Project Scope*

The project will primarily focus on developing the following functionality:

   1. Finding the molecular weight of a sequence
   2. Finding the extinction coefficient of a protein
   3. Finding the instability index of a protein
   4. Finding the aliphatic index of a protein
   5. Finding the GRAVY (Grand Average of Hydropathy) value of a peptide or a
   protein
   6. Finding the isoelectric point of a sequence
   7. Finding the number of specific amino acids in a protein (His, Met, Cys)
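
As a rough illustration of items 5 and 7, the sketch below counts a given
residue and computes GRAVY as the sum of Kyte-Doolittle hydropathy values
divided by the sequence length. The class and method names (PeptideSketch,
count, gravy) are hypothetical and are not part of BioJava; sequences are
assumed to be plain strings of one-letter codes for the 20 standard amino
acids.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not a BioJava class. */
final class PeptideSketch {

    // Kyte-Doolittle hydropathy values, keyed by one-letter amino acid code.
    private static final Map<Character, Double> HYDROPATHY = new HashMap<>();
    static {
        String letters = "ARNDCQEGHILKMFPSTWYV";
        double[] values = {1.8, -4.5, -3.5, -3.5, 2.5, -3.5, -3.5, -0.4, -3.2, 4.5,
                           3.8, -3.9, 1.9, 2.8, -1.6, -0.8, -0.7, -0.9, -1.3, 4.2};
        for (int i = 0; i < letters.length(); i++) {
            HYDROPATHY.put(letters.charAt(i), values[i]);
        }
    }

    /** Counts occurrences of one residue (one-letter code, case-insensitive). */
    static int count(String sequence, char residue) {
        char target = Character.toUpperCase(residue);
        int n = 0;
        for (char c : sequence.toCharArray()) {
            if (Character.toUpperCase(c) == target) {
                n++;
            }
        }
        return n;
    }

    /** GRAVY: sum of the hydropathy values divided by the sequence length. */
    static double gravy(String sequence) {
        if (sequence.isEmpty()) {
            throw new IllegalArgumentException("empty sequence");
        }
        double sum = 0.0;
        for (char c : sequence.toCharArray()) {
            Double h = HYDROPATHY.get(Character.toUpperCase(c));
            if (h == null) {
                throw new IllegalArgumentException("unknown residue: " + c);
            }
            sum += h;
        }
        return sum / sequence.length();
    }

    public static void main(String[] args) {
        String seq = "MHCAIL"; // arbitrary example sequence
        System.out.println("His: " + count(seq, 'H')
                + ", Met: " + count(seq, 'M')
                + ", Cys: " + count(seq, 'C'));
        System.out.println("GRAVY: " + gravy(seq));
    }
}
```

A real implementation would presumably work on BioJava3 sequence objects
rather than raw strings; the string-based form is used here only to keep the
example self-contained.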

*Project Plan*

*April 20 - May 10*

   - Read up on the BioJava3 design: http://biojava.org/wiki/BioJava3_Design
   - Read up on the BioJava3 data model:
   http://www.biojava.org/wiki/BioJava3_Proposal
   - Understand how each BioJava3 module works and what functionality it
   provides.
   - Find and study algorithms for the functionality listed above.
   - Identify existing BioJava3 methods and tools that can be reused.



*May 11 - May 24*

   - Implement functions to calculate the molecular weight of a sequence and
   the extinction coefficient of a protein, using multiple threads where
   possible.
   - Implement functional test cases using JUnit.
   - Write high-level documentation for end users.
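
A minimal sketch of the two calculations planned for this phase follows,
under stated assumptions. Molecular weight is taken as the sum of average
residue masses plus one water molecule, and the extinction coefficient at
280 nm uses the commonly cited coefficients of 5500 (Trp), 1490 (Tyr) and
125 (cystine) M^-1 cm^-1. The residue masses are approximate values that
should be checked against a reference table, and the names MassSketch,
molecularWeight and extinctionCoefficient are hypothetical, not BioJava API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not a BioJava class. */
final class MassSketch {

    private static final double WATER = 18.01528; // average mass of H2O (Da)

    // Approximate average residue masses (Da), keyed by one-letter code.
    private static final Map<Character, Double> RESIDUE_MASS = new HashMap<>();
    static {
        String letters = "GASPVTCLINDQKEMHFRYW";
        double[] masses = {57.0519, 71.0788, 87.0782, 97.1167, 99.1326,
                           101.1051, 103.1388, 113.1594, 113.1594, 114.1038,
                           115.0886, 128.1307, 128.1741, 129.1155, 131.1926,
                           137.1411, 147.1766, 156.1875, 163.1760, 186.2132};
        for (int i = 0; i < letters.length(); i++) {
            RESIDUE_MASS.put(letters.charAt(i), masses[i]);
        }
    }

    /** Sum of residue masses plus one water for the free termini. */
    static double molecularWeight(String sequence) {
        if (sequence.isEmpty()) {
            throw new IllegalArgumentException("empty sequence");
        }
        double total = WATER;
        for (char c : sequence.toCharArray()) {
            Double m = RESIDUE_MASS.get(Character.toUpperCase(c));
            if (m == null) {
                throw new IllegalArgumentException("unknown residue: " + c);
            }
            total += m;
        }
        return total;
    }

    /**
     * Extinction coefficient at 280 nm. If cysteinesPaired is true, every
     * two Cys residues are assumed to form one cystine; otherwise all Cys
     * are assumed reduced and contribute nothing.
     */
    static int extinctionCoefficient(String sequence, boolean cysteinesPaired) {
        int trp = 0, tyr = 0, cys = 0;
        for (char c : sequence.toCharArray()) {
            switch (Character.toUpperCase(c)) {
                case 'W': trp++; break;
                case 'Y': tyr++; break;
                case 'C': cys++; break;
                default: break;
            }
        }
        int cystines = cysteinesPaired ? cys / 2 : 0;
        return 5500 * trp + 1490 * tyr + 125 * cystines;
    }

    public static void main(String[] args) {
        String seq = "MWYCCK"; // arbitrary example sequence
        System.out.println("MW: " + molecularWeight(seq));
        System.out.println("E280 (paired Cys): " + extinctionCoefficient(seq, true));
        System.out.println("E280 (reduced Cys): " + extinctionCoefficient(seq, false));
    }
}
```

Both methods are pure functions of the input sequence, so running them over
many sequences in parallel would need no shared mutable state.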



*May 24 - July 10*

   - Prepare for the mid-term evaluation of the project.

*July 12 - August 15*

   - Implement functions to calculate the following, using multiple threads
   where possible:
      - Instability index of a protein
      - Aliphatic index of a protein
      - GRAVY (Grand Average of Hydropathy) value for a peptide or a protein
      - Isoelectric point of a sequence
      - Number of amino acids in a protein (His, Met, Cys)

   - Implement functional test cases using JUnit.
   - Update the high-level documentation for end users.
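
Two of the calculations in this phase can be sketched as follows, again
under stated assumptions. The aliphatic index uses the usual formula
100 * (nAla + 2.9 * nVal + 3.9 * (nIle + nLeu)) / length. The isoelectric
point is found by bisection on the net charge, which decreases monotonically
with pH. The pKa values are illustrative assumptions only (different tools
use different sets), and the names IndexSketch, aliphaticIndex, netCharge and
isoelectricPoint are hypothetical, not BioJava API. The instability index is
left out because it needs a full dipeptide weight table.

```java
/** Hypothetical helper for illustration only; not a BioJava class. */
final class IndexSketch {

    // Illustrative pKa values (assumed; published sets differ between tools).
    private static final double PK_NTERM = 8.6, PK_CTERM = 3.6;
    private static final double PK_K = 10.8, PK_R = 12.5, PK_H = 6.5;
    private static final double PK_D = 3.9, PK_E = 4.1, PK_C = 8.5, PK_Y = 10.1;

    /** Aliphatic index: 100 * (nAla + 2.9*nVal + 3.9*(nIle + nLeu)) / length. */
    static double aliphaticIndex(String seq) {
        if (seq.isEmpty()) {
            throw new IllegalArgumentException("empty sequence");
        }
        int ala = 0, val = 0, ileLeu = 0;
        for (char c : seq.toCharArray()) {
            char u = Character.toUpperCase(c);
            if (u == 'A') ala++;
            else if (u == 'V') val++;
            else if (u == 'I' || u == 'L') ileLeu++;
        }
        return 100.0 * (ala + 2.9 * val + 3.9 * ileLeu) / seq.length();
    }

    // Fraction of a basic group that is protonated (charge +1) at this pH.
    private static double basic(double pKa, double pH) {
        return 1.0 / (1.0 + Math.pow(10.0, pH - pKa));
    }

    // Fraction of an acidic group that is deprotonated (charge -1) at this pH.
    private static double acidic(double pKa, double pH) {
        return 1.0 / (1.0 + Math.pow(10.0, pKa - pH));
    }

    /** Net charge of the sequence (with free termini) at the given pH. */
    static double netCharge(String seq, double pH) {
        double charge = basic(PK_NTERM, pH) - acidic(PK_CTERM, pH);
        for (char c : seq.toCharArray()) {
            switch (Character.toUpperCase(c)) {
                case 'K': charge += basic(PK_K, pH); break;
                case 'R': charge += basic(PK_R, pH); break;
                case 'H': charge += basic(PK_H, pH); break;
                case 'D': charge -= acidic(PK_D, pH); break;
                case 'E': charge -= acidic(PK_E, pH); break;
                case 'C': charge -= acidic(PK_C, pH); break;
                case 'Y': charge -= acidic(PK_Y, pH); break;
                default: break; // other residues carry no side-chain charge
            }
        }
        return charge;
    }

    /** Bisection on pH in [0, 14] for the point where the net charge is zero. */
    static double isoelectricPoint(String seq) {
        double lo = 0.0, hi = 14.0;
        while (hi - lo > 1e-4) {
            double mid = (lo + hi) / 2.0;
            if (netCharge(seq, mid) > 0) {
                lo = mid; // still positively charged: pI lies at higher pH
            } else {
                hi = mid;
            }
        }
        return (lo + hi) / 2.0;
    }

    public static void main(String[] args) {
        String seq = "ACDKHVIL"; // arbitrary example sequence
        System.out.println("Aliphatic index: " + aliphaticIndex(seq));
        System.out.println("Estimated pI: " + isoelectricPoint(seq));
    }
}
```

Bisection is chosen here only because it is simple and robust for a
monotonic function; the result depends directly on the pKa set chosen.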



*August 16 - August 22*

   - Wrap up the work and polish the code.
   - Generate the Javadoc API documentation.
   - Prepare for the final evaluation.



*August 26*

   - Final evaluation deadline.



*Project Deliverables*

·         A Java library providing the functionality listed above.

·         Command-line executables.

·         Javadoc API documentation for the library.

·         Functional test cases.

·         High-level end-user documentation.

-- 
Best Regards,
Nirmal

C.S.Nirmal J. Fernando
Department of Computer Science & Engineering,
Faculty of Engineering,
University of Moratuwa,
Sri Lanka.

Blog: http://nirmalfdo.blogspot.com/
