[Biojava-dev] [GSoC] Project Proposal

Nirmal Fernando nirmal070125 at gmail.com
Wed Apr 6 16:58:10 UTC 2011


Hi Peter,

On Wed, Apr 6, 2011 at 9:28 PM, Peter Troshin <p.v.troshin at dundee.ac.uk> wrote:

> Hi Nirmal,
>
> Thanks for improving your proposal.
> Yes, this seems useful although it may be a little out of scope for this
> project. I think that calculating some other useful property of the
> peptide/protein or nucleic acid would have been a better fit.
>
>
I see! What do you think about the following calculations:


   - Calculate volume of an amino acid sequence
   - Calculate amino acid composition: e.g. ACSGGS
      - Alanine A 16.67%
      - Cysteine C 16.67%
      - Glycine G 33.33%
      - Serine S 33.33%
   - Calculate atomic composition of a protein
   - Sequence word count: count the number of occurrences of a word in a
     sequence, e.g.
      Sequence word count('GCTATAACGTATATATAT','TATA') = 3
   - Count n-mers in a nucleotide sequence: e.g. AAGT
      - dimer counts: AA - 1, AG - 1, GT - 1 & all others 0
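
To make the composition, word count and n-mer items above concrete, here is a
rough sketch in Java. It assumes sequences are plain Strings rather than
BioJava3 sequence objects, and the class name SequenceStats and its method
names are hypothetical, chosen only for illustration; nothing here is an
existing BioJava API.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration only; not part of BioJava. */
final class SequenceStats {

    /** Percentage of each residue letter in the sequence, keyed alphabetically. */
    static Map<Character, Double> composition(String seq) {
        Map<Character, Double> result = new TreeMap<>();
        if (seq.isEmpty()) {
            return result;
        }
        // First pass: raw counts per letter.
        for (char c : seq.toCharArray()) {
            result.merge(c, 1.0, Double::sum);
        }
        // Second pass: turn counts into percentages of the sequence length.
        result.replaceAll((residue, count) -> 100.0 * count / seq.length());
        return result;
    }

    /** Number of non-overlapping occurrences of word, scanning left to right. */
    static int countWord(String seq, String word) {
        if (word.isEmpty()) {
            throw new IllegalArgumentException("word must not be empty");
        }
        int count = 0;
        int from = seq.indexOf(word);
        while (from >= 0) {
            count++;
            // Skip past the whole match so that matches cannot overlap.
            from = seq.indexOf(word, from + word.length());
        }
        return count;
    }

    /** Counts of every n-mer that occurs (overlapping windows); absent n-mers are implicitly 0. */
    static Map<String, Integer> countNmers(String seq, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i + n <= seq.length(); i++) {
            counts.merge(seq.substring(i, i + n), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(composition("ACSGGS"));
        System.out.println(countWord("GCTATAACGTATATATAT", "TATA"));
        System.out.println(countNmers("AAGT", 2));
    }
}
```

Note that the word count in this sketch is non-overlapping, which is what
gives 3 for the TATA example; counting overlapping matches would give 4
instead, so the intended behaviour would need to be documented.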

Would these be too simple?

Thanks.

> Regards,
> Peter
>
>
>
>
> On 06/04/2011 15:57, Nirmal Fernando wrote:
>
>> Hi,
>>
>> In addition to the functionalities provided in my proposal, I would like
>> to build a tool like http://gcua.schoedl.de/ which will be used to
>> display the codon quality in codon usage frequency values.
>>
>> It would be nice to get the feedback of the community regarding the
>> importance of a tool like this to BioJava3.
>>
>> Thanks.
>>
>>
>> On Tue, Apr 5, 2011 at 9:34 PM, Nirmal Fernando <nirmal070125 at gmail.com> wrote:
>>
>>    Hi Peter,
>>
>>    On Tue, Apr 5, 2011 at 9:18 PM, Peter Troshin
>>    <p.v.troshin at dundee.ac.uk> wrote:
>>    >
>>    > Hi Nirmal,
>>    >
>>    > First of all, thanks for the proposal; it looks good.
>>    > However, I think that one of the benefits of my project idea is
>>    > that it lets you implement a few other methods that are of
>>    > interest to you. It is a pity that you did not use this
>>    > opportunity. I strongly encourage you to use your knowledge and to
>>    > look at the other properties that you can implement for the
>>    > benefit of the community. Otherwise it looks like you are not
>>    > terribly interested in Bioinformatics!
>>
>>    Sorry for the disappointment! This week is a bit busy for me, with
>>    a few events at the University, which is why I didn't get much time
>>    to look for other methods. But I'll try my best to research and
>>    propose some other methods which will benefit the community.
>>
>>    >
>>    > Also, I think that the best method of learning BioJava is trying
>>    > it. So I'd put in the project plan that you will write test cases
>>    > to check out the parts of BioJava that you will be using. Apart
>>    > from helping you learn it in depth, it will also help to ensure
>>    > that the BioJava code behaves.
>>
>>    Thanks for the tip! :)
>>
>>    >
>>    > Regards,
>>    > Peter
>>    >
>>    > On 05/04/2011 14:40, Nirmal Fernando wrote:
>>    >>
>>    >> Hi All,
>>    >>
>>    >> I have prepared my GSoC proposal for BioJava [1]. I would highly
>>    >> appreciate your valuable feedback.
>>    >>
>>    >> Thanks.
>>    >>
>>    >> [1]
>>    >>
>>    >>
>>    >>  Google Summer of Code 2011 - Project Proposal
>>    >>
>>    >> Organization
>>    >>
>>    >>
>>    >>
>>    >> *Open Bioinformatics Foundation- BioJava*
>>    >>
>>    >> Project
>>    >>
>>    >>
>>    >>
>>    >> *Calculation of Physicochemical Properties of Amino Acids*
>>    >>
>>    >> Student Name
>>    >>
>>    >>
>>    >>
>>    >> C. S. Nirmal J. Fernando.
>>    >>
>>    >> E-mail
>>    >>
>>    >>
>>    >>
>>    >> nirmal070125 at gmail.com
>>
>>    >>
>>    >> IM
>>    >>
>>    >>
>>    >>
>>    >> nirmal070125 (Google Talk)
>>    >>
>>    >> nirmalfdo (IRC – freenode.net <http://freenode.net>)
>>    >>
>>    >> Address
>>    >>
>>    >>
>>    >>
>>    >> 47, Keels Housing Scheme, Pinwatte, Panadura, Sri Lanka.
>>    >>
>>    >> Mobile No.
>>    >>
>>    >>
>>    >>
>>    >> +94715779733
>>    >>
>>    >> *Why am I interested?*
>>    >>
>>    >> I have recently finished a course module on bioinformatics and
>>    >> have a basic understanding of bioinformatics algorithms, which
>>    >> made me interested in this area of computer science.
>>    >>
>>    >>
>>    >> *Why am I well-suited?*
>>    >>
>>    >>
>>    >> I participated in GSoC 2010 for the Apache Derby (RDBMS in Java)
>>    >> project and finished it successfully. My sound Java knowledge,
>>    >> algorithmic knowledge of bioinformatics and experience with
>>    >> concurrent programming make me comfortable with this project and
>>    >> a good match for it.
>>    >>
>>    >>
>>    >> “Nirmal joined the Apache Derby community as a *Google Summer
>>    >> of Code* student for the summer of 2010. In this role, Nirmal
>>    >> wrote a very useful tool called PlanExporter. This tool will help
>>    >> users of the *Apache Derby* database understand and fix
>>    >> performance issues in their data-rich applications. Nirmal fit
>>    >> well into our open-source community, collaborating with other
>>    >> engineers, proceeding incrementally, and seeking and taking advice
>>    >> cheerfully. Nirmal's contributions to Apache Derby are highly
>>    >> respected.” - *Richard Hillegas*
>>    >> <http://www.linkedin.com/profile/view?id=31609548&noCreateProposal=true&goback=%2Enpe_*1_*1_*1_*1>,
>>    >> /Senior Software Engineer, Sun Microsystems./
>>    >>
>>    >> “Nirmal's work on the Derby PlanExporter tool as part of the
>>    >> Google 2010 Summer of Code was clear, well-executed and
>>    >> successful. Furthermore, every member of the Derby team that I've
>>    >> spoken to has been pleased with Nirmal's contributions to the
>>    >> community and we look forward to having Nirmal continuing to work
>>    >> with Derby in the future.” - *Bryan Pendleton*
>>    >> <http://www.linkedin.com/profile/view?id=492758&noCreateProposal=true&goback=%2Enpe_*1_*1_*1_*1>,
>>    >> /Committer, Apache Derby./
>>    >>
>>    >> *Programming Experiences and Skills*
>>    >>
>>    >> ·Completed the “short coding exercise” (all three goals) given
>>    >> by the mentor.
>>    >>
>>    >> ·Final Year Project: SeMap is our four-member final year project,
>>    >> which I lead. The objective is to develop a superior
>>    >> framework for mapping English Language Semantic Dependency
>>    >> Relationships to sets of semantic frames with reasonable accuracy
>>    >> for complex sentences, with an integrated statistical linguistics
>>    >> based artificial intelligence component to allow automatic
>>    >> extensibility. We are working under OpenCog.org, a FOSS foundation,
>>    >> under the supervision of Dr. Ben Goertzel. Technologies: [Java,
>>    >> Drools]
>>    >>
>>    >> *Contributions to Open Source world*
>>    >>
>>    >> ·Implemented the PlanExporter tool, which allows Apache Derby users
>>    >> to view and understand the query plan followed by the optimizer.
>>    >> Technologies: [Java, XML, XSLT, HTML and CSS] (Google Summer of
>>    >> Code – 2010 project)
>>    >>
>>    >> ·Solved many issues in Apache Derby
>>    >> https://issues.apache.org/jira/secure/IssueNavigator.jspa?
>>    >>
>>    >> ·Continuing to work on Apache Derby even after the summer of code.
>>    >>
>>    >> *Project Rationale*
>>    >>
>>    >> The calculation of simple physicochemical properties for
>>    >> biopolymers is an important tool in the arsenal of the molecular
>>    >> biologist. Theoretically calculated quantities like extinction
>>    >> coefficients, isoelectric points, hydrophobicities and instability
>>    >> indices are useful guides as to how a molecule behaves in an
>>    >> experiment. Many tools for calculating these properties exist,
>>    >> including widely used open-source implementations in EMBOSS and
>>    >> BioPerl, but only some are currently available in BioJava3. The
>>    >> aim of this project is to port or produce new implementations of
>>    >> standard algorithms for a range of calculations within BioJava3.
>>    >>
>>    >> *Project Scope*
>>    >>
>>    >> The project will primarily focus on developing the following
>>    >> functionality:
>>    >>
>>    >>   1. Finding molecular weight of a sequence
>>    >>   2. Finding extinction coefficient of a protein
>>    >>   3. Finding instability index of a protein
>>    >>   4. Finding aliphatic index of a protein
>>    >>   5. Finding GRAVY (Grand Average of Hydropathy) value of a peptide
>>    >>      or a protein
>>    >>   6. Finding isoelectric point of a sequence
>>    >>   7. Finding number of amino acids in a protein (His, Met, Cys)
>>    >>
>>    >>
>>    >> *Project Plan*
>>    >>
>>    >> *April 20 - May 10*
>>    >>
>>    >>    * Read up on the BioJava3 design
>>    >>      http://biojava.org/wiki/BioJava3_Design
>>    >>    * Read up on the BioJava3 data model
>>    >>      http://www.biojava.org/wiki/BioJava3_Proposal
>>    >>    * Get an understanding of how each BioJava3 module works and
>>    >>      what functionality it provides.
>>    >>    * Find and read up on algorithms that provide the functionality
>>    >>      listed above.
>>    >>    * Identify which existing methods and tools in BioJava3 could
>>    >>      be used.
>>    >>
>>    >> *May 11 - May 24*
>>    >>
>>    >>    * Implement functions to calculate the molecular weight of a
>>    >>      sequence and the extinction coefficient of a protein, using
>>    >>      multiple threads where possible.
>>    >>    * Implement functional test cases using JUnit.
>>    >>    * Develop high level documentation for end users.
>>    >>
>>    >> *May 24 - July 10*
>>    >>
>>    >>    * Preparing for the mid-term evaluation of the project.
>>    >>
>>    >>
>>    >> *July 12 - August 15*
>>    >>
>>    >>    * Implement functions to calculate the following, using
>>    >>      multiple threads where possible:
>>    >>
>>    >>          o Instability index of a protein
>>    >>          o Aliphatic index of a protein
>>    >>          o GRAVY (Grand Average of Hydropathy) value for a
>>    >>            peptide or a protein
>>    >>          o Isoelectric point of a sequence
>>    >>          o Number of amino acids in a protein (His, Met, Cys)
>>    >>
>>    >>    * Implement functional test cases using JUnit.
>>    >>    * Update the high level documentation for end users.
>>    >>
>>    >> *August 16 - August 22*
>>    >>
>>    >>    * Wrap up the work done and polish the code.
>>    >>    * Create the Javadoc API documentation.
>>    >>    * Prepare for the final evaluation.
>>    >>
>>    >> *August 26*
>>    >>
>>    >>    * Final evaluation deadline.
>>    >>
>>    >> *Project Deliverables*
>>    >>
>>    >> ·Java library with above mentioned functionalities.
>>    >>
>>    >> ·Command line executables.
>>    >>
>>    >> ·Javadoc API documentation of the library.
>>    >>
>>    >> ·Functional test cases.
>>    >>
>>    >> ·High level end user documentation
>>    >>
>>    >>
>>    >> --
>>    >> Best Regards,
>>    >> Nirmal
>>    >>
>>    >> C.S.Nirmal J. Fernando
>>    >> Department of Computer Science & Engineering,
>>    >> Faculty of Engineering,
>>    >> University of Moratuwa,
>>    >> Sri Lanka.
>>    >>
>>    >> Blog: http://nirmalfdo.blogspot.com/
>>    >>
>>    >
>>    >
>>
>>
>>
>>    --
>>    Best Regards,
>>    Nirmal
>>
>>    C.S.Nirmal J. Fernando
>>    Department of Computer Science & Engineering,
>>    Faculty of Engineering,
>>    University of Moratuwa,
>>    Sri Lanka.
>>    Blog: http://nirmalfdo.blogspot.com/
>>
>>
>>
>>
>> --
>> Best Regards,
>> Nirmal
>>
>> C.S.Nirmal J. Fernando
>> Department of Computer Science & Engineering,
>> Faculty of Engineering,
>> University of Moratuwa,
>> Sri Lanka.
>>
>> Blog: http://nirmalfdo.blogspot.com/
>>
>>
>


-- 
Best Regards,
Nirmal

C.S.Nirmal J. Fernando
Department of Computer Science & Engineering,
Faculty of Engineering,
University of Moratuwa,
Sri Lanka.

Blog: http://nirmalfdo.blogspot.com/



