[Biojava-dev] [GSoC] Project Proposal

Peter Troshin p.v.troshin at dundee.ac.uk
Thu Apr 7 09:45:24 UTC 2011


Hi Nirmal,

Actually, I think the best way to find out what else you can calculate is
to ask the community. Explain the context (that you are a GSoC student,
etc.) and the project, and see if anyone suggests something that would be
a good fit for the project!

Regards,
Peter



On 06/04/2011 17:58, Nirmal Fernando wrote:
> Hi Peter,
>
> On Wed, Apr 6, 2011 at 9:28 PM, Peter Troshin
> <p.v.troshin at dundee.ac.uk> wrote:
>
>     Hi Nirmal,
>
>     Thanks for improving your proposal.
>     Yes, this seems useful although it may be a little out of scope
>     for this project. I think that calculating some other useful
>     property of the peptide/protein or nucleic acid would have been a
>     better fit.
>
>
> I see! What do you think about the following calculations:
>
>     * Calculate volume of an amino acid sequence
>     * Calculate amino acid composition, e.g. for ACSGGS:
>           o Alanine A 16.67%
>           o Cysteine C 16.67%
>           o Glycine G 33.33%
>           o Serine S 33.33%
>     * Calculate atomic composition of a protein
>     * Sequence word count: count the number of occurrences of a word
>       in a sequence, e.g.
>           o Sequence word count('GCTATAACGTATATATAT','TATA') = 3
>     * Count n-mers in a nucleotide sequence, e.g. for AAGT:
>           o dimer counts: AA - 1, AG - 1, GT - 1, and all others 0
>
> Would these be too simple?
>
> Thanks.
>
>     Regards,
>     Peter
>
>
>
>
>     On 06/04/2011 15:57, Nirmal Fernando wrote:
>
>         Hi,
>
>         In addition to the functionalities provided in my proposal, I
>         would like to build a tool like http://gcua.schoedl.de/ which
>         will be used to display the codon quality in codon usage
>         frequency values.
>
>         It would be nice to get the feedback of the community
>         regarding the importance of a tool like this to BioJava3.
>
>         Thanks.
>
>
>         On Tue, Apr 5, 2011 at 9:34 PM, Nirmal Fernando
>         <nirmal070125 at gmail.com> wrote:
>
>            Hi Peter,
>
>            On Tue, Apr 5, 2011 at 9:18 PM, Peter Troshin
>            <p.v.troshin at dundee.ac.uk> wrote:
>         >
>         > Hi Nirmal,
>         >
>         > First of all, thanks for the proposal; it looks good.
>         > However, I think that one of the benefits of my project idea is
>         > that it lets you implement a few other methods that are of
>         > interest to you. It is a pity that you did not use this
>         > opportunity. I strongly encourage you to use your knowledge and to
>         > look at the other properties that you can implement for the
>         > benefit of the community. Otherwise it looks like you are not
>         > terribly interested in Bioinformatics!
>
>            Sorry for the disappointment! This week is a bit busy for me,
>            with a few events at the University, which is why I didn't get
>            much time to look for other methods. But I'll try my best to
>            research and propose some other methods which will benefit the
>            community.
>
>         >
>         > Also, I think that the best method of learning BioJava is trying
>         > it. So I'd put in the project plan that you will write test cases
>         > to check out the parts of BioJava that you will be using. Apart
>         > from helping you learn it in depth, it will also help to ensure
>         > that the BioJava code behaves.
>
>            Thanks for the tip! :)
>
>         >
>         > Regards,
>         > Peter
>         >
>         > On 05/04/2011 14:40, Nirmal Fernando wrote:
>         >>
>         >> Hi All,
>         >>
>         >> I have prepared my GSoC proposal for BioJava [1]. I would highly
>         >> appreciate your valuable feedback.
>         >>
>         >> Thanks.
>         >>
>         >> [1]
>         >>
>         >>
>         >>  Google Summer of Code 2011 - Project Proposal
>         >>
>         >> Organization
>         >>
>         >>
>         >>
>         >> *Open Bioinformatics Foundation - BioJava*
>         >>
>         >> Project
>         >>
>         >>
>         >>
>         >> *Calculation of Physicochemical Properties of Amino Acids*
>         >>
>         >> Student Name
>         >>
>         >>
>         >>
>         >> C. S. Nirmal J. Fernando.
>         >>
>         >> E-mail
>         >>
>         >>
>         >>
>         >> nirmal070125 at gmail.com
>         >>
>         >> IM
>         >>
>         >>
>         >>
>         >> nirmal070125 (Google Talk)
>         >>
>         >> nirmalfdo (IRC – freenode.net <http://freenode.net>)
>         >>
>         >> Address
>         >>
>         >>
>         >>
>         >> 47, Keels Housing Scheme, Pinwatte, Panadura, Sri Lanka.
>         >>
>         >> Mobile No.
>         >>
>         >>
>         >>
>         >> +94715779733
>         >>
>         >> *Why am I interested?*
>         >>
>         >> I have recently finished a course module on bioinformatics and
>         >> have a basic understanding of bioinformatics-related algorithms,
>         >> which made me interested in this area of computer science.
>         >>
>         >>
>         >> *Why am I well-suited?*
>         >>
>         >>
>         >> I participated in GSoC 2010 for the Apache Derby (RDBMS in Java)
>         >> project and successfully finished the project. Sound Java
>         >> knowledge, algorithmic knowledge of bioinformatics and experience
>         >> with concurrent programming make me comfortable with, and a good
>         >> match for, this project.
>         >>
>         >>
>         >> “Nirmal joined the Apache Derby community as a *Google Summer
>         >> of Code* student for the summer of 2010. In this role, Nirmal
>         >> wrote a very useful tool called PlanExporter. This tool will help
>         >> users of the *Apache Derby* database understand and fix
>         >> performance issues in their data-rich applications. Nirmal fit
>         >> well into our open-source community, collaborating with other
>         >> engineers, proceeding incrementally, and seeking and taking advice
>         >> cheerfully. Nirmal's contributions to Apache Derby are highly
>         >> respected.” - *Richard Hillegas*
>         >> <http://www.linkedin.com/profile/view?id=31609548&noCreateProposal=true&goback=%2Enpe_*1_*1_*1_*1>,
>         >> /Senior Software Engineer, Sun Microsystems/.
>         >>
>         >>
>         >> “Nirmal's work on the Derby PlanExporter tool as part of the
>         >> Google 2010 Summer of Code was clear, well-executed and
>         >> successful. Furthermore, every member of the Derby team that I've
>         >> spoken to has been pleased with Nirmal's contributions to the
>         >> community and we look forward to having Nirmal continuing to work
>         >> with Derby in the future.” - *Bryan Pendleton*
>         >> <http://www.linkedin.com/profile/view?id=492758&noCreateProposal=true&goback=%2Enpe_*1_*1_*1_*1>,
>         >> /Committer, Apache Derby/.
>         >>
>         >> *Programming Experiences and Skills*
>         >>
>         >> ·Completed the “short coding exercise” (all three goals) given
>         >> by the mentor.
>         >>
>         >> ·Final Year Project: SeMap is our final-year project, a
>         >> four-member effort led by me. The objective is to develop a
>         >> superior framework for mapping English-language semantic
>         >> dependency relationships to sets of semantic frames with
>         >> reasonable accuracy for complex sentences, with an integrated
>         >> artificial intelligence component based on statistical
>         >> linguistics to allow automatic extensibility. We are working
>         >> under OpenCog.org, a FOSS foundation, under the supervision of
>         >> Dr. Ben Goertzel. Technologies: [Java, Drools]
>         >>
>         >> *Contributions to the Open Source world*
>         >>
>         >> ·Implemented the PlanExporter tool, which allows Apache Derby users
>         >> to view and understand the query plan followed by the optimizer.
>         >> Technologies: [Java, XML, XSLT, HTML and CSS] (Google Summer of
>         >> Code – 2010 project)
>         >>
>         >> ·Solved many issues in Apache Derby
>         >> https://issues.apache.org/jira/secure/IssueNavigator.jspa?
>         >>
>         >> ·Continuing to work on Apache Derby even after the Summer of Code.
>         >>
>         >> *Project Rationale*
>         >>
>         >> The calculation of simple physicochemical properties for
>         >> biopolymers is an important tool in the arsenal of the molecular
>         >> biologist. Theoretically calculated quantities like extinction
>         >> coefficients, isoelectric points, hydrophobicities and instability
>         >> indices are useful guides as to how a molecule behaves in an
>         >> experiment. Many tools for calculating these properties exist,
>         >> including widely used open-source implementations in EMBOSS and
>         >> BioPerl, but only some are currently available in BioJava3. The
>         >> aim of this project is to port or produce new implementations of
>         >> standard algorithms for a range of calculations within BioJava3.
>         >>
>         >> *Project Scope*
>         >>
>         >> The project will primarily focus on developing the following
>         >> functionalities:
>         >>
>         >>   1. Finding molecular weight of a sequence
>         >>   2. Finding extinction coefficient of a protein
>         >>   3. Finding instability index of a protein
>         >>   4. Finding aliphatic index of a protein
>         >>   5. Finding GRAVY (Grand Average of Hydropathy) value of a
>         >>      peptide or a protein
>         >>   6. Finding isoelectric point of a sequence
>         >>   7. Finding number of amino acids in a protein (His, Met, Cys)
>         >>
>         >> *Project Plan*
>         >>
>         >> *April 20 - May 10*
>         >>
>         >>    * Read up on BioJava3 design
>         >>      http://biojava.org/wiki/BioJava3_Design
>         >>    * Read up on BioJava3 data model
>         >>      http://www.biojava.org/wiki/BioJava3_Proposal
>         >>    * Get an understanding of how each BioJava3 module works and
>         >>      its functionalities.
>         >>    * Find and read up on algorithms that provide the
>         >>      above-mentioned functionalities.
>         >>    * Identify the possibility of using existing methods and tools
>         >>      in BioJava3
>         >>
>         >> *May 11 - May 24*
>         >>
>         >>    * Implement functions to calculate molecular weight of a
>         >>      sequence and extinction coefficient of a protein, using
>         >>      multiple threads where possible.
>         >>    * Implement functional test cases using JUnit.
>         >>    * Develop high-level documentation for end users.
>         >>
>         >> *May 24 - July 10*
>         >>
>         >>    * Preparing for the mid-term evaluation of the project.
>         >>
>         >> *July 12 - August 15*
>         >>
>         >>    * Implement functions to calculate the following, using
>         >>      multiple threads where possible:
>         >>
>         >>          o Instability index of a protein
>         >>          o Aliphatic index of a protein
>         >>          o GRAVY (Grand Average of Hydropathy) value for a
>         >>            peptide or a protein
>         >>          o Isoelectric point of a sequence
>         >>          o Number of amino acids in a protein (His, Met, Cys)
>         >>
>         >>    * Implement functional test cases using JUnit.
>         >>    * Update the high-level documentation for end users.
>         >>
>         >> *August 16 - August 22*
>         >>
>         >>    * Wrap up the work done and polish the code.
>         >>    * Create the Javadoc API documentation.
>         >>    * Prepare for the final evaluation.
>         >>
>         >> *August 26*
>         >>
>         >>    * Final evaluation deadline.
>         >>
>         >> *Project Deliverables*
>         >>
>         >> ·Java library with the above-mentioned functionalities.
>         >>
>         >> ·Command-line executables.
>         >>
>         >> ·Javadoc API documentation of the library.
>         >>
>         >> ·Functional test cases.
>         >>
>         >> ·High-level end-user documentation.
>         >>
>         >>
>         >> --
>         >> Best Regards,
>         >> Nirmal
>         >>
>         >> C.S.Nirmal J. Fernando
>         >> Department of Computer Science & Engineering,
>         >> Faculty of Engineering,
>         >> University of Moratuwa,
>         >> Sri Lanka.
>         >>
>         >> Blog: http://nirmalfdo.blogspot.com/
>         >>
>         >
>         >
>
>
>
>            --
>            Best Regards,
>            Nirmal
>
>
>
>
>         -- 
>         Best Regards,
>         Nirmal
>
>
>
>
>
> -- 
> Best Regards,
> Nirmal
>
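The sequence statistics listed in the quoted thread (amino acid composition, word count and n-mer counts) can be sketched in plain Java as below. This is only an illustration: the class and method names are hypothetical and are not part of BioJava. Note that the word-count example from the thread (TATA in GCTATAACGTATATATAT = 3) matches a count of non-overlapping occurrences; counting overlapping occurrences of the same word gives 4, so the sketch makes the choice explicit.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative helpers only; names are hypothetical, not BioJava API. */
final class SequenceStats {

    /** Percentage of each symbol in the sequence, keyed by one-letter code. */
    static Map<Character, Double> composition(String seq) {
        Map<Character, Double> percent = new TreeMap<>();
        if (seq.isEmpty()) {
            return percent;
        }
        double share = 100.0 / seq.length();
        for (char c : seq.toCharArray()) {
            percent.merge(c, share, Double::sum);
        }
        return percent;
    }

    /**
     * Counts occurrences of word in seq. When overlapping is false, the
     * search resumes after the end of each match, so matches share no characters.
     */
    static int countWord(String seq, String word, boolean overlapping) {
        if (word.isEmpty()) {
            throw new IllegalArgumentException("word must not be empty");
        }
        int step = overlapping ? 1 : word.length();
        int count = 0;
        int from = seq.indexOf(word);
        while (from >= 0) {
            count++;
            from = seq.indexOf(word, from + step);
        }
        return count;
    }

    /** Counts each n-mer seen by a sliding window; n-mers that never occur are absent (count 0). */
    static Map<String, Integer> countNmers(String seq, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i + n <= seq.length(); i++) {
            counts.merge(seq.substring(i, i + n), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(composition("ACSGGS"));
        System.out.println(countWord("GCTATAACGTATATATAT", "TATA", false)); // 3
        System.out.println(countWord("GCTATAACGTATATATAT", "TATA", true));  // 4
        System.out.println(countNmers("AAGT", 2)); // {AA=1, AG=1, GT=1}
    }
}
```

Returning sorted maps keeps the output stable for tests; a real implementation would presumably work on BioJava sequence objects rather than raw strings.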

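Two of the properties in the proposal's scope, GRAVY and the aliphatic index, reduce to simple sums over the residues. The sketch below assumes the Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index coefficients (2.9 for Val, 3.9 for Ile and Leu); the proposal itself does not state which tables would be used, and the class and method names are hypothetical rather than BioJava API.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only; names are hypothetical, not BioJava API. */
final class PeptideIndices {

    // Assumed scale: Kyte-Doolittle hydropathy values, in the same order as RESIDUES.
    private static final String RESIDUES = "ARNDCQEGHILKMFPSTWYV";
    private static final double[] HYDROPATHY = {
        1.8, -4.5, -3.5, -3.5, 2.5, -3.5, -3.5, -0.4, -3.2, 4.5,
        3.8, -3.9, 1.9, 2.8, -1.6, -0.8, -0.7, -0.9, -1.3, 4.2
    };
    private static final Map<Character, Double> SCALE = new HashMap<>();
    static {
        for (int i = 0; i < RESIDUES.length(); i++) {
            SCALE.put(RESIDUES.charAt(i), HYDROPATHY[i]);
        }
    }

    /** GRAVY: sum of per-residue hydropathy divided by sequence length (upper-case one-letter codes). */
    static double gravy(String seq) {
        requireNonEmpty(seq);
        double sum = 0.0;
        for (char c : seq.toCharArray()) {
            Double h = SCALE.get(c);
            if (h == null) {
                throw new IllegalArgumentException("Unknown residue: " + c);
            }
            sum += h;
        }
        return sum / seq.length();
    }

    /** Aliphatic index: mole percent of Ala + 2.9 * Val + 3.9 * (Ile + Leu). */
    static double aliphaticIndex(String seq) {
        requireNonEmpty(seq);
        int ala = 0;
        int val = 0;
        int ileLeu = 0;
        for (char c : seq.toCharArray()) {
            switch (c) {
                case 'A': ala++; break;
                case 'V': val++; break;
                case 'I':
                case 'L': ileLeu++; break;
                default: break; // other residues do not contribute
            }
        }
        return 100.0 * (ala + 2.9 * val + 3.9 * ileLeu) / seq.length();
    }

    private static void requireNonEmpty(String seq) {
        if (seq == null || seq.isEmpty()) {
            throw new IllegalArgumentException("sequence must not be empty");
        }
    }

    public static void main(String[] args) {
        System.out.println(gravy("ACSGGS"));
        System.out.println(aliphaticIndex("AVIL"));
    }
}
```

Rejecting unknown residues in `gravy` is a design choice made for this sketch; a library might instead skip ambiguous symbols or make the behaviour configurable.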


