[Biojava-dev] [GSoC] Project Proposal
Peter Troshin
p.v.troshin at dundee.ac.uk
Thu Apr 7 09:45:24 UTC 2011
Hi Nirmal,
Actually, I think the best way to find out what else you can calculate is
to ask the community. Explain the context (that you are a GSoC student,
etc.) and the project, and see if anyone suggests something that would be
a good fit for the project!
Regards,
Peter
On 06/04/2011 17:58, Nirmal Fernando wrote:
> Hi Peter,
>
> On Wed, Apr 6, 2011 at 9:28 PM, Peter Troshin
> <p.v.troshin at dundee.ac.uk> wrote:
>
> Hi Nirmal,
>
> Thanks for improving your proposal.
> Yes, this seems useful although it may be a little out of scope
> for this project. I think that calculating some other useful
> property of the peptide/protein or nucleic acid would have been a
> better fit.
>
>
> I see! What do you think about the following calculations:
>
> * Calculate volume of an amino acid sequence
> * Calculate amino acid composition, e.g. ACSGGS:
>   o Alanine A 16.67%
>   o Cysteine C 16.67%
>   o Glycine G 33.33%
>   o Serine S 33.33%
> * Calculate atomic composition of a protein
> * Sequence word count: count the number of occurrences of a word
>   in a sequence.
>   o Sequence word count('GCTATAACGTATATATAT','TATA') = 3
> * Count n-mers in a nucleotide sequence, e.g. AAGT:
>   o dimer counts: AA - 1, AG - 1, GT - 1 and all others 0
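
Three of the calculations listed above (composition, word count and n-mer counts) can be sketched in plain Java as follows. This is only an illustration under stated assumptions: the class `SequenceCounts` and its method names are hypothetical and are not part of BioJava, and sequences are handled as plain strings. The word count is assumed to be non-overlapping, since that is the reading that reproduces the value 3 in the example; counting overlapping matches in the same sequence would give 4.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration only; not a BioJava class. */
final class SequenceCounts {

    /** Percentage of each symbol in the sequence, keyed by one-letter code. */
    static Map<Character, Double> composition(String seq) {
        Map<Character, Double> percent = new TreeMap<>();
        if (seq.isEmpty()) {
            return percent;
        }
        double share = 100.0 / seq.length(); // contribution of one residue
        for (char c : seq.toCharArray()) {
            percent.merge(c, share, Double::sum);
        }
        return percent;
    }

    /** Non-overlapping occurrences of word, scanning left to right (assumption). */
    static int wordCount(String seq, String word) {
        if (word.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = seq.indexOf(word);
        while (from >= 0) {
            count++;
            // Resume the search after the end of the match just found.
            from = seq.indexOf(word, from + word.length());
        }
        return count;
    }

    /** Counts of every overlapping window of length n; absent keys mean zero. */
    static Map<String, Integer> nmerCounts(String seq, int n) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i + n <= seq.length(); i++) {
            counts.merge(seq.substring(i, i + n), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(composition("ACSGGS"));
        System.out.println(wordCount("GCTATAACGTATATATAT", "TATA"));
        System.out.println(nmerCounts("AAGT", 2));
    }
}
```

With the inputs from the list, `wordCount` returns 3 and `nmerCounts` reports AA, AG and GT once each; n-mers that never occur are simply absent from the map rather than stored as zero.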
>
> Would these be too simple?
>
> Thanks.
>
> Regards,
> Peter
>
>
>
>
> On 06/04/2011 15:57, Nirmal Fernando wrote:
>
> Hi,
>
> In addition to the functionalities provided in my proposal, I
> would like to build a tool like http://gcua.schoedl.de/ which
> will be used to display the codon quality in codon usage
> frequency values.
>
> It would be nice to get the feedback of the community
> regarding the importance of a tool like this to BioJava3.
>
> Thanks.
>
>
> On Tue, Apr 5, 2011 at 9:34 PM, Nirmal Fernando
> <nirmal070125 at gmail.com> wrote:
>
> Hi Peter,
>
> On Tue, Apr 5, 2011 at 9:18 PM, Peter Troshin
> <p.v.troshin at dundee.ac.uk> wrote:
> >
> > Hi Nirmal,
> >
> > First of all, thanks for the proposal; it looks good.
> > However, I think that one of the benefits of my project idea is
> > that it lets you implement a few other methods that are of
> > interest to you. It is a pity that you did not use this
> > opportunity. I strongly encourage you to use your knowledge and to
> > look at the other properties that you can implement for the
> > benefit of the community. Otherwise it looks like you are not
> > terribly interested in Bioinformatics!
>
> Sorry for the disappointment! This week is a bit busy for me,
> with a few events at the University; that's why I didn't get much
> time to look for other methods. But I'll try my best to research and
> propose some other methods which will benefit the community.
>
> >
> > Also, I think that the best method of learning BioJava is trying
> > it. So I'd put in the project plan that you will write test
> > cases to check out the parts of BioJava that you will be using.
> > Apart from helping you learn it in depth, it will also help to
> > ensure that the BioJava code behaves.
>
> Thanks for the tip! :)
>
> >
> > Regards,
> > Peter
> >
> > On 05/04/2011 14:40, Nirmal Fernando wrote:
> >>
> >> Hi All,
> >>
> >> I have prepared my GSoC proposal for BioJava [1]. I would highly
> >> appreciate your valuable feedback.
> >>
> >> Thanks.
> >>
> >> [1]
> >>
> >>
> >> Google Summer of Code 2011 - Project Proposal
> >>
> >> Organization
> >>
> >>
> >>
> >> *Open Bioinformatics Foundation- BioJava*
> >>
> >> Project
> >>
> >>
> >>
> >> *Calculation of Physicochemical Properties of Amino Acids*
> >>
> >> Student Name
> >>
> >>
> >>
> >> C. S. Nirmal J. Fernando.
> >>
> >> E-mail
> >>
> >>
> >>
> >> nirmal070125 at gmail.com
>
> >>
> >> IM
> >>
> >>
> >>
> >> nirmal070125 (Google Talk)
> >>
> >> nirmalfdo (IRC – freenode.net)
> >>
> >> Address
> >>
> >>
> >>
> >> 47, Keels Housing Scheme, Pinwatte, Panadura, Sri Lanka.
> >>
> >> Mobile No.
> >>
> >>
> >>
> >> +94715779733
> >>
> >> *Why I am interested?*
> >>
> >> I have recently finished a course module on bioinformatics and
> >> have a basic understanding of bioinformatics-related algorithms,
> >> which made me interested in this area of computer science.
> >>
> >> *Why I am well-suited?*
> >>
> >>
> >> I participated in GSoC 2010 for the Apache Derby (RDBMS in Java)
> >> project and successfully finished it. My sound Java knowledge,
> >> algorithmic knowledge of bioinformatics and experience with
> >> concurrent programming make me comfortable with, and a good match
> >> for, this project.
> >>
> >>
> >> “Nirmal joined the Apache Derby community as a *Google Summer
> >> of Code* student for the summer of 2010. In this role, Nirmal
> >> wrote a very useful tool called PlanExporter. This tool will help
> >> users of the *Apache Derby* database understand and fix
> >> performance issues in their data-rich applications. Nirmal fit
> >> well into our open-source community, collaborating with other
> >> engineers, proceeding incrementally, and seeking and taking advice
> >> cheerfully. Nirmal's contributions to Apache Derby are highly
> >> respected.” - *Richard Hillegas*
> >> <http://www.linkedin.com/profile/view?id=31609548&noCreateProposal=true&goback=%2Enpe_*1_*1_*1_*1>,
> >> /Senior Software Engineer, Sun Microsystems/
> >>
> >>
> >> “Nirmal's work on the Derby PlanExporter tool as part of the
> >> Google 2010 Summer of Code was clear, well-executed and
> >> successful. Furthermore, every member of the Derby team that I've
> >> spoken to has been pleased with Nirmal's contributions to the
> >> community and we look forward to having Nirmal continuing to work
> >> with Derby in the future.” - *Bryan Pendleton*
> >> <http://www.linkedin.com/profile/view?id=492758&noCreateProposal=true&goback=%2Enpe_*1_*1_*1_*1>,
> >> /Committer, Apache Derby/
> >>
> >> *Programming Experiences and Skills*
> >>
> >> · Completed the “short coding exercise” (all three goals) given
> >> by the mentor.
> >>
> >> · Final Year Project: SeMap is our four-member final year project,
> >> which I lead. The objective is to develop a superior framework for
> >> mapping English-language semantic dependency relationships to sets
> >> of semantic frames with reasonable accuracy for complex sentences,
> >> with an integrated artificial intelligence component, based on
> >> statistical linguistics, to allow automatic extensibility. We are
> >> working under OpenCog.org, a FOSS foundation, under the supervision
> >> of Dr. Ben Goertzel. Technologies: [Java, Drools]
> >>
> >> *Contributions to Open Source world*
> >>
> >> · Implemented the PlanExporter tool, which allows Apache Derby users
> >> to view and understand the query plan followed by the optimizer.
> >> Technologies: [Java, XML, XSLT, HTML and CSS] (Google Summer of
> >> Code 2010 project)
> >>
> >> · Solved many issues in Apache Derby:
> >> https://issues.apache.org/jira/secure/IssueNavigator.jspa?
> >>
> >> · Continuing to work on Apache Derby even after the Summer
> >> of Code.
> >>
> >> *Project Rationale*
> >>
> >> The calculation of simple physicochemical properties for
> >> biopolymers is an important tool in the arsenal of the molecular
> >> biologist. Theoretically calculated quantities like extinction
> >> coefficients, isoelectric points, hydrophobicities and instability
> >> indices are useful guides as to how a molecule behaves in an
> >> experiment. Many tools for calculating these properties exist,
> >> including widely used open-source implementations in EMBOSS and
> >> BioPerl, but only some are currently available in BioJava3. The
> >> aim of this project is to port or produce new implementations of
> >> standard algorithms for a range of calculations within BioJava3.
> >>
> >> *Project Scope *
> >>
> >> The project will primarily focus on developing the following
> >> functionalities:
> >>
> >> 1. Finding molecular weight of a sequence
> >> 2. Finding extinction coefficient of a protein
> >> 3. Finding instability index of a protein
> >> 4. Finding aliphatic index of a protein
> >> 5. Finding GRAVY (Grand Average of Hydropathy) value of a peptide
> >> or a protein
> >> 6. Finding isoelectric point of a sequence
> >> 7. Finding number of amino acids in a protein (His, Met, Cys)
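
Three of the properties in the scope list above reduce to simple sums over the residues, which the following sketch illustrates. It is not the proposal's implementation and not BioJava's API: the class `ProteinIndices` and its method names are hypothetical. It assumes the Kyte-Doolittle hydropathy scale for GRAVY, Ikai's formula for the aliphatic index (mole percentages weighted 1, 2.9 and 3.9), and the commonly used 280 nm coefficients of 1490 per Tyr, 5500 per Trp and 125 per cystine for the extinction coefficient. Sequences are plain upper-case strings of the 20 standard one-letter codes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not a BioJava class. */
final class ProteinIndices {

    /** Kyte-Doolittle hydropathy values, keyed by one-letter code. */
    private static final Map<Character, Double> KYTE_DOOLITTLE = new HashMap<>();
    static {
        String codes = "ARNDCQEGHILKMFPSTWYV";
        double[] values = {1.8, -4.5, -3.5, -3.5, 2.5, -3.5, -3.5, -0.4, -3.2, 4.5,
                           3.8, -3.9, 1.9, 2.8, -1.6, -0.8, -0.7, -0.9, -1.3, 4.2};
        for (int i = 0; i < codes.length(); i++) {
            KYTE_DOOLITTLE.put(codes.charAt(i), values[i]);
        }
    }

    /** GRAVY: sum of hydropathy values divided by sequence length. */
    static double gravy(String seq) {
        if (seq.isEmpty()) {
            throw new IllegalArgumentException("empty sequence");
        }
        double sum = 0.0;
        for (char c : seq.toCharArray()) {
            Double h = KYTE_DOOLITTLE.get(c);
            if (h == null) {
                throw new IllegalArgumentException("unknown residue: " + c);
            }
            sum += h;
        }
        return sum / seq.length();
    }

    private static int count(String seq, char residue) {
        int n = 0;
        for (char c : seq.toCharArray()) {
            if (c == residue) {
                n++;
            }
        }
        return n;
    }

    /** Aliphatic index: X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)), X in mole percent. */
    static double aliphaticIndex(String seq) {
        if (seq.isEmpty()) {
            throw new IllegalArgumentException("empty sequence");
        }
        double scale = 100.0 / seq.length(); // converts a count to mole percent
        return count(seq, 'A') * scale
                + 2.9 * count(seq, 'V') * scale
                + 3.9 * (count(seq, 'I') + count(seq, 'L')) * scale;
    }

    /** Molar extinction coefficient at 280 nm in M^-1 cm^-1 (assumed coefficients). */
    static int extinctionCoefficient(String seq, boolean cystinesFormed) {
        int cystines = cystinesFormed ? count(seq, 'C') / 2 : 0; // Cys residues pair up
        return 1490 * count(seq, 'Y') + 5500 * count(seq, 'W') + 125 * cystines;
    }

    public static void main(String[] args) {
        System.out.println(gravy("AG"));
        System.out.println(aliphaticIndex("AVIL"));
        System.out.println(extinctionCoefficient("YWCC", true));
    }
}
```

For example, `gravy("AG")` averages 1.8 and -0.4 to about 0.7, and `extinctionCoefficient("YWCC", true)` gives 7115 (6990 if the cysteines are assumed to be reduced). Counting His, Met and Cys, item 7 in the list, is the same per-residue count that the private `count` helper performs here.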
> >>
> >>
> >> *Project Plan*
> >>
> >> *April 20 - May 10*
> >>
> >> * Read up on BioJava3 design
> >> http://biojava.org/wiki/BioJava3_Design
> >> * Read up on BioJava3 data model
> >> http://www.biojava.org/wiki/BioJava3_Proposal
> >> * Get an understanding of how each BioJava3 module works and
> >> what functionality it provides.
> >> * Find and read up on algorithms to provide the above-mentioned
> >> functionalities.
> >> * Identify the possibility of using existing methods and tools in
> >> BioJava3
> >>
> >> *May 11 - May 24*
> >>
> >> * Implement functions to calculate molecular weight of a sequence
> >> and extinction coefficient of a protein, using multiple threads
> >> where possible.
> >> * Implement functional test cases using JUnit.
> >> * Develop high-level documentation for end users.
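
The plan mentions using multiple threads where possible. One case that usually parallelises cleanly is computing the same property for many independent sequences, and a minimal sketch of that idea with `java.util.concurrent` follows. The class `ParallelProperties`, the method `computeAll` and the placeholder property `gcFraction` are all hypothetical names chosen for illustration; any per-sequence calculation with the same shape could be passed in instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToDoubleFunction;

/** Hypothetical helper for illustration only; not a BioJava class. */
final class ParallelProperties {

    /** Applies one property function to every sequence on a fixed-size thread pool. */
    static double[] computeAll(List<String> sequences,
                               ToDoubleFunction<String> property,
                               int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Double>> tasks = new ArrayList<>();
            for (String s : sequences) {
                tasks.add(() -> property.applyAsDouble(s));
            }
            // invokeAll blocks until all tasks finish and keeps the input order.
            List<Future<Double>> futures = pool.invokeAll(tasks);
            double[] results = new double[futures.size()];
            for (int i = 0; i < results.length; i++) {
                results[i] = futures.get(i).get();
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("property calculation failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    /** Placeholder property: fraction of G and C symbols in a nucleotide sequence. */
    static double gcFraction(String seq) {
        if (seq.isEmpty()) {
            return 0.0;
        }
        int gc = 0;
        for (char c : seq.toCharArray()) {
            if (c == 'G' || c == 'C') {
                gc++;
            }
        }
        return (double) gc / seq.length();
    }

    public static void main(String[] args) {
        double[] r = computeAll(List.of("GGCC", "ATAT", "GATC"),
                                ParallelProperties::gcFraction, 2);
        for (double v : r) {
            System.out.println(v);
        }
    }
}
```

Parallelising across sequences rather than within a single sequence keeps each task free of shared state; for short sequences the cost of thread management may well outweigh any gain, so this is worth measuring before adopting.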
> >>
> >> *May 24 - July 10*
> >>
> >> * Preparing for the mid-term evaluation of the project.
> >>
> >>
> >> *July 12 - August 15*
> >>
> >> * Implement functions to calculate the following, using multiple
> >> threads where possible:
> >>
> >> o Instability index of a protein
> >> o Aliphatic index of a protein
> >> o GRAVY (Grand Average of Hydropathy) value for a peptide or
> >> a protein
> >> o Isoelectric point of a sequence
> >> o Number of amino acids in a protein (His, Met, Cys)
> >>
> >> * Implement functional test cases using JUnit.
> >> * Update the high-level documentation for end users.
> >>
> >> *August 16 - August 22*
> >>
> >> * Wrap up the work done and polish the code.
> >> * Create the Javadoc API documentation.
> >> * Prepare for the final evaluation.
> >>
> >> *August 26*
> >>
> >> * Final evaluation deadline.
> >>
> >> *Project Deliverables*
> >>
> >> · Java library with the above-mentioned functionalities.
> >>
> >> · Command line executables.
> >>
> >> · Javadoc API of the library.
> >>
> >> · Functional test cases.
> >>
> >> · High-level end user documentation.
> >>
> >>
> >> --
> >> Best Regards,
> >> Nirmal
> >>
> >> C.S.Nirmal J. Fernando
> >> Department of Computer Science & Engineering,
> >> Faculty of Engineering,
> >> University of Moratuwa,
> >> Sri Lanka.
> >>
> >> Blog: http://nirmalfdo.blogspot.com/
> >>
> >
> >
>