[Biojava-l] Suggestion for porting GHMM library

Dhruv Sharma sharma.dhrv at gmail.com
Mon Apr 2 05:21:38 UTC 2012


Then I think I'll stick to the HMMER3 implementation as far as GSoC is
concerned. I hope the licensing issues are sorted out soon.

Thanks!


-- 
Dhruv Sharma
Student
B.E.(Hons.) Computer Science
BITS, Pilani
India


On Mon, Apr 2, 2012 at 9:00 AM, Scooter Willis <HWillis at scripps.edu> wrote:

> I think the HMMER implementation should be viewed as the source code of
> interest. When they went from HMMER2 to HMMER3, there were significant
> changes in what answer you get.
>
> On 4/1/12 8:15 PM, "Andreas Prlic" <andreas at sdsc.edu> wrote:
>
> >That could work in terms of license and would be an interesting
> >feature to have. I am still slightly concerned that the scale of the
> >project might be too big and it might be difficult to accomplish this
> >during the limited time of the project.
> >
> >Andreas
> >
> >On Sun, Apr 1, 2012 at 2:32 PM, Dhruv Sharma <sharma.dhrv at gmail.com>
> >wrote:
> >> Hi Andreas,
> >>
> >> In response to our last discussion, I would like to suggest porting the
> >> General Hidden Markov Model (GHMM) library (http://ghmm.org/) from C to
> >> Java.
> >>
> >> The library is licensed under the LGPL and is currently available as a
> >> release candidate (RC1). The code base is not very big, so it should be
> >> possible to port 100% of the code to Java. A pure-Java port would be more
> >> efficient than using converters or JNI and would also be platform
> >> independent.
> >>
> >> Would it be possible to add this library to BioJava?
> >>
> >> If yes, I would surely like to work on it.
> >>
> >>
> >>
> >>
> >> On Sun, Apr 1, 2012 at 11:24 PM, Andreas Prlic <andreas at sdsc.edu>
> >> wrote:
> >>>
> >>> Hi Dhruv,
> >>>
> >>> We are quite flexible regarding the projects; what we are really
> >>> looking for are sound projects and motivated students. As such, our
> >>> project suggestions are quite open. We will interact with accepted
> >>> students remotely, so a certain degree of self-sufficiency will be
> >>> required on the part of the student.
> >>>
> >>> If you already see tons of problems coming up during your initial
> >>> assessment of the project, perhaps focus your proposal on something
> >>> smaller and more achievable. There are quite a number of interesting
> >>> algorithms out there and it does not have to be one of the ones
> >>> suggested by us.
> >>>
> >>> Andreas
> >>>
> >>>
> >>>
> >>> On Sat, Mar 31, 2012 at 1:46 PM, Dhruv Sharma <sharma.dhrv at gmail.com>
> >>> wrote:
> >>> > Hi,
> >>> >
> >>> > I am Dhruv Sharma, a senior undergraduate student pursuing a
> >>> > B.E.(Hons.) in Computer Science at BITS, Pilani, India.
> >>> >
> >>> > I am very interested in 'porting BLAST algorithm to Java' as a GSoC
> >>> > 2012 project. I am proficient in Java and C, which I primarily work
> >>> > in, and I worked in C++ before migrating to Java. However, I am new
> >>> > to GSoC and haven't used version control in the past.
> >>> >
> >>> > My recent project was a web application in Java that posts a FASTA
> >>> > sequence to the remote CS-BLAST web service
> >>> > <http://toolkit.tuebingen.mpg.de/cs_blast/>, parses and auto-filters
> >>> > the results using the release date from RCSB PDB
> >>> > <http://www.rcsb.org/pdb/home/home.do>, and downloads the PDB files.
> >>> >
> >>> > Since the project aims at converting legacy C/C++ code to Java, the
> >>> > approaches already suggested on the BioJava page, and my observations
> >>> > on them, are:
> >>> >
> >>> > 1)  Using C++ to Java converters for 100% conversion. I have tried
> >>> > converting the ncbi-blast-2.2.26 source code using a few freely
> >>> > available converters, but all of them either crashed or failed to
> >>> > convert, even after I resolved certain header file dependency issues
> >>> > that emerged. Most failures occurred at function calls to
> >>> > non-standard C++ libraries.
> >>> >
> >>> > 2)  Using JNI as an alternative solution. JNI programming would be a
> >>> > tedious task and would in any case require understanding the purpose
> >>> > of the underlying C++ code. Hence, it has little advantage over
> >>> > rewriting the equivalent Java code. A significant advantage can be
> >>> > seen when there is no efficient Java alternative to the C++ code.
> >>> > However, platform dependence would still exist.
> >>> >
> >>> > As I understand the problem, a hybrid approach could be taken: code
> >>> > converters for simpler files, manual coding for tricky areas, and JNI
> >>> > for typical C++ code involving non-standard libraries. However, I am
> >>> > still not clear about my exact course of action.
> >>> >
> >>> > Can you please tell me if my analysis of the problem is correct?
> >>> > Please also comment on the feasibility of my suggested approach and
> >>> > make any suggestions, as they would help me improve the application
> >>> > draft that I will soon be sharing for review.
> >>> >
> >>> > As BLAST is a collection of programs, and keeping in mind the amount
> >>> > of code to be ported, could we work on a selected set of critical
> >>> > programs from it for the purposes of GSoC?
> >>> >
> >>> >
> >>> > Thanks.
> >>> >
> >>> > --
> >>> > Dhruv Sharma
> >>> > Student
> >>> > B.E.(Hons.) Computer Science
> >>> > BITS, Pilani
> >>> > India
> >>> > _______________________________________________
> >>> > Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
> >>> > http://lists.open-bio.org/mailman/listinfo/biojava-l
> >>
> >>
> >>
> >>
> >> --
> >> Dhruv Sharma
> >> Student
> >> B.E.(Hons.) Computer Science
> >> BITS, Pilani
> >> India
> >>
> >
>
>


