[Biojava-l] Suggestion for porting GHMM library

Andreas Prlic andreas at sdsc.edu
Mon Apr 2 13:43:51 UTC 2012


I wanted to wait and see whether we actually get a strong project proposal
before contacting the HMMER folks. At this stage I have seen a lot of
interest, but not a single proposal has been submitted for this.

Andreas



On Sun, Apr 1, 2012 at 10:21 PM, Dhruv Sharma <sharma.dhrv at gmail.com> wrote:
> Then I think I'll stick to the HMMER3 implementation as far as GSoC is
> concerned. I hope the licensing issues are sorted out soon.
>
> Thanks!
>
>
> --
> Dhruv Sharma
> Student
> B.E.(Hons.) Computer Science
> BITS, Pilani
> India
>
>
> On Mon, Apr 2, 2012 at 9:00 AM, Scooter Willis <HWillis at scripps.edu> wrote:
>>
>> I think the HMMER implementation should be viewed as the source code of
>> interest. When they went from HMMER2 to HMMER3 there were significant
>> changes in the answers you get.
>>
>> On 4/1/12 8:15 PM, "Andreas Prlic" <andreas at sdsc.edu> wrote:
>>
>> >That could work in terms of license and would be an interesting
>> >feature to have. I am still slightly concerned that the scale of the
>> >project might be too big and it might be difficult to accomplish this
>> >during the limited time of the project.
>> >
>> >Andreas
>> >
>> >On Sun, Apr 1, 2012 at 2:32 PM, Dhruv Sharma <sharma.dhrv at gmail.com>
>> >wrote:
>> >> Hi Andreas,
>> >>
>> >> In response to our last discussion, I would like to suggest porting the
>> >> General Hidden Markov Model (GHMM) library (http://ghmm.org/) from C to
>> >> Java.
>> >>
>> >> The library is licensed under the LGPL and is currently available as an
>> >> RC1 version. The code is not very big, and it is quite possible to port
>> >> 100% of it to Java, which would make it not only efficient in comparison
>> >> to the use of converters or JNI but also platform independent.
>> >>
>> >> Would it be possible to add this library to BioJava?
>> >>
>> >> If yes, I would surely like to work on it.
>> >>
>> >>
>> >>
>> >>
>> >> On Sun, Apr 1, 2012 at 11:24 PM, Andreas Prlic <andreas at sdsc.edu>
>> >> wrote:
>> >>>
>> >>> Hi Dhruv,
>> >>>
>> >>> We are quite flexible regarding the projects and what we are really
>> >>> looking for are sound projects and motivated students. As such our
>> >>> project suggestions are quite open. We will interact with accepted
>> >>> students remotely, so a certain degree of self-sufficiency will be
>> >>> required on the part of the student.
>> >>>
>> >>> If you already see tons of problems coming up during your initial
>> >>> assessment of the project, perhaps focus your proposal on something
>> >>> smaller and more achievable. There are quite a number of interesting
>> >>> algorithms out there and it does not have to be one of the ones
>> >>> suggested by us.
>> >>>
>> >>> Andreas
>> >>>
>> >>>
>> >>>
>> >>> On Sat, Mar 31, 2012 at 1:46 PM, Dhruv Sharma <sharma.dhrv at gmail.com>
>> >>> wrote:
>> >>> > Hi,
>> >>> >
>> >>> > I am Dhruv Sharma, a senior undergraduate student pursuing a
>> >>> > B.E.(Hons.) in Computer Science at BITS, Pilani, India.
>> >>> >
>> >>> > I am very much interested in 'porting BLAST algorithm to Java' as a
>> >>> > GSoC 2012 project. I am proficient in Java and C and work primarily
>> >>> > in them. I also have past experience of working in C++ before
>> >>> > migrating to Java. However, I am new to GSoC and haven't used version
>> >>> > control in the past.
>> >>> >
>> >>> > My recent project was a web application in Java that posts a FASTA
>> >>> > sequence to the remote CS-BLAST web service
>> >>> > <http://toolkit.tuebingen.mpg.de/cs_blast/>, parses and auto-filters
>> >>> > its results using the release date from RCSB PDB
>> >>> > <http://www.rcsb.org/pdb/home/home.do>, and downloads the PDB files.
>> >>> >
>> >>> > Since the project aims at converting legacy C/C++ code to Java, here
>> >>> > are the approaches already suggested on the BioJava page, along with
>> >>> > my observations:
>> >>> >
>> >>> > 1)  Using C++ to Java converters for 100% conversion. I have tried
>> >>> > converting the ncbi-blast-2.2.26 source code using a few freely
>> >>> > available converters, but all of them either crashed or failed to
>> >>> > convert, even after I resolved certain header file dependency issues
>> >>> > that emerged. Most failures occurred at function calls to
>> >>> > non-standard C++ libraries.
>> >>> >
>> >>> > 2)  Using JNI as an alternative solution. JNI programming would be a
>> >>> > tedious task and would in any case require understanding the purpose
>> >>> > of the underlying C++ code. Hence, it has little advantage over
>> >>> > rewriting the equivalent Java code. A significant advantage can be
>> >>> > seen when there is no efficient Java alternative to the C++ code.
>> >>> > However, platform dependence would still exist.
>> >>> >
>> >>> > According to my understanding of the problem, a hybrid approach can
>> >>> > be taken: using code converters for simpler files, manual coding for
>> >>> > tricky areas, and JNI for typical C++ code involving non-standard
>> >>> > libraries. But I am still not clear about my exact course of action.
>> >>> >
>> >>> > Can you please tell me if my analysis of the problem is correct?
>> >>> > Please also comment on the feasibility of my suggested approach and
>> >>> > make any suggestions, as they would help me improve the application
>> >>> > draft that I will soon be sharing for review.
>> >>> >
>> >>> > As BLAST is a collection of programs, and keeping in mind the length
>> >>> > of the code to be ported, can we work on a selected few critical
>> >>> > programs in it for the purposes of GSoC?
>> >>> >
>> >>> >
>> >>> > Thanks.
>> >>> >
>> >>> > --
>> >>> > Dhruv Sharma
>> >>> > Student
>> >>> > B.E.(Hons.) Computer Science
>> >>> > BITS, Pilani
>> >>> > India
>> >>> > _______________________________________________
>> >>> > Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>> >>> > http://lists.open-bio.org/mailman/listinfo/biojava-l
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >> Dhruv Sharma
>> >> Student
>> >> B.E.(Hons.) Computer Science
>> >> BITS, Pilani
>> >> India
>> >>
>> >
>>
>
>



