[Biojava-l] Suggestion for porting GHMM library

Scooter Willis HWillis at scripps.edu
Mon Apr 2 03:30:46 UTC 2012


I think the HMMER implementation should be viewed as the source code of
interest. When they went from HMMER2 to HMMER3, there were significant
changes in the answers you get.

On 4/1/12 8:15 PM, "Andreas Prlic" <andreas at sdsc.edu> wrote:

>That could work in terms of license and would be an interesting
>feature to have. I am still slightly concerned that the scale of the
>project might be too big and it might be difficult to accomplish this
>during the limited time of the project.
>
>Andreas
>
>On Sun, Apr 1, 2012 at 2:32 PM, Dhruv Sharma <sharma.dhrv at gmail.com>
>wrote:
>> Hi Andreas,
>>
>> In response to our last discussion, I would like to suggest porting the
>> General Hidden Markov Model (GHMM) library (http://ghmm.org/) from C to
>> Java.
>>
>> The library is licensed under the LGPL and is currently available as an
>> RC1 version. The code base is not very big, and it should be possible to
>> port 100% of the code to Java, which would make it not only more
>> efficient than using converters or JNI but also platform independent.
>>
>> Would it be possible to add this library to BioJava?
>>
>> If yes, I would surely like to work on it.
>>
>>
>>
>>
>> On Sun, Apr 1, 2012 at 11:24 PM, Andreas Prlic <andreas at sdsc.edu> wrote:
>>>
>>> Hi Dhruv,
>>>
>>> We are quite flexible regarding the projects and what we are really
>>> looking for are sound projects and motivated students. As such our
>>> project suggestions are quite open. We will interact with accepted
>>> students remotely, so a certain degree of self-sufficiency will be
>>> required on the part of the student.
>>>
>>> If you already see tons of problems coming up during your initial
>>> assessment of the project, perhaps focus your proposal on something
>>> smaller and more achievable. There are quite a number of interesting
>>> algorithms out there and it does not have to be one of the ones
>>> suggested by us.
>>>
>>> Andreas
>>>
>>>
>>>
>>> On Sat, Mar 31, 2012 at 1:46 PM, Dhruv Sharma <sharma.dhrv at gmail.com>
>>> wrote:
>>> > Hi,
>>> >
>>> > I am Dhruv Sharma, a senior undergraduate student pursuing
>>> > B.E.(Hons.) Computer Science at BITS, Pilani, India.
>>> >
>>> > I am very much interested in 'porting BLAST algorithm to Java' as a
>>> > GSoC 2012 project. I am proficient in Java and C, which I primarily
>>> > work with. I also have past experience working in C++ before
>>> > migrating to Java. However, I am new to GSoC and haven't used version
>>> > control in the past.
>>> >
>>> > My most recent project was a Java web application that posts a FASTA
>>> > sequence to the remote CS-BLAST web service
>>> > <http://toolkit.tuebingen.mpg.de/cs_blast/>, parses and auto-filters
>>> > the results using the release date from RCSB PDB
>>> > <http://www.rcsb.org/pdb/home/home.do>, and downloads the PDB files.
>>> >
>>> > Since the project aims at converting the legacy C/C++ code to Java,
>>> > the approaches already suggested on the BioJava page, along with my
>>> > observations, are:
>>> >
>>> > 1)  Using C++ to Java converters for 100% conversion. I have tried
>>> > converting the ncbi-blast-2.2.26 source code using a few freely
>>> > available converters, but all of them either crashed or failed to
>>> > convert, even after I resolved certain header file dependency issues
>>> > that emerged. Most failures occurred at function calls to
>>> > non-standard C++ libraries.
>>> >
>>> > 2)  Using JNI as an alternative solution. JNI programming would be a
>>> > tedious task and would in any case require understanding the purpose
>>> > of the underlying C++ code. Hence, it has little advantage over
>>> > rewriting the equivalent Java code. A significant advantage can be
>>> > seen when there is no efficient Java alternative to the C++ code.
>>> > However, platform dependence would still exist.
>>> >
>>> > According to my understanding of the problem, a hybrid approach can
>>> > be taken, which includes using code converters for simpler files,
>>> > manual coding for tricky areas, and JNI for typical C++ code involving
>>> > non-standard libraries. But I am still not clear about my exact
>>> > course of action.
>>> >
>>> > Can you please tell me if my analysis of the problem is correct?
>>> > Please also comment on the feasibility of my suggested approach and
>>> > make any suggestions, as they would help me improve the application
>>> > draft that I will soon be sharing for review.
>>> >
>>> > As BLAST is a collection of programs, and keeping in mind the amount
>>> > of code to be ported, can we work on a selected set of critical
>>> > programs from it for the purposes of GSoC?
>>> >
>>> >
>>> > Thanks.
>>> >
>>> > --
>>> > Dhruv Sharma
>>> > Student
>>> > B.E.(Hons.) Computer Science
>>> > BITS, Pilani
>>> > India
>>> > _______________________________________________
>>> > Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>>> > http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
>>
>>
>>
>> --
>> Dhruv Sharma
>> Student
>> B.E.(Hons.) Computer Science
>> BITS, Pilani
>> India
>>
>




